SCIP: Survey Sample Size Meridith Blevins, MS Bryan Shepherd, PhD Vanderbilt University Department of Biostatistics Blevins and Shepherd (Vandy Biostats) Zambézia Survey 1 / 23
Purpose
We want to monitor the effectiveness of SCIP initiatives in Zambézia province by administering a baseline survey and a 5-year follow-up survey.
Sampling Frame
The sampling frame for the baseline survey may rely on the 2007 Mozambique Population and Housing Census. A list of Enumeration Areas (EAs) covering the province will be made available with basic housing and population information and cartographic materials.
- The population of Zambézia is estimated at 3,794,489.
- This is 918,025 households, divided into 9,073 EAs.
- There are 155,202 urban households (1,458 EAs).
- There are 762,823 rural households (7,615 EAs).
Previous Mozambican Surveys
Previous Surveys
Community Access to Health Care (COACH)
- Three-year project implemented in 14 districts in Zambézia.
- Aimed at improving the quality of health services for children under 5 years of age and women of reproductive age.
Mozambique Inquérito Nacional de SIDA (INSIDA)
- First nationwide household-based survey with HIV testing.
- Designed to provide representative results for:
  - Cohorts: women and men 15-49, and children 6m-11.
  - Regions: the country as a whole; urban and rural areas; three geographical areas; and each of the 11 provinces.
Previous Surveys, cont'd
Demographic and Health Survey (DHS 2003)
- Collected information on fertility, maternal and child health, and socio-economic characteristics of Mozambique.
- Based on a sample that is representative nationally and by region and area of residence.
Survey of Indicators of Well-Being (QUIBB 2000/01)
- Living standards of households
  - type of construction and type of fuel used, and ownership of selected durable goods
  - number of members in households, percentage of literate adults
  - percentage of sick or disabled, malnourished children, etc.
- Public access, usage and satisfaction
  - schools, health facilities, food markets, police stations, and public transportation
  - infrastructure, sanitation, electricity and drinking water sources
Previous Surveys, cont'd
Integrated Labor Force Survey (LFS, 2004/05)
- Household survey that aims to collect information on the workforce in Mozambique.
- Collected data to estimate the levels of employment, unemployment and underemployment.
Household Survey (IAF, 2002/03)
- Covers general characteristics of the household, living expenses and family self-consumption, durable goods, income, etc.
- The interview period for each household was one week.
Sample Size of Previous Surveys
Year  Survey  Strata                                     Sample Size
2008  COACH   Zambézia (Baseline)                        3221 women (15-49 yrs) / 1481 women with children < 2 (actual)
2008  COACH   Zambézia (Final)                           1824 women (15-49 yrs) / 1169 women with children < 2 (actual)
2008  INSIDA  21: 10 provinces / 1 city, urban/rural     6230 households
2003  DHS     11: 10 provinces / 1 city                  12,280 households
2000  QUIBB   21: 10 provinces / 1 city, urban/rural     14,500 households
2004  LFS     11: 10 provinces / 1 city                  17,800 households
2002  IAF     21: 10 provinces / 1 city, urban/rural     8700 households
Two-Stage Cluster Sample
Two-Stage Cluster Sample
Determining the optimal sample size is critical because it requires a trade-off between the budget and the desired survey precision. We will determine:
- cost ratio
- intracluster correlation (ICC)
- cluster size
- desired precision
- number of clusters to be selected
Cost Ratio, C1/C2
Cost of interviewing a cluster, C1
- Household listing cost
- Travel between clusters (village-to-village)
Cost of interviewing an individual, C2
- Travel within cluster (house-to-house)
The cost ratio varies depending on population density and infrastructure.
Intracluster Correlation
The ICC, δ, measures how similar individuals within a cluster are on a given survey characteristic.
Indicator                         Total  Urban  Rural
Medical care                      0.13   0.16   0.16
Knowledge of contraception        0.11   0.11   0.14
Background or lifetime variables  0.08   0.06   0.07
Current use of contraception      0.03   0.05   0.04
Child health                      0.04   0.03   0.04
Fertility                         0.03   0.02   0.02
Current fertility intentions      0.02   0.02   0.02
Infant mortality                  0.02   0.01   0.02
Total average                     0.055  0.06   0.06
Cluster Size
The number of individuals interviewed per cluster is a function of the cost ratio and the intracluster correlation (ICC). The optimal number of households to sample per cluster is:
$$ n = \left[ \frac{C_1}{C_2} \cdot \frac{1 - \delta}{\delta} \right]^{1/2} $$
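As a rough illustration of the cluster-size formula, the following Python sketch computes n from a cost ratio and an ICC. The function name `optimal_cluster_size` is hypothetical, and rounding up to the next whole household is an assumption that appears consistent with the scenario tables in this deck; the slides do not state the rounding rule.

```python
import math


def optimal_cluster_size(cost_ratio: float, icc: float) -> int:
    """Households per cluster: n = sqrt((C1/C2) * (1 - delta) / delta).

    Rounding up to a whole household is an assumption, not stated in the slides.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio (C1/C2) must be positive")
    if not 0 < icc < 1:
        raise ValueError("icc (delta) must lie strictly between 0 and 1")
    n = math.sqrt(cost_ratio * (1 - icc) / icc)
    return math.ceil(n)


if __name__ == "__main__":
    # Cost ratio C1/C2 = 20 with a few illustrative ICC values.
    for delta in (0.01, 0.05, 0.2):
        print(delta, optimal_cluster_size(20, delta))
```

A higher ICC means additional households in the same cluster add less new information, so the optimal cluster shrinks as δ grows.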
Number of Clusters
The design effect quantifies the increase in the variance of the estimate due to the sampling procedure used:
$$ D = 1 + \delta (n - 1) $$
Using the optimal cluster size n, the number of clusters m to be selected within each stratum may be determined:
$$ m = \frac{p(1 - p) D}{s^2 n} $$
For a confidence interval of ±5% we need a standard error of s = 0.025. If we have some idea of the proportion p in advance, then it may be used.
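The two formulas on this slide can be sketched in Python as follows. The function names are hypothetical, p = 0.5 in the usage line is an assumed worst-case proportion, and rounding m up to a whole cluster is an assumption that appears consistent with the scenario tables rather than something the slides state.

```python
import math


def design_effect(icc: float, cluster_size: int) -> float:
    """Design effect D = 1 + delta * (n - 1)."""
    return 1 + icc * (cluster_size - 1)


def clusters_needed(p: float, icc: float, cluster_size: int, s: float) -> int:
    """Clusters per stratum: m = p(1 - p) D / (s^2 n), rounded up.

    Rounding up is an assumption. The small tolerance guards against
    floating-point noise when the exact value is already a whole number.
    """
    d = design_effect(icc, cluster_size)
    m = p * (1 - p) * d / (s ** 2 * cluster_size)
    return math.ceil(m - 1e-9)


if __name__ == "__main__":
    # p = 0.5 is an assumed worst case; s = 0.025 matches the +/-5% interval.
    print(clusters_needed(p=0.5, icc=0.05, cluster_size=20, s=0.025))
```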
Sampling Frame, revisited
- The population of Zambézia is estimated at 3,794,489.
- This is 918,025 households, divided into 9,073 EAs.
- There are 155,202 urban households (1,458 EAs).
- There are 762,823 rural households (7,615 EAs).
Sample Size Scenarios
Here n is the cluster size and m is the number of clusters per stratum; the last two columns give the total sample (n × m) per stratum and across 8 strata.
C1/C2  ICC   Precision (%)  n   m    Per stratum  8 strata
20     0.01  5              45  13   585          4680
20     0.01  10             45  4    180          1440
20     0.02  5              32  21   672          5376
20     0.02  10             32  6    192          1536
20     0.05  5              20  39   780          6240
20     0.05  10             20  10   200          1600
20     0.1   5              14  66   924          7392
20     0.1   10             14  17   238          1904
20     0.15  5              11  91   1001         8008
20     0.15  10             11  23   253          2024
20     0.2   5              9   116  1044         8352
20     0.2   10             9   29   261          2088
Estimating Prevalence of Underweight Children
- DHS 2003 reports that the prevalence of underweight (weight-for-age < -2 SD) among children under age 5 in Zambézia is 26.9%.
- The same survey estimated that 17.4% of the country's population is aged 5 or under.
- The weight-for-age ICC given by historical DHS surveys is δ = 0.04.
Underweight Children, cont'd
It might be reasonable to assume that the prevalence of underweight children is 25% in 2010 and that for every 100 households surveyed, we will collect weight-for-age measures on 15 children aged 5 or under. Suppose then that we wish to estimate the proportion of children under age 5 who are underweight with 10% precision.
C1/C2  ICC   Precision (%)  n   m  Children per stratum  Households per stratum  Households, 8 strata
10     0.04  10             16  8  128                   854                     6832
20     0.04  10             22  7  154                   1027                    8216
50     0.04  10             35  6  210                   1400                    11200
100    0.04  10             49  5  245                   1634                    13072
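The step from children to households in the table follows from the assumed yield of 15 children per 100 households. A minimal Python sketch of that conversion is shown below; the function name `households_for_children` is hypothetical, and rounding up to a whole household is an assumption that appears to match the tabulated values.

```python
def households_for_children(children: int, children_per_100_households: int = 15) -> int:
    """Smallest number of households expected to yield `children` measurements.

    Uses integer ceiling division, so no floating-point rounding is involved.
    The default yield of 15 children per 100 households is the slide's assumption.
    """
    if children < 0:
        raise ValueError("children must be non-negative")
    if children_per_100_households <= 0:
        raise ValueError("children_per_100_households must be positive")
    return -((-children * 100) // children_per_100_households)


if __name__ == "__main__":
    per_stratum = households_for_children(128)
    # Households per stratum, then across 8 strata.
    print(per_stratum, 8 * per_stratum)
```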
Estimating change from Baseline to 5 years
- Stratify on regions where particular interventions were performed.
Estimating change from Baseline, cont'd
If both the baseline and 5-year surveys are designed for a given precision:
Within strata:
Precision at baseline and 5 years  Detectable difference (independent samples)  Detectable difference (same clusters)
2.5%                               0.035                                        0.030
5%                                 0.07                                         0.06
10%                                0.14                                         0.12
Zambézia-wide:
Precision at baseline and 5 years  Detectable difference (independent samples)  Detectable difference (same clusters)
0.9%                               0.013                                        0.010
1.8%                               0.025                                        0.021
3.5%                               0.050                                        0.042
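The slides do not state how the detectable differences were obtained. One reading that appears consistent with the independent-samples column is that the difference between two independent estimates, each with precision h, has precision √2 · h. The Python sketch below encodes that reading. The function name and the correlation parameter `rho` are hypothetical, and the value rho = 0.28 in the usage line is merely back-solved from the tabulated same-clusters ratio of about 1.2; it is not taken from the source.

```python
import math


def detectable_difference(precision: float, rho: float = 0.0) -> float:
    """Precision of a difference between two estimates of equal precision.

    Assumes Var(difference) = 2 * (1 - rho) * Var(single estimate), where
    rho is the correlation between the two estimates (0 for independent
    samples). An illustrative reading, not the authors' stated method.
    """
    if precision < 0:
        raise ValueError("precision must be non-negative")
    if not -1 <= rho <= 1:
        raise ValueError("rho must lie between -1 and 1")
    return precision * math.sqrt(2 * (1 - rho))


if __name__ == "__main__":
    print(round(detectable_difference(0.05), 3))            # independent samples
    print(round(detectable_difference(0.05, rho=0.28), 3))  # assumed correlation
```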
Estimating intervention effect
What is the detectable difference for the intervention effect, assuming half of the clusters within each stratum get the intervention?
If both the baseline and 5-year surveys are designed for a given precision:
Within strata:
Precision at baseline and 5 years  Detectable difference (independent samples)  Detectable difference (same clusters)
2.5%                               0.07                                         0.06
5%                                 0.14                                         0.12
10%                                0.28                                         0.24
Zambézia-wide:
Precision at baseline and 5 years  Detectable difference (independent samples)  Detectable difference (same clusters)
0.9%                               0.025                                        0.021
1.8%                               0.05                                         0.042
3.5%                               0.10                                         0.084
Additional Considerations
- Nonresponse
- Survey administration (i.e., stop once enough of one cohort is collected)
- Determining which areas receive the intervention
References
- Aliaga A, Ren R. Optimal Sample Sizes for Two-Stage Cluster Sampling in Demographic and Health Surveys. Demographic and Health Research 30 (2006).
- Cochran WG. Sampling Techniques. New York, NY: Wiley; 1977.
- Bennett S, Woods T, Liyanage WM, Smith DL. A Simplified General Method for Cluster-Sample Surveys of Health in Developing Countries. World Health Statistics Quarterly 44 (1991).