Weighting in Survey Sampling

1 Weighting in Survey Sampling
Geert Molenberghs
Interuniversity Institute for Biostatistics and statistical Bioinformatics
Universiteit Hasselt, Belgium
Katholieke Universiteit Leuven, Belgium
EUROSTAT, March 29, 2011

2 The Belgian Health Interview Survey
Conducted in years:
Commissioned by:
- Federal government
- Flemish Community
- French Community
- German Community
- Walloon Region
- Brussels Region

3 Design At-a-Glance
- Regional stratification: fixed a priori
- Provincial stratification: for convenience
- Three-stage sampling:
  - Primary sampling units (PSU): municipalities, selected proportional to size
  - Secondary sampling units (SSU): households
  - Tertiary sampling units (TSU): individuals
- Over-representation of the German Community
- Over-representation of 4 (2) provinces in 2001 (2004): Limburg, Hainaut, Antwerpen, Luxembourg
- Sampling done in 4 quarters: Q1, Q2, Q3, Q4

4 Regional Stratification (Weights)
Region | Goal | Obt'd | Goal | Obt'd | Goal | Obt'd
Flanders | = elderly +450 =
Wallonia | = elderly +450 =
Brussels | elderly +350 =
Belgium | 10,000 | 10,221 | 10, = 12,050 | 12,111 | 10, elderly +1250 = 12,600 | 12,945

5 Provincial Stratification in 1997
Province | sample # | sample % | pop. %
Provinces listed: Antwerpen, Oost-Vlaanderen, West-Vlaanderen, Vlaams-Brabant, Limburg, Hainaut, Liège, Namur, Brabant-Wallon, Luxembourg, Brussels
Brussels | 3051

6 Multi-Stage Sampling: Primary Sampling Units
Towns
- Within each province, order communities by size
- Systematically sample in groups of 50

7 Representation with certainty of larger cities. For 1997:
- Antwerpen: 6 groups
- Liège and Charleroi: 4 groups each
- Gent: 3 groups
- Mons and Namur: 2 groups each
- All towns in Brussels
Representation of respondents living in smaller towns is ensured.

8 Multi-Stage Sampling: Secondary Sampling Units
Households
- List of households, ordered by:
  - statistical sector
  - age of reference person
  - size of household
- Clusters of 4 households selected
- Households within clusters randomized
- Twice as many clusters as households needed, to account for refusals and non-responders

9 Multi-Stage Sampling: Tertiary Sampling Units (Weights)
Individual respondents
- Households of size ≤ 4: all members
- Households of size ≥ 5:
  - reference person and partner (if applicable)
  - other household members selected by the birthday rule in 1997, or by prior sampling from household members in 2001 and 2004
  - maximum of 4 interviews per household

10 Weights
- Stratification:
  - Region
  - Province
  - Age of reference person
  - Household size
  - Quarter
- Multi-stage sampling:
  - Selection probability of individual within household
Taking this into account is relatively easy, even with standard software.

11 Design Analysis
- Weights & selection probabilities
- Stratification
- Multi-stage sampling & clustering
- Incomplete data

12 Simple Random Sampling (SRS)
We need the following information:
- Population P
- Population size N
- Sample size n
- Whether sampling is done with or without replacement
The sample fraction: $f = n/N$
No need for weights.

13 Stratification (STRAT)
- Population P
- Population size N
- Sample size n
- Whether sampling is done with or without replacement
- The strata indicators $h = 1, \dots, H$
- The number of subjects in stratum h: $I = 1, \dots, N_h$, with $N = \sum_{h=1}^{H} N_h$
This defines the subpopulations, or population strata, $P_h$.

14 The way the sample of n units is allocated to the strata: $n_h$, with $n = \sum_{h=1}^{H} n_h$
We can calculate the stratum-specific sample fraction:
$f_h = n_h / N_h$
If $f_h$ varies across strata, weights are needed.

15 Multi-Stage Sampling: the Relative Approach
Selection probabilities:
- Stage 1: $f_1$
- Stage 2: $f_2$
- Total: $f = f_1 f_2$
Examples (a), (b), (c), (d): each with $f = 1/10$.

16 Multi-Stage Sampling: the Absolute Approach
- Assume N, n, and hence f are prespecified.
- Fix the number of SSU taken per PSU: $n_c$.
- Construct a cumulative list of the number of SSU per PSU.
- Conduct systematic selection within the cumulative list, with jump
$g = \frac{1}{f}\, n_c = 10 \cdot 10 = 100$

17 Sample selection
Columns: block | # houses | cumulative | hits
Selection probabilities:
houses | prob.(1) | prob.(2) | prob.(tot)
87 | 87/100 | 10/87 | 1/10
109 | 109/100 | 10/109 | 1/10
15 | 15/100 | 10/15 | 1/10
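The systematic pass over the cumulative list can be sketched as follows. This is only an illustration: the function name `systematic_hits`, the five block sizes, and the fixed start value are assumptions made for the example; only the jump of 100 comes from the slides.

```python
import random
from itertools import accumulate

def systematic_hits(sizes, jump, start=None):
    """Count hits per block for a systematic pass over the cumulative list.

    sizes : number of SSU (houses) in each PSU (block)
    jump  : sampling interval g
    start : starting point in [0, jump); drawn at random when omitted
    """
    cumulative = list(accumulate(sizes))
    if start is None:
        start = random.random() * jump
    hits = [0] * len(sizes)
    point, block = start, 0
    while point < cumulative[-1]:
        # advance to the block whose cumulative range contains the point
        while cumulative[block] <= point:
            block += 1
        hits[block] += 1
        point += jump
    return hits

# hypothetical block sizes; jump g = 100 as on the slide
example_hits = systematic_hits([87, 109, 15, 120, 69], jump=100, start=50)
```

A block of size t is hit with probability t/100 (or, for blocks larger than the jump, on average t/100 times), and each hit then takes 10 houses out of t, so every house ends up with overall probability 1/10.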

18 Sample Sizes: The Belgian Health Interview Survey
Allocations for the Belgian Health Interview Survey
Region | N_h | Focus on population | Focus on strata | Focus on compromise
Brussels | 1,000,
Flanders | 6,000,
Wallonia | 3,000,

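19 Simple Random Sampling: Estimators
$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
$\hat{y} = \frac{N}{n} \sum_{i=1}^{n} y_i$
Quantity | Calculated | Estimated
Population variance | $S_Y^2 = \frac{1}{N-1} \sum_{I=1}^{N} (Y_I - \bar{Y})^2$ | $\hat{s}_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
Total | $\sigma_{\hat{y}}^2 = \frac{N^2}{n} (1-f) S_Y^2$ | $\sigma_{\hat{y}}^2 = \frac{N^2}{n} (1-f) \hat{s}_y^2$
Average | $\sigma_{\bar{y}}^2 = \frac{1}{n} (1-f) S_Y^2$ | $\sigma_{\bar{y}}^2 = \frac{1}{n} (1-f) \hat{s}_y^2$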
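As a small numerical illustration of the estimated column, the sketch below computes the SRS estimates for one sample. The function name `srs_estimates` and the sample values are assumptions for the example, not part of the original slides.

```python
from statistics import mean, variance

def srs_estimates(sample, N):
    """Mean, total and their estimated variances under SRS without replacement."""
    n = len(sample)
    f = n / N                   # sample fraction
    s2 = variance(sample)       # sample variance, divisor n - 1
    ybar = mean(sample)
    return {
        "mean": ybar,
        "total": N * ybar,
        "var_mean": (1 - f) * s2 / n,
        "var_total": N ** 2 * (1 - f) * s2 / n,
    }

# hypothetical sample of size 2 from a population of size 4
est = srs_estimates([1, 3], N=4)
```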

20 Estimators: Stratification
The total of the sub-sample within stratum h:
$y_h = \sum_{i=1}^{n_h} y_{hi}$
Estimator for the stratum-specific total:
$\hat{y}_h = \frac{N_h}{n_h} \sum_{i=1}^{n_h} y_{hi} = \frac{N_h}{n_h}\, y_h$
Estimator for the population total:
$\hat{y} = \sum_{h=1}^{H} \hat{y}_h$
It is the unweighted sum of the estimated stratum-specific totals.

21 Estimator for the stratum-specific average:
$\bar{y}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi} = \frac{1}{n_h}\, y_h$
Estimator for the population average:
$\bar{y} = \frac{1}{N}\, \hat{y} = \frac{1}{N} \sum_{h=1}^{H} \hat{y}_h = \sum_{h=1}^{H} \frac{N_h}{N}\, \bar{y}_h$
The estimator for the population average is a weighted sum of the stratum-specific averages.

22 Estimators: Weighting
$\bar{y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$
$\hat{y} = N\, \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$

23 Analysis of Belgian Health Interview Survey
Body Mass Index (BMI):
- Defined as: $\mathrm{BMI} = \frac{\text{weight (kg)}}{\text{height}^2\ (\text{m}^2)}$, expressed in kg/m²
- A continuous measure
- Frequently analyzed on the log scale: ln(BMI)
General Health Questionnaire 12 (GHQ-12):
- Comprises 12 questions, yielding a 13-category outcome
- The focus is on mental health
- Can be dichotomized as well

24 Vragenlijst voor Onderzoek naar de Ervaren Gezondheid (VOEG):
- Dutch instrument, leading to a sum score
- In English: Questionnaire for Research Regarding Subjective Health
- Translated into French for Belgium
- To obtain a more symmetric score, the analysis takes place on the log scale: ln(VOEG + 1)
Stable General Practitioner (SGP):
- Do you have a steady general practitioner (GP)?
- A binary indicator

25 Logarithm of Body Mass Index
Table of estimates (standard errors) by analysis (SRS, Stratification, Clustering, Weighting, All combined) and by region (Belgium, Brussels, Flanders, Wallonia).
Logarithm of VOEG Score
Table of estimates (standard errors), same layout.

26 General Health Questionnaire 12
Table of estimates (standard errors) by analysis (SRS, Stratification, Clustering, Weighting, All combined) and by region (Belgium, Brussels, Flanders, Wallonia).
Stable General Practitioner (0/1)
Table of estimates (standard errors), same layout.
Weighting and clustering each increase the standard error; the combined analysis does so even more. The point estimate of the combined analysis is identical to the weighted one.

27 Design Effects
Outcome | Belgium | Brussels | Flanders | Wallonia
Design effects for clustering: rows LNBMI, LNVOEG, GHQ12, SGP
Design effects for weighting: rows LNBMI, LNVOEG, GHQ12, SGP

28 Weighting: General Concepts and Design
- The concept of weighting
- Weighting in the context of stratification
- Weighting in the context of clustering
- Selection proportional to size (PPS)
- Self-weighting
- Examples

29 General Principles
Weighting arises naturally in a variety of contexts:
- With stratification: different strata have different selection probabilities.
- With clustering: weights differ within and between clusters.
- With incomplete data: to correct for non-response.
- In general: units are given probabilities of selection, e.g., proportional to their size.
We will consider the main ones in turn.

30 Estimators for averages and totals then take the form:
$\bar{y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$, $\hat{y} = N\, \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$
The unweighted expressions result from setting all $w_i$ equal to a constant.
Due to the division by the sum of the weights, the actual constant is not important, but sensible choices are 1 or 1/n.
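A minimal sketch of these two estimators, showing that rescaling the weights by a constant leaves the result unchanged. The function names and the numerical values are hypothetical.

```python
def weighted_mean(y, w):
    """Weighted average: sum of w_i * y_i divided by the sum of the weights."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def weighted_total(y, w, N):
    """Weighted estimate of the population total for a population of size N."""
    return N * weighted_mean(y, w)

# hypothetical values: two observations with weights 1 and 3, population size 4
total = weighted_total([1, 2], [1, 3], N=4)
rescaled = weighted_total([1, 2], [10, 30], N=4)   # same weights times 10
unweighted = weighted_total([1, 2], [1, 1], N=4)   # constant weights
```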

31 Weighting and Stratification
There are two main reasons why selection probabilities differ between strata:
- A subgroup is of interest, and not oversampling it would lead to too small a sample size. Example: German Region in the Belgian HIS.
- Strata are given equal sample sizes for comparative purposes, but an estimate for the entire population is required as well. Example: Brussels, Flanders, and Wallonia in the Belgian HIS.
Units are then reweighted to ensure proper representativity.

32 Example
Suppose a certain subgroup represents 10% of the population. With an unweighted scheme (SRS or stratified), this group will also contribute 10% to the sample, on average.
If we need a sample which includes 100 individuals of the subgroup, then a total sample of 1000 individuals has to be selected.
Enlarging the subgroup by 50% implies scaling up from 100 to 150, and hence 500 additional interviews are needed for the entire sample.
It is perfectly possible that the 50 extra interviews in the subgroup are essential, but that the other 450 are redundant.

33 A solution is to increase the selection probability for the subgroup, relative to the others.
Quantity | Majority | Minority
Population | |
Percentage | |
Sample portion | 1/10 | 1/5
Number selected | |
Unweighted percentage in sample | |
Weight | 1 | 1/2
Weighted number in sample | |
Weighted percentage in sample | |
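The rows of this table can be reproduced with a few lines of arithmetic. In the sketch below only the sample portions (1/10 and 1/5) and the weights (1 and 1/2) come from the slide; the population counts of 9000 and 1000 are an assumption chosen to match the 10% subgroup of the preceding example.

```python
from fractions import Fraction

population = {"majority": 9000, "minority": 1000}          # hypothetical counts
portion = {"majority": Fraction(1, 10), "minority": Fraction(1, 5)}

# expected number selected in each group
selected = {g: population[g] * portion[g] for g in population}

# weight = inverse selection probability, scaled so the majority has weight 1
weight = {g: portion["majority"] / portion[g] for g in population}
weighted = {g: selected[g] * weight[g] for g in population}

unweighted_share = selected["minority"] / sum(selected.values())
weighted_share = weighted["minority"] / sum(weighted.values())
```

Under these assumed counts the minority makes up 2/11 (about 18%) of the raw sample, and the weights bring its share back to the population value of 10%.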

34 Unfortunately, it is not always possible to determine in advance whether a respondent belongs to the majority or to the minority. This makes it difficult to determine the weight.
As a surrogate, entire quarters (or other geographical entities) which are known to have large minority populations can be oversampled.
This procedure works because the weighting is done at the quarter level, hence producing correct weights, as in the example above.
If one calculates the subsample selection probability carefully, then it can be ensured that the sample will contain a sufficient number of minority members.

35 Example: Artificial Population
Consider classical stratification:
$P_{s1} = (\{1, 2\}, \{3, 4\})$
$P_{s2} = (\{1, 4\}, \{2, 3\})$
Samples are selected proportional to the stratum size: 1 out of 2 units in each: n = (1, 1).
Consider a third stratification:
$P_{s3} = (\{1\}, \{2, 3, 4\})$
Retain the sample size n = (1, 1).

36 The sampling mechanisms then are (sample probabilities $P_s$):
s | Sample | SRS | Stratified P_s1 | Stratified P_s2 | Stratified P_s3
1 | {1,2} | 1/6 | 0 | 1/4 | 1/3
2 | {1,3} | 1/6 | 1/4 | 1/4 | 1/3
3 | {1,4} | 1/6 | 1/4 | 0 | 1/3
4 | {2,3} | 1/6 | 1/4 | 0 | 0
5 | {2,4} | 1/6 | 1/4 | 1/4 | 0
6 | {3,4} | 1/6 | 0 | 1/4 | 0

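37 The corresponding estimators $\hat{y}$ are (a dash marks a sample that cannot occur):
s | Sample | SRS | Stratified P_s1 | Stratified P_s2 | Stratified P_s3
1 | {1,2} | 6 | - | 6 | 7
2 | {1,3} | 8 | 8 | 8 | 10
3 | {1,4} | 10 | 10 | - | 13
4 | {2,3} | 10 | 10 | - | -
5 | {2,4} | 12 | 12 | 12 | -
6 | {3,4} | 14 | - | 14 | -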

38 The expectations for the total:
$P_{s1}$: $E(\hat{y}) = \frac{1}{4}[8 + 10 + 10 + 12] = 10$
$P_{s2}$: $E(\hat{y}) = \frac{1}{4}[6 + 8 + 12 + 14] = 10$
$P_{s3}$: $E(\hat{y}) = \frac{1}{3}[7 + 10 + 13] = 10$
Hence, the third stratification also produces an unbiased estimator.

39 Very important: The estimates differ depending on the sampling mechanism.
Indeed, the sample {1, 2} produces 6 in the unweighted case and 7 in this weighted case.
This is because the weighted expression is used. For example:
$\hat{y} = 4 \cdot \frac{1 \cdot \frac{1}{1} + 2 \cdot \frac{1}{1/3}}{\frac{1}{1} + \frac{1}{1/3}} = 4 \cdot \frac{7}{4} = 7$
The weights are the inverse of the selection probability.
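The unbiasedness of all three stratified designs can be checked by brute-force enumeration. The sketch below assumes the artificial population Y = (1, 2, 3, 4) and takes the three stratifications to be ({1,2},{3,4}), ({1,4},{2,3}) and ({1},{2,3,4}), as implied by the table of sampling mechanisms; the function name `enumerate_design` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

Y = {1: 1, 2: 2, 3: 3, 4: 4}   # artificial population, total 10

def enumerate_design(strata):
    """All samples with one unit per stratum: {sample: (probability, estimate)}."""
    p = Fraction(1, prod(len(s) for s in strata))
    return {
        # with n_h = 1, each sampled unit gets weight N_h = len(stratum)
        tuple(sorted(units)): (p, sum(len(s) * Y[u] for s, u in zip(strata, units)))
        for units in product(*strata)
    }

designs = {
    "Ps1": [(1, 2), (3, 4)],
    "Ps2": [(1, 4), (2, 3)],
    "Ps3": [(1,), (2, 3, 4)],
}
expected = {
    name: sum(p * est for p, est in enumerate_design(strata).values())
    for name, strata in designs.items()
}
```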

40 Weighting and Multi-Stage Sampling / Clustering
In multi-stage sampling and clustering, subunits may be selected with differential probabilities.
Example: Household members in the Belgian HIS.
In addition, entire clusters may be selected with variable probabilities.
Example: Towns in the Belgian HIS.
Just like in the stratified case, this needs to be taken into account via weights.

41 Example
Consider a selection of households from a population with two household types, equally frequent:
- 2-person households of married couples
- 1-person households of singles
Then:
- 50% of the households consist of married couples.
- 66.7% of the people are married.
Select a sample of 100 households, and then one person per household.
We expect, on average, in the sample:
- 50 married persons
- 50 unmarried persons

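42 If the survey question is "Are you married?", then a naive estimate would produce $\hat{z} = 50\%$ are married, which is wrong.
Weighting the answers by the inverse of the relative selection probabilities (1/2 for a married person, 1 for a single):
$\hat{z}_1 = \frac{50 \cdot 1 \cdot \frac{1}{1/2} + 50 \cdot 0 \cdot \frac{1}{1}}{50 \cdot \frac{1}{1/2} + 50 \cdot \frac{1}{1}} = \frac{100}{150} = 0.667$
If we want to assess the proportion of households of married couples, then no weighting is necessary:
$\hat{z}_2 = \frac{50 \cdot 1 + 50 \cdot 0}{100} = \frac{50}{100} = 0.5$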
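The same computation as a short sketch, assuming the expected sample of 50 two-person (married) and 50 one-person (single) households with one respondent drawn per household; the variable names are illustrative only.

```python
from fractions import Fraction

# one respondent per household: (married indicator, selection probability within household)
respondents = [(1, Fraction(1, 2))] * 50 + [(0, Fraction(1, 1))] * 50

# naive estimate: every respondent counts equally
naive = Fraction(sum(z for z, _ in respondents), len(respondents))

# weighted estimate: each respondent is weighted by the inverse selection probability
weighted = sum(z / p for z, p in respondents) / sum(1 / p for _, p in respondents)
```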

43 Example: Artificial Population
Consider three ways of clustering:
$P_{c1} = (\{1, 3\}, \{2, 4\})$
$P_{c2} = (\{1, 2\}, \{3, 4\})$
$P_{c3} = (\{1, 4\}, \{2, 3\})$
Let us add another one:
$P_{c4} = (\{1\}, \{2, 3, 4\})$

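44 The sampling mechanisms for the original clusterings are (sample probabilities $P_s$):
s | Sample | SRS | Clustering P_c1 | Clustering P_c2 | Clustering P_c3
1 | {1,2} | 1/6 | 0 | 1/2 | 0
2 | {1,3} | 1/6 | 1/2 | 0 | 0
3 | {1,4} | 1/6 | 0 | 0 | 1/2
4 | {2,3} | 1/6 | 0 | 0 | 1/2
5 | {2,4} | 1/6 | 1/2 | 0 | 0
6 | {3,4} | 1/6 | 0 | 1/2 | 0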

45 We cannot merely add the new samples, since they have different, and in fact varying, sample sizes:
$S_{c4} = \{\{1\}, \{2, 3, 4\}\}$
Let us decide to change the selection probabilities so as to comply with selection proportional to size (PPS):
s | Sample | P_s | ŷ
1 | {1} | 1/4 | 4
2 | {2,3,4} | 3/4 | 12
The expectation of the total:
$P_{c4}$: $E(\hat{y}) = \frac{1}{4} \cdot 4 + \frac{3}{4} \cdot 12 = 10$

46 Example: The Belgian Health Interview Survey
- Design-based estimation for LNBMI, LNVOEG, GHQ12, and SGP
- Regression-based estimation for the continuous LNBMI
- Logistic regression-based estimation for the binary SGP

47 Estimation of Means
Taking weighting into account, the means are recomputed for:
- LNBMI
- LNVOEG
- GHQ12
- SGP
The following program can be used:

    proc surveymeans data=m.bmi_voeg mean stderr;
      title 'weighted means - infinite population for Belgium and regions';
      where (regionch^='');     /* drop records without a region */
      domain regionch;          /* separate estimates per region */
      weight wfin;              /* final survey weight */
      var lnbmi lnvoeg ghq12 sgp;
    run;

48 The program includes the weights by means of the WEIGHT statement.
While it would be possible to include a finite population correction, as we have seen, the impact is so negligible that it has been omitted.
The output takes the usual form, with weighting information listed:

    weighted means - infinite population for Belgium and regions
    The SURVEYMEANS Procedure
    Data Summary
    Number of Observations    8564
    Sum of Weights
    Statistics
    Variable    Mean    Std Error of Mean
    LNBMI
    LNVOEG
    GHQ12
    SGP

49
    Domain Analysis: REGIONCH
    REGIONCH    Variable    Mean    Std Error of Mean
    Brussels    LNBMI, LNVOEG, GHQ12, SGP
    Flanders    LNBMI, LNVOEG, GHQ12, SGP
    Wallonia    LNBMI, LNVOEG, GHQ12, SGP

Note that the weights were chosen so that they recombine the entire population. The fact that the sum is not around 10 million is due to empty strata.
The sum of the weights does not matter for genuine survey procedures, such as the SURVEYMEANS procedure used here.

50 It does matter for some of the model-based procedures, as we will see further in this chapter.
We summarize the results and compare them to SRS (and still foreshadow a bit):
Logarithm of Body Mass Index
Table of estimates (standard errors) by analysis (SRS, Stratification, Clustering, Weighting, All combined) and by region (Belgium, Brussels, Flanders, Wallonia).
Logarithm of VOEG Score
Table of estimates (standard errors), same layout.

51 General Health Questionnaire 12
Table of estimates (standard errors) by analysis (SRS, Stratification, Clustering, Weighting, All combined) and by region (Belgium, Brussels, Flanders, Wallonia).
Stable General Practitioner (0/1)
Table of estimates (standard errors), same layout.

52 Discussion
Unlike with stratification and clustering, the impact of weighting is major and differs between outcomes.
Recall that an unweighted analysis implicitly makes the following incorrect assumptions:
- the Brussels, Flemish, and Walloon populations are roughly equal in size
- members within a household have roughly the same selection probability
(Other components of the weights are relatively unimportant.)
Weighting reduces precision: this is reflected throughout in larger standard errors. They all increase, roughly, by a factor 1.5.

53 Let us discuss each of the four outcomes:
LNBMI: The regional estimates are relatively stable. The Belgian estimate is stable, too. This is a coincidence, as can be seen from the following rounded computations:
General: $\mu_{Bel} = w_{Bru}\, \mu_{Bru} + w_{Fla}\, \mu_{Fla} + w_{Wal}\, \mu_{Wal}$
Unweighted: $\mu_{Bel}$ =
Weighted: $\mu_{Bel}$ =
Hence, the weights shift a lot between Flanders and Brussels, but these regions happen to have the same average.

54 LNVOEG: Here, the situation is rather different:
General: $\mu_{Bel} = w_{Bru}\, \mu_{Bru} + w_{Fla}\, \mu_{Fla} + w_{Wal}\, \mu_{Wal}$
Unweighted: $\mu_{Bel}$ =
Weighted: $\mu_{Bel}$ =
Since the two smaller regions have a higher average, the unweighted Belgian average is higher than the weighted Belgian average.
This also implies there is a larger impact on the standard error for Belgium. The standard errors for the regions increase by 35, 26, and 40%, while the standard error for Belgium increases by 48%, more than for each of the regions separately.
This is because there are two sources of additional variation: (1) variability in the weights; (2) variability between the regional means.

55 GHQ-12: The phenomenon is similar to what was observed for LNVOEG.
SGP: The phenomenon is not as extreme, since Brussels and Wallonia are rather different: they do not reinforce each other. Still, weighting downweights the low Brussels estimate and upweights the high Flemish estimate, producing a higher Belgian average.

56 Regression-Based Estimation for LNBMI
Like before, the procedures SURVEYREG and MIXED can be used to take weighting into account.
PROC SURVEYREG code is:

    proc surveyreg data=m.bmi_voeg;
      title '15. Mean. Surveyreg, weighted, for Belgium';
      weight wfin;          /* final survey weight */
      model lnbmi = ;       /* intercept-only model: estimates the mean */
    run;

with straightforward syntax and output (for Belgium):

    Estimated Regression Coefficients
    Parameter    Estimate    Standard Error    t Value    Pr > |t|
    Intercept                                             <.0001

PROC MIXED code is:

57
    proc mixed data=m.bmi_voeg method=reml;
      title '25. Survey mean with PROC MIXED, for Belgium';
      title2 'weighted';
      where (regionch^='');       /* drop records without a region */
      weight wfin;                /* final survey weight */
      model lnbmi = / solution;   /* intercept-only model */
    run;

There is no need for a RANDOM statement, since no clustering is taken into account.
The relevant portion of the output for Belgium is:

    Solution for Fixed Effects
    Effect       Estimate    Standard Error    DF    t Value    Pr > |t|
    Intercept                                                   <.0001

While the estimate is similar, the standard error is considerably smaller.

58 An overview of the results:
Logarithm of Body Mass Index (standard errors in parentheses)
Analysis | Procedure | Belgium | Brussels | Flanders | Wallonia
SRS | SURVEYMEANS | (0.0018) | (0.0034) | (0.0030) | (0.0032)
SRS | MIXED | (0.0018) | (0.0034) | (0.0030) | (0.0032)
Stratification | SURVEYMEANS | (0.0018) | (0.0034) | (0.0030) | (0.0032)
Clustering | SURVEYMEANS | (0.0020) | (0.0036) | (0.0033) | (0.0034)
Clustering | MIXED | (0.0020) | (0.0036) | (0.0033) | (0.0034)
Weighting | SURVEYMEANS | (0.0027) | (0.0046) | (0.0039) | (0.0042)
Weighting | MIXED | (0.0018) | (0.0034) | (0.0030) | (0.0032)
All combined | SURVEYMEANS | (0.0040) | (0.0048) | (0.0043) | (0.0044)
Clust+Wgt | MIXED | (0.0023) | (0.0039) | (0.0036) | (0.0038)

59 Advanced Topic: Analysis
- Selection Proportional to Size
- Self-weighting
- Horvitz-Thompson estimator
- Examples

60 Selection Proportional to Size and Self-Weighting
Define an estimator of the cluster-specific total as:
$\hat{y}_i = \frac{1}{f_i} \sum_{j=1}^{n_i} y_{ij} = \frac{1}{f_i}\, y_i$
Define an estimator for the population total as:
$\hat{y} = \sum_{i=1}^{m} \frac{1}{m} \frac{1}{\pi_i}\, \hat{y}_i = \sum_{i=1}^{m} \frac{1}{m} \frac{1}{\pi_i} \frac{1}{f_i} \sum_{j=1}^{n_i} y_{ij} = \sum_{i=1}^{m} \frac{1}{m} \frac{1}{\pi_i} \frac{1}{f_i}\, y_i$

61 where
- $f_i$ is the sample fraction in selected cluster i
- $\pi_i$ is the probability to select cluster i
- $y_{ij}$ is the value of the survey variable for subject j in cluster i

62 Self-Weighting
Self-weighting is defined by requiring
$f = m\, \pi_i f_i$
to be constant.
Hence, the estimator for the total reduces to:
$\hat{y} = \sum_{i=1}^{m} \frac{1}{m} \frac{1}{\pi_i} \frac{1}{f_i} \sum_{j=1}^{n_i} y_{ij} = \sum_{i=1}^{m} \frac{1}{f} \sum_{j=1}^{n_i} y_{ij} = \frac{1}{f}\, y$

63 For the Belgian Health Interview Survey:
$\pi_i \propto t_i$ (town size)
$f_i \propto 50 / t_i$
$m\, \pi_i f_i \propto m\, t_i \cdot \frac{50}{t_i}$, a constant
Hence: the selection of respondents within towns is self-weighting.

64 Variances for PPS
Quantity | Expression
Pop. var. 1 | $S_{1Y}^2 = \sum_{I=1}^{M} \pi_I \left( \frac{Y_I}{M \pi_I} - \bar{Y} \right)^2 = \frac{1}{M^2} \sum_{I=1}^{M} \pi_I \left( \frac{Y_I}{\pi_I} - Y \right)^2$
Pop. var. 2 | $S_{2Y}^2$: within-cluster variance, built from the deviations $(Y_{IJ} - \bar{Y}_I)^2$
PPS (with) | $\sigma_{\hat{y}}^2$: a term in $\frac{M^2}{m} S_{1Y}^2$ plus a term in $S_{2Y}^2$
PPS (without) | $\sigma_{\hat{y}}^2$: as above, with finite-population corrections in both terms

65 The Horvitz-Thompson Estimator
The Horvitz-Thompson (HT) estimator is general and broadly applicable.
It can be a bit unstable at times.
Alternatives, such as the Hansen-Hurwitz estimator, exist.
Let
- $y_i$: total for cluster i (which can simply be an individual in the non-clustered case)
- $\pi_i$: probability of selecting cluster i
- $v$: number of distinct clusters sampled
Note that $v \le m$, with equality holding when sampling without replacement.

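66 The Horvitz-Thompson estimator takes the form:
$\hat{y}_{HT} = \sum_{i=1}^{v} \frac{y_i}{\pi_i}$
The variance:
$\sigma_{\hat{y}_{HT}}^2 = \sum_{I=1}^{M} \frac{1 - \pi_I}{\pi_I}\, Y_I^2 + \sum_{I=1}^{M} \sum_{J \ne I} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J$
$\qquad = \sum_{I=1}^{M} \frac{1 - \pi_I}{\pi_I}\, Y_I^2 + 2 \sum_{I=1}^{M-1} \sum_{J=I+1}^{M} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J$
with now in addition
- $\pi_{IJ}$: probability of simultaneously selecting clusters I and J into the sample.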

67 The Artificial Population and Horvitz-Thompson
We will consider three situations:
- SRS without replacement
- SRS with replacement
- Selection with unequal probabilities
In all cases, n = 2 will be maintained.

68 SRS Without Replacement
The clusters in the population are:
$P = \{1\}, \{2\}, \{3\}, \{4\}$
with samples:
$S = \{1, 2\}, \{1, 3\}, \{1, 4\}, \{2, 3\}, \{2, 4\}, \{3, 4\}$
The probability of selecting a 1 (or any other unit) is
$\pi_I = \frac{3}{6} = \frac{1}{2}$

69 The estimator:
$\hat{y}_{HT} = \frac{y_1}{1/2} + \frac{y_2}{1/2} = 2 (y_1 + y_2) = 2 y$
The variance:
$\sigma_{\hat{y}_{HT}}^2 = \sum_{I=1}^{4} \frac{1 - \pi_I}{\pi_I}\, Y_I^2 + 2 \sum_{I=1}^{3} \sum_{J=I+1}^{4} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J = T_1 + T_2$
with

70
$T_1 = \sum_{I=1}^{4} \frac{1 - 1/2}{1/2}\, Y_I^2 = \sum_{I=1}^{4} Y_I^2 = 1 + 4 + 9 + 16 = 30$
$\pi_{IJ} = P(\text{selecting two units simultaneously}) = \frac{1}{6}$

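71
$\frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J} = \frac{1/6 - 1/2 \cdot 1/2}{1/2 \cdot 1/2} = -\frac{1}{3}$
$T_2 = 2 \cdot \left(-\frac{1}{3}\right) (2 + 3 + 4 + 6 + 8 + 12) = -\frac{70}{3}$
Hence,
$\sigma_{\hat{y}_{HT}}^2 = T_1 + T_2 = 30 - \frac{70}{3} = \frac{20}{3} = 6.67$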

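72 Using the classical expressions:
$\sigma_{\hat{y}}^2 = \frac{1}{S} \sum_{s=1}^{S} \left( \hat{y}_s - \frac{1}{S} \sum_{s=1}^{S} \hat{y}_s \right)^2$
$\quad = \frac{(6.0 - 10)^2 + (8.0 - 10)^2 + (10.0 - 10)^2 + (10.0 - 10)^2 + (12.0 - 10)^2 + (14.0 - 10)^2}{6} = \frac{40}{6} = 6.67$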
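The agreement between the Horvitz-Thompson variance formula and the direct enumeration over all samples can be verified with exact fractions. The sketch below assumes the artificial population Y = (1, 2, 3, 4) and SRS without replacement with n = 2; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

Y = [1, 2, 3, 4]                                  # artificial population
units = range(len(Y))
samples = list(combinations(units, 2))            # 6 equally likely samples
p = Fraction(1, len(samples))

# first- and second-order inclusion probabilities, obtained by enumeration
pi = [sum(p for s in samples if i in s) for i in units]
pij = {(i, j): sum(p for s in samples if i in s and j in s)
       for i, j in combinations(units, 2)}

# variance from the Horvitz-Thompson formula: T1 + T2
t1 = sum((1 - pi[i]) / pi[i] * Y[i] ** 2 for i in units)
t2 = 2 * sum((pij[i, j] - pi[i] * pi[j]) / (pi[i] * pi[j]) * Y[i] * Y[j]
             for i, j in pij)
var_formula = t1 + t2

# variance from enumerating the estimator over all samples
estimates = [sum(Y[i] / pi[i] for i in s) for s in samples]
mean_est = sum(p * e for e in estimates)
var_enum = sum(p * (e - mean_est) ** 2 for e in estimates)
```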

73 SRS With Replacement
The clusters in the population are:
$P = \{1\}, \{2\}, \{3\}, \{4\}$
with samples:
$S = \{1, 2\}, \{1, 3\}, \{1, 4\}, \{2, 3\}, \{2, 4\}, \{3, 4\}$,
$\{1, 1\} \equiv \{1\}$, $\{2, 2\} \equiv \{2\}$, $\{3, 3\} \equiv \{3\}$, $\{4, 4\} \equiv \{4\}$

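74 The probability of selecting a 1 (or any other unit) is
$\pi_I = \frac{1}{4}\, P(\text{sample with 1 element}) + \frac{1}{2}\, P(\text{sample with 2 elements}) = \frac{1}{4} \cdot \frac{4}{16} + \frac{1}{2} \cdot \frac{12}{16} = \frac{7}{16}$
The estimator:
In a sample with one element:
$\hat{y}_{HT} = \frac{y_1}{7/16} = \frac{16}{7}\, y_1$
In a sample with two elements:
$\hat{y}_{HT} = \frac{y_1}{7/16} + \frac{y_2}{7/16} = \frac{16}{7} (y_1 + y_2)$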

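75 Enumeration of the estimator:
s | Sample | P_s | ŷ | ŷ_HT
1 | {1,2} | 2/16 | 6.0 | 48/7 = 6.86
2 | {1,3} | 2/16 | 8.0 | 64/7 = 9.14
3 | {1,4} | 2/16 | 10.0 | 80/7 = 11.43
4 | {2,3} | 2/16 | 10.0 | 80/7 = 11.43
5 | {2,4} | 2/16 | 12.0 | 96/7 = 13.71
6 | {3,4} | 2/16 | 14.0 | 112/7 = 16.00
7 | {1,1} | 1/16 | 4.0 | 16/7 = 2.29
8 | {2,2} | 1/16 | 8.0 | 32/7 = 4.57
9 | {3,3} | 1/16 | 12.0 | 48/7 = 6.86
10 | {4,4} | 1/16 | 16.0 | 64/7 = 9.14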

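76 The expectation of the estimator:
$E(\hat{y}_{HT}) = \frac{2}{16} \cdot \frac{480}{7} + \frac{1}{16} \cdot \frac{160}{7} = \frac{70}{7} = 10$
Thus, the estimator is unbiased, but different from the classical one.
The variance:
$\sigma_{\hat{y}_{HT}}^2 = \sum_{I=1}^{4} \frac{1 - \pi_I}{\pi_I}\, Y_I^2 + 2 \sum_{I=1}^{3} \sum_{J=I+1}^{4} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J = T_1 + T_2$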

77 with
$T_1 = \sum_{I=1}^{4} \frac{1 - 7/16}{7/16}\, Y_I^2 = \frac{9}{7} (1 + 4 + 9 + 16) = \frac{270}{7} = 38.57$
$\pi_{IJ} = P(\text{selecting two units simultaneously}) = \frac{2}{16}$

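78
$\frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J} = \frac{2/16 - 7/16 \cdot 7/16}{7/16 \cdot 7/16} = -\frac{17}{49}$
Hence,
$T_2 = 2 \cdot \left(-\frac{17}{49}\right) (2 + 3 + 4 + 6 + 8 + 12) = -\frac{170}{7} = -24.29$
$\sigma_{\hat{y}_{HT}}^2 = T_1 + T_2 = \frac{270}{7} - \frac{170}{7} = \frac{100}{7} = 14.29$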

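79 Using the conventional estimator:
$\sigma_{\hat{y}}^2 = \sum_{s=1}^{S} P_s \left( \hat{y}_s - \sum_{s=1}^{S} P_s \hat{y}_s \right)^2$
$\quad = \frac{2}{16} \left[ (6.0 - 10)^2 + (8.0 - 10)^2 + (10.0 - 10)^2 + (10.0 - 10)^2 + (12.0 - 10)^2 + (14.0 - 10)^2 \right]$
$\qquad + \frac{1}{16} \left[ (4.0 - 10)^2 + (8.0 - 10)^2 + (12.0 - 10)^2 + (16.0 - 10)^2 \right]$
$\quad = 5.0 + 5.0 = 10.0$
Hence, the HT estimator is different from, and less efficient than, the ordinary SRS estimator with replacement.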
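The comparison between the two estimators under sampling with replacement can be reproduced by enumerating the 16 equally likely ordered draws. This is a sketch under the assumption Y = (1, 2, 3, 4) and n = 2; the helper `mean_and_variance` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

Y = {1: 1, 2: 2, 3: 3, 4: 4}                  # artificial population
N, n = 4, 2
draws = list(product(Y, repeat=n))            # 16 equally likely ordered draws
p = Fraction(1, len(draws))

# inclusion probability of a unit (the same for every unit)
pi = sum(p for d in draws if 1 in d)

# Horvitz-Thompson: only distinct units count
ht = [sum(Y[u] for u in set(d)) / pi for d in draws]
# conventional estimator: every draw counts, repeated units included
conventional = [Fraction(N, n) * sum(Y[u] for u in d) for d in draws]

def mean_and_variance(values):
    """Expectation and variance over the equally likely draws."""
    m = sum(p * v for v in values)
    return m, sum(p * (v - m) ** 2 for v in values)

ht_mean, ht_var = mean_and_variance(ht)
conv_mean, conv_var = mean_and_variance(conventional)
```

Both estimators are unbiased for the total of 10, but the HT variance (100/7) exceeds the conventional one (10).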

80 Selection With Unequal Probabilities
Consider the following set of selection probabilities for the units:
Unit | p_i
1 | 1/2
2 | 1/6
3 | 1/6
4 | 1/6

81 Probability of selecting the various samples (ordered by first and second draw):
Sample | p_s | Sample | p_s
{1,2} | 1/2 · 1/3 = 1/6 | {3,1} | 1/6 · 3/5 = 1/10
{1,3} | 1/2 · 1/3 = 1/6 | {3,2} | 1/6 · 1/5 = 1/30
{1,4} | 1/2 · 1/3 = 1/6 | {3,4} | 1/6 · 1/5 = 1/30
{2,1} | 1/6 · 3/5 = 1/10 | {4,1} | 1/6 · 3/5 = 1/10
{2,3} | 1/6 · 1/5 = 1/30 | {4,2} | 1/6 · 1/5 = 1/30
{2,4} | 1/6 · 1/5 = 1/30 | {4,3} | 1/6 · 1/5 = 1/30

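82 The probabilities of selecting the various units into the samples:
$\pi_1 = 3 \cdot \frac{1}{6} + 3 \cdot \frac{1}{10} = \frac{4}{5}$
$\pi_2 = \pi_3 = \pi_4 = \frac{1}{6} + \frac{1}{10} + 4 \cdot \frac{1}{30} = \frac{2}{5}$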

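83 The estimator:
s | Sample | ŷ_HT | π_IJ
1 | {1,2} | 1/(4/5) + 2/(2/5) = 6.25 | 1/6 + 1/10 = 4/15
2 | {1,3} | 1/(4/5) + 3/(2/5) = 8.75 | 1/6 + 1/10 = 4/15
3 | {1,4} | 1/(4/5) + 4/(2/5) = 11.25 | 1/6 + 1/10 = 4/15
4 | {2,3} | 2/(2/5) + 3/(2/5) = 12.50 | 1/30 + 1/30 = 1/15
5 | {2,4} | 2/(2/5) + 4/(2/5) = 15.00 | 1/30 + 1/30 = 1/15
6 | {3,4} | 3/(2/5) + 4/(2/5) = 17.50 | 1/30 + 1/30 = 1/15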

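84 The expectation of the estimator:
$E(\hat{y}_{HT}) = \frac{4}{15} (6.25 + 8.75 + 11.25) + \frac{1}{15} (12.5 + 15 + 17.5) = 7 + 3 = 10$
The variance:
$\sigma_{\hat{y}_{HT}}^2 = \sum_{I=1}^{4} \frac{1 - \pi_I}{\pi_I}\, Y_I^2 + 2 \sum_{I=1}^{3} \sum_{J=I+1}^{4} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J = T_1 + T_2$
with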

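85
$T_1 = \frac{1 - 4/5}{4/5} \cdot 1 + \frac{1 - 2/5}{2/5} (4 + 9 + 16) = 0.25 + 43.5 = 43.75$
$T_2 = 2 \sum_{J=2}^{4} \frac{\pi_{1J} - \pi_1 \pi_J}{\pi_1 \pi_J}\, Y_1 Y_J + 2 \sum_{2 \le I < J} \frac{\pi_{IJ} - \pi_I \pi_J}{\pi_I \pi_J}\, Y_I Y_J$
$\quad = 2 \cdot \frac{4/15 - 4/5 \cdot 2/5}{4/5 \cdot 2/5} (2 + 3 + 4) + 2 \cdot \frac{1/15 - 2/5 \cdot 2/5}{2/5 \cdot 2/5} (6 + 8 + 12)$
$\quad = 2 \cdot \left(-\frac{1}{6}\right) \cdot 9 + 2 \cdot \left(-\frac{7}{12}\right) \cdot 26 = -3 - 30.33 = -33.33$
Hence,
$\sigma_{\hat{y}_{HT}}^2 = T_1 + T_2 = 43.75 - 33.33 = 10.42$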
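The unequal-probability case can be checked in the same way. The sketch below builds the sample probabilities from two successive draws without replacement, assuming first-draw probabilities (1/2, 1/6, 1/6, 1/6) and a second draw proportional to the remaining probabilities; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

Y = {1: 1, 2: 2, 3: 3, 4: 4}                       # artificial population
p1 = {1: Fraction(1, 2), 2: Fraction(1, 6),
      3: Fraction(1, 6), 4: Fraction(1, 6)}        # first-draw probabilities

# probability of each unordered sample: first draw a, then b among the remaining units
prob = {}
for a, b in permutations(Y, 2):
    key = frozenset((a, b))
    prob[key] = prob.get(key, 0) + p1[a] * p1[b] / (1 - p1[a])

# inclusion probabilities and Horvitz-Thompson estimates per sample
pi = {u: sum(q for s, q in prob.items() if u in s) for u in Y}
est = {s: sum(Y[u] / pi[u] for u in s) for s in prob}

mean_ht = sum(prob[s] * est[s] for s in prob)
var_ht = sum(prob[s] * (est[s] - mean_ht) ** 2 for s in prob)
```

The enumeration gives a variance of 125/12, which rounds to the 10.42 obtained from T1 + T2.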


More information

1 Answers to the Sept 08 macro prelim - Long Questions

1 Answers to the Sept 08 macro prelim - Long Questions Answers to the Sept 08 macro prelim - Long Questions. Suppose that a representative consumer receives an endowment of a non-storable consumption good. The endowment evolves exogenously according to ln

More information

VARIANCE ESTIMATION FROM CALIBRATED SAMPLES

VARIANCE ESTIMATION FROM CALIBRATED SAMPLES VARIANCE ESTIMATION FROM CALIBRATED SAMPLES Douglas Willson, Paul Kirnos, Jim Gallagher, Anka Wagner National Analysts Inc. 1835 Market Street, Philadelphia, PA, 19103 Key Words: Calibration; Raking; Variance

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis These models are appropriate when the response

More information

Crash Involvement Studies Using Routine Accident and Exposure Data: A Case for Case-Control Designs

Crash Involvement Studies Using Routine Accident and Exposure Data: A Case for Case-Control Designs Crash Involvement Studies Using Routine Accident and Exposure Data: A Case for Case-Control Designs H. Hautzinger* *Institute of Applied Transport and Tourism Research (IVT), Kreuzaeckerstr. 15, D-74081

More information

A potentially useful approach to model nonlinearities in time series is to assume different behavior (structural break) in different subsamples

A potentially useful approach to model nonlinearities in time series is to assume different behavior (structural break) in different subsamples 1.3 Regime switching models A potentially useful approach to model nonlinearities in time series is to assume different behavior (structural break) in different subsamples (or regimes). If the dates, the

More information

Chapter 2 Uncertainty Analysis and Sampling Techniques

Chapter 2 Uncertainty Analysis and Sampling Techniques Chapter 2 Uncertainty Analysis and Sampling Techniques The probabilistic or stochastic modeling (Fig. 2.) iterative loop in the stochastic optimization procedure (Fig..4 in Chap. ) involves:. Specifying

More information

Weight Smoothing with Laplace Prior and Its Application in GLM Model

Weight Smoothing with Laplace Prior and Its Application in GLM Model Weight Smoothing with Laplace Prior and Its Application in GLM Model Xi Xia 1 Michael Elliott 1,2 1 Department of Biostatistics, 2 Survey Methodology Program, University of Michigan National Cancer Institute

More information

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede,

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, mb8@ecs.soton.ac.uk The normal distribution The normal distribution is the classic "bell curve". We've seen that

More information

A Wildfire Risk Assessment for Belgium. Overview. The Wildfires of National Wildfire Action Plan. Notes. Notes. Notes. Notes

A Wildfire Risk Assessment for Belgium. Overview. The Wildfires of National Wildfire Action Plan. Notes. Notes. Notes. Notes A for Belgium Prof. dr. ir. ir. Arthur Depicker Faculty of bioscience engineering Ghent University December 11, 2018 (Faculty of bioscience engineering Ghent University)December 11, 2018 1 / 21 1 (Faculty

More information

Risk Decomposition for Portfolio Simulations

Risk Decomposition for Portfolio Simulations Risk Decomposition for Portfolio Simulations Marco Marchioro www.statpro.com Version 1.0 April 2010 Abstract We describe a method to compute the decomposition of portfolio risk in additive asset components

More information

Context Power analyses for logistic regression models fit to clustered data

Context Power analyses for logistic regression models fit to clustered data . Power Analysis for Logistic Regression Models Fit to Clustered Data: Choosing the Right Rho. CAPS Methods Core Seminar Steve Gregorich May 16, 2014 CAPS Methods Core 1 SGregorich Abstract Context Power

More information

Belgium 1997: Survey Information

Belgium 1997: Survey Information Belgium 1997: Survey Information This document is based upon the Methodological guidelines of the Socio-Economic Panel 1997, compiled at the Center for Social Policy in the University of Antwerp. Table

More information

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998 Economics 312 Sample Project Report Jeffrey Parker Introduction This project is based on Exercise 2.12 on page 81 of the Hill, Griffiths, and Lim text. It examines how the sale price of houses in Stockton,

More information

Calibration approach estimators in stratified sampling

Calibration approach estimators in stratified sampling Statistics & Probability Letters 77 (2007) 99 103 www.elsevier.com/locate/stapro Calibration approach estimators in stratified sampling Jong-Min Kim a,, Engin A. Sungur a, Tae-Young Heo b a Division of

More information

Final Quality Report SILC2010- BELGIUM. Longitudinal report ( )

Final Quality Report SILC2010- BELGIUM. Longitudinal report ( ) Final Quality Report SILC2010- BELGIUM Longitudinal report (2007-2010) 1 0. Introduction This report contains a description of the accuracy, precision and comparability of the Belgian SILC2007 to SILC2010-surveydata.

More information

Exercises on the New-Keynesian Model

Exercises on the New-Keynesian Model Advanced Macroeconomics II Professor Lorenza Rossi/Jordi Gali T.A. Daniël van Schoot, daniel.vanschoot@upf.edu Exercises on the New-Keynesian Model Schedule: 28th of May (seminar 4): Exercises 1, 2 and

More information

Poststratification with PROC SURVEYMEANS

Poststratification with PROC SURVEYMEANS Poststratification with PROC SURVEYMEANS Overview When a population can be partitioned into homogeneous groups and there is significant heterogeneity between those groups, stratified sampling can substantially

More information

Lecture 22. Survey Sampling: an Overview

Lecture 22. Survey Sampling: an Overview Math 408 - Mathematical Statistics Lecture 22. Survey Sampling: an Overview March 25, 2013 Konstantin Zuev (USC) Math 408, Lecture 22 March 25, 2013 1 / 16 Survey Sampling: What and Why In surveys sampling

More information

Business Demography. Introduction. F. Verduyn

Business Demography. Introduction. F. Verduyn Business Demography F. Verduyn Introduction This article analyses the demographic evolution of Belgian companies in the period from 2001 to. In the same way as, when considering the demography of a population,

More information

STRATEGIES FOR THE ANALYSIS OF IMPUTED DATA IN A SAMPLE SURVEY

STRATEGIES FOR THE ANALYSIS OF IMPUTED DATA IN A SAMPLE SURVEY STRATEGIES FOR THE ANALYSIS OF IMPUTED DATA IN A SAMPLE SURVEY James M. Lepkowski. Sharon A. Stehouwer. and J. Richard Landis The University of Mic6igan The National Medical Care Utilization and Expenditure

More information

THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management

THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management BA 386T Tom Shively PROBABILITY CONCEPTS AND NORMAL DISTRIBUTIONS The fundamental idea underlying any statistical

More information

Chapter 5 Basic Probability

Chapter 5 Basic Probability Chapter 5 Basic Probability Probability is determining the probability that a particular event will occur. Probability of occurrence = / T where = the number of ways in which a particular event occurs

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Linear Regression with One Regressor Michael Ash Lecture 9 Linear Regression with One Regressor Review of Last Time 1. The Linear Regression Model The relationship between independent X and dependent Y

More information

Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN)

Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN) FINANCIAL SERVICES SECTOR SURVEY Final Report April 217 Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN) Table of Contents 1 Introduction... 3

More information

Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006)

Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006) Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006) Assignment 1, due lecture 3 at the beginning of class 1. Lohr 1.1 2. Lohr 1.2 3. Lohr 1.3 4. Download data from the CBS

More information

Planning Sample Size for Randomized Evaluations Esther Duflo J-PAL

Planning Sample Size for Randomized Evaluations Esther Duflo J-PAL Planning Sample Size for Randomized Evaluations Esther Duflo J-PAL povertyactionlab.org Planning Sample Size for Randomized Evaluations General question: How large does the sample need to be to credibly

More information

Stratified Sampling in Monte Carlo Simulation: Motivation, Design, and Sampling Error

Stratified Sampling in Monte Carlo Simulation: Motivation, Design, and Sampling Error South Texas Project Risk- Informed GSI- 191 Evaluation Stratified Sampling in Monte Carlo Simulation: Motivation, Design, and Sampling Error Document: STP- RIGSI191- ARAI.03 Revision: 1 Date: September

More information

The Serbia 2013 Enterprise Surveys Data Set

The Serbia 2013 Enterprise Surveys Data Set I. Introduction The Serbia 2013 Enterprise Surveys Data Set 1. This document provides additional information on the data collected in Serbia between January 2013 and August 2013 as part of the fifth round

More information

Module 2: Monte Carlo Methods

Module 2: Monte Carlo Methods Module 2: Monte Carlo Methods Prof. Mike Giles mike.giles@maths.ox.ac.uk Oxford University Mathematical Institute MC Lecture 2 p. 1 Greeks In Monte Carlo applications we don t just want to know the expected

More information

EUR 10 Billion Mortgage Pandbrieven Programme

EUR 10 Billion Mortgage Pandbrieven Programme EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date: Reporting Date: 3/12/2013 Date of Previous report: 7/11/2013 Contact Details: Head of Treasury Wilfried Wouters 0032 2 222 5718 wilfried.wouters@belfius.be

More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

The Binomial Distribution

The Binomial Distribution Patrick Breheny September 13 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 16 Outcomes and summary statistics Random variables Distributions So far, we have discussed the

More information

SAMPLE ALLOCATION AND SELECTION FOR THE NATIONAL COMPENSATION SURVEY

SAMPLE ALLOCATION AND SELECTION FOR THE NATIONAL COMPENSATION SURVEY SAMPLE ALLOCATION AND SELECTION FOR THE NATIONAL COMPENSATION SURVEY Lawrence R. Ernst, Christopher J. Guciardo, Chester H. Ponikowski, and Jason Tehonica Ernst_L@bls.gov, Guciardo_C@bls.gov, Ponikowski_C@bls.gov,

More information

A random variable (r. v.) is a variable whose value is a numerical outcome of a random phenomenon.

A random variable (r. v.) is a variable whose value is a numerical outcome of a random phenomenon. Chapter 14: random variables p394 A random variable (r. v.) is a variable whose value is a numerical outcome of a random phenomenon. Consider the experiment of tossing a coin. Define a random variable

More information

STEP Survey Weighting Procedures Summary (Based on The World Bank Weight Requirement) Lao PDR. October 11, 2013

STEP Survey Weighting Procedures Summary (Based on The World Bank Weight Requirement) Lao PDR. October 11, 2013 October 11, 2013 STEP Survey Weighting Procedures Summary (Based on The World Bank Weight Requirement) Lao PDR October 11, 2013 2 October 11, 2013 Table of Contents 1 Survey Design Overview... 1 2 Data

More information

The Two-Sample Independent Sample t Test

The Two-Sample Independent Sample t Test Department of Psychology and Human Development Vanderbilt University 1 Introduction 2 3 The General Formula The Equal-n Formula 4 5 6 Independence Normality Homogeneity of Variances 7 Non-Normality Unequal

More information

Current Population Survey (CPS)

Current Population Survey (CPS) Current Population Survey (CPS) 1 Background The Current Population Survey (CPS), sponsored jointly by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics (BLS), is the primary source of labor

More information

CLUSTER SAMPLING. 1 Estimation of a Population Mean and Total. 1.1 Notations. 1.2 Estimators. STAT 631 Survey Sampling Fall 2003

CLUSTER SAMPLING. 1 Estimation of a Population Mean and Total. 1.1 Notations. 1.2 Estimators. STAT 631 Survey Sampling Fall 2003 CLUSTER SAMPLING Definition 1 A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements. Cluster sampling is less costly than simple or stratified random

More information

Planning Sample Size for Randomized Evaluations

Planning Sample Size for Randomized Evaluations Planning Sample Size for Randomized Evaluations Jed Friedman, World Bank SIEF Regional Impact Evaluation Workshop Beijing, China July 2009 Adapted from slides by Esther Duflo, J-PAL Planning Sample Size

More information

Comparative Study of Electoral Systems (CSES) Module 4: Design Report (Sample Design and Data Collection Report) September 10, 2012

Comparative Study of Electoral Systems (CSES) Module 4: Design Report (Sample Design and Data Collection Report) September 10, 2012 Comparative Study of Electoral Systems 1 Comparative Study of Electoral Systems (CSES) (Sample Design and Data Collection Report) September 10, 2012 Country: Norway Date of Election: September 8-9 th 2013

More information

Risk management. Introduction to the modeling of assets. Christian Groll

Risk management. Introduction to the modeling of assets. Christian Groll Risk management Introduction to the modeling of assets Christian Groll Introduction to the modeling of assets Risk management Christian Groll 1 / 109 Interest rates and returns Interest rates and returns

More information

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date : Reporting Date: 30/09/2014 Date of Previous Report: 29/08/2014 Contact Details : Head of Treasury Jean-François Deschamps +3222226941 jean-francois.deschamps@belfius.be

More information

A Stochastic Reserving Today (Beyond Bootstrap)

A Stochastic Reserving Today (Beyond Bootstrap) A Stochastic Reserving Today (Beyond Bootstrap) Presented by Roger M. Hayne, PhD., FCAS, MAAA Casualty Loss Reserve Seminar 6-7 September 2012 Denver, CO CAS Antitrust Notice The Casualty Actuarial Society

More information

FINAL EXAM STAT 5201 Spring 2011

FINAL EXAM STAT 5201 Spring 2011 FINAL EXAM STAT 5201 Spring 2011 Due in Room 313 Ford Hall Friday May 13 at 3:45 PM Please deliver to the office staff of the School of Statistics READ BEFORE STARTING You must work alone and may discuss

More information

Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN)

Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN) FINANCIAL SERVICES SECTOR SURVEY Report April 2015 Survey conducted by GfK On behalf of the Directorate General for Economic and Financial Affairs (DG ECFIN) Table of Contents 1 Introduction... 3 2 Survey

More information

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times. Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are

More information

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012 The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re

More information

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS STA 105-M (BASIC STATISTICS) READ THE INSTRUCTIONS BELOW VERY CAREFULLY.

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS STA 105-M (BASIC STATISTICS) READ THE INSTRUCTIONS BELOW VERY CAREFULLY. DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 003 MOCK EXAMINATIONS STA 105-M (BASIC STATISTICS) Time: hours READ THE INSTRUCTIONS BELOW VERY CAREFULLY. Do not open this question paper until you have been told

More information

Introduction to Meta-Analysis

Introduction to Meta-Analysis Introduction to Meta-Analysis by Michael Borenstein, Larry V. Hedges, Julian P. T Higgins, and Hannah R. Rothstein PART 2 Effect Size and Precision Summary of Chapter 3: Overview Chapter 5: Effect Sizes

More information

TABLE OF CONTENTS - VOLUME 2

TABLE OF CONTENTS - VOLUME 2 TABLE OF CONTENTS - VOLUME 2 CREDIBILITY SECTION 1 - LIMITED FLUCTUATION CREDIBILITY PROBLEM SET 1 SECTION 2 - BAYESIAN ESTIMATION, DISCRETE PRIOR PROBLEM SET 2 SECTION 3 - BAYESIAN CREDIBILITY, DISCRETE

More information

Macroeconomic Models of Economic Growth

Macroeconomic Models of Economic Growth Macroeconomic Models of Economic Growth J.R. Walker U.W. Madison Econ448: Human Resources and Economic Growth Summary Solow Model [Pop Growth] The simplest Solow model (i.e., with exogenous population

More information

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date : Reporting Date: 29/02/2016 Date of Previous Report: 29/01/2016 Contact Details : Head of Treasury Jean-François Deschamps +3222226941 jean-francois.deschamps@belfius.be

More information

Diploma in Business Administration Part 2. Quantitative Methods. Examiner s Suggested Answers

Diploma in Business Administration Part 2. Quantitative Methods. Examiner s Suggested Answers Cumulative frequency Diploma in Business Administration Part Quantitative Methods Examiner s Suggested Answers Question 1 Cumulative Frequency Curve 1 9 8 7 6 5 4 3 1 5 1 15 5 3 35 4 45 Weeks 1 (b) x f

More information

To be two or not be two, that is a LOGISTIC question

To be two or not be two, that is a LOGISTIC question MWSUG 2016 - Paper AA18 To be two or not be two, that is a LOGISTIC question Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT A binary response is very common in logistic regression

More information

Anomalies under Jackknife Variance Estimation Incorporating Rao-Shao Adjustment in the Medical Expenditure Panel Survey - Insurance Component 1

Anomalies under Jackknife Variance Estimation Incorporating Rao-Shao Adjustment in the Medical Expenditure Panel Survey - Insurance Component 1 Anomalies under Jackknife Variance Estimation Incorporating Rao-Shao Adjustment in the Medical Expenditure Panel Survey - Insurance Component 1 Robert M. Baskin 1, Matthew S. Thompson 2 1 Agency for Healthcare

More information

Bayesian Linear Model: Gory Details

Bayesian Linear Model: Gory Details Bayesian Linear Model: Gory Details Pubh7440 Notes By Sudipto Banerjee Let y y i ] n i be an n vector of independent observations on a dependent variable (or response) from n experimental units. Associated

More information

Financial Time Series Analysis (FTSA)

Financial Time Series Analysis (FTSA) Financial Time Series Analysis (FTSA) Lecture 6: Conditional Heteroscedastic Models Few models are capable of generating the type of ARCH one sees in the data.... Most of these studies are best summarized

More information

A probability distribution shows the possible outcomes of an experiment and the probability of each of these outcomes.

A probability distribution shows the possible outcomes of an experiment and the probability of each of these outcomes. Introduction In the previous chapter we discussed the basic concepts of probability and described how the rules of addition and multiplication were used to compute probabilities. In this chapter we expand

More information

How to Hit Several Targets at Once: Impact Evaluation Sample Design for Multiple Variables

How to Hit Several Targets at Once: Impact Evaluation Sample Design for Multiple Variables How to Hit Several Targets at Once: Impact Evaluation Sample Design for Multiple Variables Craig Williamson, EnerNOC Utility Solutions Robert Kasman, Pacific Gas and Electric Company ABSTRACT Many energy

More information

ILO-IPEC Interactive Sampling Tools No. 7

ILO-IPEC Interactive Sampling Tools No. 7 ILO-IPEC Interactive Sampling Tools No. 7 Version 1 December 2014 International Programme on the Elimination of Child Labour (IPEC) Fundamental Principles and Rights at Work (FPRW) Branch Governance and

More information

The Armenia 2013 Enterprise Surveys Data Set

The Armenia 2013 Enterprise Surveys Data Set I. Introduction The Armenia 2013 Enterprise Surveys Data Set 1. This document provides additional information on the data collected in Armenia between November 2012 and July 2013 as part of the fifth round

More information

Some aspects of using calibration in polish surveys

Some aspects of using calibration in polish surveys Some aspects of using calibration in polish surveys Marcin Szymkowiak Statistical Office in Poznań University of Economics in Poznań in NCPH 2011 in business statistics simulation study Outline Outline

More information

Programming periods and

Programming periods and EGESIF_16-0014-01 0/01//017 EUROPEAN COMMISSION Guidance on sampling methods for audit authorities Programming periods 007-013 and 014-00 DISCLAIMER: "This is a working document prepared by the Commission

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date : Reporting Date: 30/04/2018 Date of Previous Report: 29/03/2018 Contact Details : Head of Treasury Jean-François Deschamps +3222226941 jean-francois.deschamps@belfius.be

More information

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date : Reporting Date: 29/06/2018 Date of Previous Report: 31/05/2018 Contact Details : Head of Treasury Jean-François Deschamps +3222226941 jean-francois.deschamps@belfius.be

More information

SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data

SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 5, 2015

More information

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury

EUR 10 Billion Mortgage Pandbrieven Programme. Reporting Date : Contact Details : Website : Remark : Head of Treasury EUR 10 Billion Mortgage Pandbrieven Programme Reporting Date : Reporting Date: 31/10/2018 Date of Previous Report: 28/09/2018 Contact Details : Head of Treasury Jean-François Deschamps +3222226941 jean-francois.deschamps@belfius.be

More information

Chapter 7. Sampling Distributions and the Central Limit Theorem

Chapter 7. Sampling Distributions and the Central Limit Theorem Chapter 7. Sampling Distributions and the Central Limit Theorem 1 Introduction 2 Sampling Distributions related to the normal distribution 3 The central limit theorem 4 The normal approximation to binomial

More information

Superiority by a Margin Tests for the Ratio of Two Proportions

Superiority by a Margin Tests for the Ratio of Two Proportions Chapter 06 Superiority by a Margin Tests for the Ratio of Two Proportions Introduction This module computes power and sample size for hypothesis tests for superiority of the ratio of two independent proportions.

More information

Lecture 8. The Binomial Distribution. Binomial Distribution. Binomial Distribution. Probability Distributions: Normal and Binomial

Lecture 8. The Binomial Distribution. Binomial Distribution. Binomial Distribution. Probability Distributions: Normal and Binomial Lecture 8 The Binomial Distribution Probability Distributions: Normal and Binomial 1 2 Binomial Distribution >A binomial experiment possesses the following properties. The experiment consists of a fixed

More information

9. Logit and Probit Models For Dichotomous Data

9. Logit and Probit Models For Dichotomous Data Sociology 740 John Fox Lecture Notes 9. Logit and Probit Models For Dichotomous Data Copyright 2014 by John Fox Logit and Probit Models for Dichotomous Responses 1 1. Goals: I To show how models similar

More information

LESSON 9: BINOMIAL DISTRIBUTION

LESSON 9: BINOMIAL DISTRIBUTION LESSON 9: Outline The context The properties Notation Formula Use of table Use of Excel Mean and variance 1 THE CONTEXT An important property of the binomial distribution: An outcome of an experiment is

More information

BZComparative Study of Electoral Systems (CSES) Module 3: Sample Design and Data Collection Report June 05, 2006

BZComparative Study of Electoral Systems (CSES) Module 3: Sample Design and Data Collection Report June 05, 2006 Comparative Study of Electoral Systems 1 BZComparative Study of Electoral Systems (CSES) Module 3: Sample Design and Data Collection Report June 05, 2006 Country: NORWAY Date of Election: SEPTEMBER 12,

More information

Welcome to EMOS Webinar 26 April Introduction to survey sampling. Ralf Münnich University of Trier Economic and Social Statistics

Welcome to EMOS Webinar 26 April Introduction to survey sampling. Ralf Münnich University of Trier Economic and Social Statistics Welcome to EMOS Webinar 26 April 2017 16.30-18.00 Introduction to survey sampling Ralf Münnich University of Trier Economic and Social Statistics EMOS Webinar Introduction to Survey Sampling Ralf Münnich

More information

APPENDIX A SAMPLE DESIGN

APPENDIX A SAMPLE DESIGN APPENDIX A SAMPLE DESIGN APPENDIX A SAMPLE DESIGN A.1 Introduction The 1995 Eritrea Demographic and Health Survey (EDHS) covered the population residing in private households throughout the country. The

More information

Statistics for Business and Economics

Statistics for Business and Economics Statistics for Business and Economics Chapter 7 Estimation: Single Population Copyright 010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 7-1 Confidence Intervals Contents of this chapter: Confidence

More information