Systematic and Complex Sampling
Professor Ron Fricker
Naval Postgraduate School
Monterey, California
Reading Assignment: Scheaffer, Mendenhall, Ott, & Gerow, Chapter 7.1-7.4

Goals for this Lecture
- Define systematic sampling
  - Examples
  - Estimators (assuming SRS equivalence)
- Discuss examples of complex sampling designs
  - Explain the Kish grid
- Introduce variance estimation under complex designs

What is Systematic Sampling?
- Systematic sampling: Given a list of items, select every k-th element in the list
  - Start by randomly selecting the first item from the first k elements
- Basis for how random searches of cars coming onto a base are done
- Often useful for things like sampling visitors to a web site
- Recently wrote a sampling methodology for INSURV based on systematic sampling
  - See http://faculty.nps.edu/rdfricke/docs/nps-or-12-001.pdf

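A minimal sketch of the selection rule above, assuming the frame is an in-memory Python list. The function name `systematic_sample` and the toy frame of 100 items are illustrative and not from the lecture.

```python
import random


def systematic_sample(items, k, rng=None):
    """Return a 1-in-k systematic sample of `items`.

    The first element is chosen at random from the first k elements,
    then every k-th element after it is taken.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = rng or random.Random()
    start = rng.randrange(k)  # 0-based position within the first k elements
    return items[start::k]


# Hypothetical frame: items numbered 1..100, sampled 1-in-10.
frame = list(range(1, 101))
sample = systematic_sample(frame, k=10, rng=random.Random(0))
print(sample)
```

Passing a seeded `random.Random` makes the draw reproducible; in field use the start would come from whatever random mechanism is available.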
Advantages and Disadvantages of Systematic Sampling
- Advantages:
  - Can be easier to perform in the field
  - Less subject to selection errors by fieldworkers
  - Can provide more information per unit cost than SRS
- Potential disadvantages:
  - If the list varies systematically in a cycle of approximately every k-th item, then sampling every k-th item can introduce a bias in the result
  - May be harder to estimate variance in some situations

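The cyclic-list problem can be illustrated with a small sketch. The data are invented for illustration: a hypothetical daily count that spikes on the first day of every 7-day week. The helper name `systematic_means` is also an assumption.

```python
def systematic_means(values, k):
    """Return the sample mean for each of the k possible starting points."""
    means = []
    for start in range(k):
        sample = values[start::k]
        means.append(sum(sample) / len(sample))
    return means


# Hypothetical daily counts over 10 weeks: 100 on day 0 of each week, 10 otherwise.
daily = [100 if day % 7 == 0 else 10 for day in range(70)]
population_mean = sum(daily) / len(daily)

aligned = systematic_means(daily, k=7)  # interval equals the cycle length
offset = systematic_means(daily, k=5)   # interval out of step with the cycle

print("population mean:", round(population_mean, 2))
print("k=7 sample means:", aligned)
print("k=5 sample means:", [round(m, 2) for m in offset])
```

With k equal to the cycle length, every possible sample sees only spike days or only ordinary days, so no single sample lands near the population mean. With k = 5 each sample cuts across the cycle and, in this toy case, reproduces the population mean.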
When To Use Systematic Sampling
- If probability sampling is too complicated to implement in the field
  - E.g., unreasonable to expect INSURV inspectors either to generate a random list of items to inspect or to run around the ship/submarine to inspect a random set of items
- When generating a sampling frame list is impossible or too hard
  - Can be more effective and efficient to simply survey every k-th item encountered
  - E.g., every k-th visitor to a web site

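When no frame exists, the same rule can be applied to items as they arrive. The sketch below assumes arrivals are exposed as a Python iterator; `every_kth` and the visitor numbering are hypothetical names chosen for illustration.

```python
import random
from itertools import islice


def every_kth(stream, k, start=None, rng=None):
    """Lazily yield every k-th item of `stream`.

    `start` is the 0-based position of the first selected item; if omitted,
    it is drawn at random from the first k positions.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if start is None:
        rng = rng or random.Random()
        start = rng.randrange(k)
    return islice(stream, start, None, k)


# Hypothetical stream of 30 visitors, numbered in arrival order.
visitors = iter(range(1, 31))
print(list(every_kth(visitors, k=5, start=2)))  # start fixed here for repeatability
```

Because `islice` consumes the stream one item at a time, the full list of arrivals never has to be stored or known in advance.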
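Mean Estimation Summary (Assuming SRS Equivalency)
- Estimator for the mean:
  \[ \bar{y}_{sy} = \frac{1}{n} \sum_{i=1}^{n} y_i \]
- Estimated variance of \bar{y}_{sy}:
  \[ \widehat{\mathrm{Var}}(\bar{y}_{sy}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n} \]
- Bound on the error of estimation (margin of error):
  \[ 2\sqrt{\widehat{\mathrm{Var}}(\bar{y}_{sy})} = 2\sqrt{\left(1 - \frac{n}{N}\right) \frac{s^2}{n}} \]
Here n is the sample size, N is the population size, and s^2 is the sample variance.
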
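A short sketch of the mean, its estimated variance, and the bound, treating the systematic sample as if it were an SRS. The function name `systematic_mean_estimate` and the four data values are made up for illustration.

```python
import math
import statistics


def systematic_mean_estimate(y, N):
    """Return (mean, estimated variance, bound) under the SRS-equivalence assumption."""
    n = len(y)
    if n < 2 or n > N:
        raise ValueError("need 2 <= n <= N")
    ybar = statistics.fmean(y)
    s2 = statistics.variance(y)      # sample variance, n - 1 in the denominator
    var = (1 - n / N) * s2 / n       # finite population correction times s^2 / n
    bound = 2 * math.sqrt(var)       # margin of error
    return ybar, var, bound


# Hypothetical sample of n = 4 from a population of N = 40.
ybar, var, bound = systematic_mean_estimate([2, 4, 6, 8], N=40)
print(ybar, var, round(bound, 3))
```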
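Estimating Totals (Assuming SRS Equivalency)
- Estimator for the total:
  \[ \hat{\tau} = N \bar{y}_{sy} = \frac{N}{n} \sum_{i=1}^{n} y_i \]
- Estimated variance of \hat{\tau}:
  \[ \widehat{\mathrm{Var}}(\hat{\tau}) = \widehat{\mathrm{Var}}(N \bar{y}_{sy}) = N^2 \left(1 - \frac{n}{N}\right) \frac{s^2}{n} \]
- Bound on the error of estimation (margin of error):
  \[ 2\sqrt{\widehat{\mathrm{Var}}(\hat{\tau})} = 2N\sqrt{\left(1 - \frac{n}{N}\right) \frac{s^2}{n}} \]
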
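A worked example may help show how the three formulas chain together. The numbers are invented purely for illustration: N = 1000, n = 100, \bar{y}_{sy} = 12.5, and s^2 = 16.

```latex
\begin{align*}
\hat{\tau} &= N \bar{y}_{sy} = 1000 \times 12.5 = 12{,}500 \\
\widehat{\mathrm{Var}}(\hat{\tau}) &= N^2 \left(1 - \frac{n}{N}\right) \frac{s^2}{n}
  = 1000^2 \times 0.9 \times \frac{16}{100} = 144{,}000 \\
2\sqrt{\widehat{\mathrm{Var}}(\hat{\tau})} &= 2\sqrt{144{,}000} \approx 759
\end{align*}
```

Under these assumed values the total would be reported as roughly 12,500 plus or minus 759.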
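Estimating Proportions (Assuming SRS Equivalency)
- Estimator for the proportion, with y_i = 1 if element i has the characteristic and 0 otherwise:
  \[ \hat{p} = \bar{y}_{sy} = \frac{1}{n} \sum_{i=1}^{n} y_i \]
- Estimated variance of \hat{p}:
  \[ \widehat{\mathrm{Var}}(\hat{p}) = \left(1 - \frac{n}{N}\right) \frac{\hat{p}(1 - \hat{p})}{n} \]
- Bound on the error of estimation (margin of error):
  \[ 2\sqrt{\widehat{\mathrm{Var}}(\hat{p})} = 2\sqrt{\left(1 - \frac{n}{N}\right) \frac{\hat{p}(1 - \hat{p})}{n}} \]
Note: Scheaffer et al. divide by n - 1 rather than n; the difference is negligible for large n.
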
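A small sketch of the proportion estimate and its bound, again treating the systematic sample as an SRS. It divides by n in the variance; a text that divides by n - 1 would give a slightly larger value. The function name `systematic_proportion_estimate` and the counts are made up for illustration.

```python
import math


def systematic_proportion_estimate(successes, n, N):
    """Return (p_hat, estimated variance, bound) under the SRS-equivalence assumption."""
    if not 0 <= successes <= n or not 0 < n <= N:
        raise ValueError("need 0 <= successes <= n and 0 < n <= N")
    p_hat = successes / n
    var = (1 - n / N) * p_hat * (1 - p_hat) / n  # divisor n is an assumption of this sketch
    bound = 2 * math.sqrt(var)
    return p_hat, var, bound


# Hypothetical: 30 of 100 sampled items have the characteristic, population of 1000.
p_hat, var, bound = systematic_proportion_estimate(30, n=100, N=1000)
print(p_hat, round(var, 5), round(bound, 3))
```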
Complex Sampling for Real-World Surveying
- Usually, real-world requirements and constraints result in complex sampling
  - Some combination of stratification and clustering, along with unequal sampling probabilities
- For example, geographic clustering arises with face-to-face interviewer-based surveys
  - Often it's multi-stage clustering as well
- Stratification is often also necessary to ensure the desired representation in the sample
- When combined, estimation gets much more complicated

NAEP Sampling Scheme
- First stage: 96 PSUs consisting of metropolitan statistical areas (MSAs), a single non-MSA county, or a group of contiguous non-MSA counties
  - About a third of the PSUs are sampled with certainty
  - Remainder are stratified and one is selected from each stratum with probability proportional to size
- Second stage: selection of public and nonpublic schools within the PSUs
  - For elementary, middle, and secondary samples, independent samples of schools are selected with probability proportional to measures of size
- Third and final stage: 25 to 30 eligible students are sampled systematically with probabilities designed to make the overall selection probabilities approximately constant
  - Except that students from private schools and from schools with high proportions of black or Hispanic students are oversampled
- In 1996, nearly 150,000 students were tested from just over 2,000 participating schools
Source: Grading the Nation's Report Card: Evaluating NAEP and Transforming the Assessment of Educational Progress, Board on Testing and Assessment (BOTA), National Academy of Science, 1999.

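Selecting one unit per stratum with probability proportional to size can be sketched as below. This is not NAEP's actual procedure; the stratum names, PSU names, and sizes are hypothetical, and the sketch leans on the standard library's weighted draw `random.choices`.

```python
import random


def select_one_pps(psus, rng=None):
    """Pick one PSU from a list of (name, size) pairs with probability proportional to size."""
    rng = rng or random.Random()
    names = [name for name, _ in psus]
    sizes = [size for _, size in psus]
    return rng.choices(names, weights=sizes, k=1)[0]


# Hypothetical strata, each holding PSUs with a measure of size (e.g., enrollment).
strata = {
    "stratum_1": [("PSU_A", 50_000), ("PSU_B", 20_000), ("PSU_C", 30_000)],
    "stratum_2": [("PSU_D", 10_000), ("PSU_E", 90_000)],
}

rng = random.Random(1)
selected = {stratum: select_one_pps(psus, rng) for stratum, psus in strata.items()}
print(selected)
```

In this toy setup PSU_E would be chosen from its stratum about 90% of the time, which is why analysis weights must reflect the unequal selection probabilities.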
National Survey of Third World Country
- First step: Stratify sample by state/province proportional to population
  - Oversample any state with fewer than 100 or 200 interviews to allow for state-to-state comparisons
- Second step: Within state/province, stratify by urban and rural
  - Urban/rural stratification used to make sure that all localities are represented
  - As a general rule, locations with a population of 10,000 or more are classified urban, otherwise rural
- Third step: Select PSUs within state/province and by urban/rural location
- Fourth step: Select a starting point within each PSU for each interviewer
  - Starting points defined as locations with sufficient public presence to be known by local residents, such as schools, markets, etc.

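The first step (proportional allocation with a floor for small states) can be sketched as follows. The state names, populations, total sample size, and the floor of 100 are all assumptions for illustration; `allocate_interviews` is a hypothetical helper.

```python
def allocate_interviews(populations, total_n, min_per_state):
    """Allocate interviews proportionally to population, then raise small states to a floor."""
    total_pop = sum(populations.values())
    allocation = {}
    for state, pop in populations.items():
        proportional = round(total_n * pop / total_pop)
        # Oversample: any state below the floor is bumped up to it.
        allocation[state] = max(proportional, min_per_state)
    return allocation


# Hypothetical country with four states and a target of 2,000 interviews.
populations = {
    "North": 5_000_000,
    "Central": 4_000_000,
    "Coast": 900_000,
    "Island": 100_000,
}
allocation = allocate_interviews(populations, total_n=2000, min_per_state=100)
print(allocation, "total:", sum(allocation.values()))
```

Note that the floor pushes the total above the original target, and the oversampled states no longer have the same sampling rate as the others, so national estimates need weights.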
[Figure: The World Health Survey Illustration]
Source: World Health Organization. The World Health Survey (WHS): Sampling Guidelines for Participating Countries. Accessed online at http://www.who.int/entity/healthinfo/survey/whssamplingguidelines.pdf.

[Figure: House Selection Via Systematic Sampling]

[Figure: Selection of Household in Multi-dwelling Structure]

Respondent Selection in Each House
- To select the person to interview within a household:
  - List all adult males and females aged 18 years and above in the household on a Kish grid
  - A Kish grid is essentially a pre-assigned table of randomly generated numbers used to find the person to be interviewed
  - An alternative is the next-birthday method
- One respondent is selected using the grid
- Once the respondent is selected, the interview is conducted with only that respondent

Kish Grid (aka Kish Tables) Example
- Sequentially work down the list
[Table: Kish grid example, with overall selection probabilities]
Source: Kish, L. (1949). A Procedure for Objective Respondent Selection Within the Household, Journal of the American Statistical Association, 380-387.

Variance Estimation for Complex Designs
- Complex sampling methods require nonstandard methods to estimate variances
  - I.e., can't just plug the data into statistical software and use its standard errors
  - (Very rare) exception: SRS with large population and low nonresponse
- Software for (some) complex survey designs:
  - Free: CENVAR, VPLX, CPLX, EpiInfo
  - Commercial: SAS, Stata, SUDAAN, WesVar
- Two estimation methods: Taylor series expansion and jackknife

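Variance Estimation (Taylor Series)
- Taylor series approximation: converts ratios into sums
- Example: Variance for the weighted mean
  \[ \bar{y}_w = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i} \]
  assuming an SRS can be expressed as
  \[ \mathrm{Var}(\bar{y}_w) \approx \frac{\mathrm{Var}\left(\sum w_i y_i\right) + \bar{y}_w^2 \, \mathrm{Var}\left(\sum w_i\right) - 2 \bar{y}_w \, \mathrm{Cov}\left(\sum w_i y_i, \sum w_i\right)}{\left(\sum w_i\right)^2} \]
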
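A sketch of the linearized variance of a weighted mean under the SRS assumption. It makes one further assumption not stated on the slide: observations are treated as independent draws with no finite population correction, so the variance of a sample sum is estimated as n times the sample variance of its terms (and likewise for the covariance). The function names are illustrative.

```python
import statistics


def _sample_cov(a, b):
    """Sample covariance with n - 1 in the denominator."""
    n = len(a)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def weighted_mean_variance(y, w):
    """Return (weighted mean, Taylor-series variance estimate)."""
    if len(y) != len(w) or len(y) < 2:
        raise ValueError("y and w must have the same length, at least 2")
    n = len(y)
    u = [wi * yi for wi, yi in zip(w, y)]    # the terms w_i * y_i
    sum_w = sum(w)
    ybar_w = sum(u) / sum_w

    var_sum_u = n * statistics.variance(u)   # estimates Var(sum of w_i y_i)
    var_sum_w = n * statistics.variance(w)   # estimates Var(sum of w_i)
    cov_sums = n * _sample_cov(u, w)         # estimates Cov of the two sums

    v = (var_sum_u + ybar_w**2 * var_sum_w - 2 * ybar_w * cov_sums) / sum_w**2
    return ybar_w, v


# Hypothetical data: four observations with unequal weights.
print(weighted_mean_variance([2, 4, 6, 8], [1, 2, 3, 4]))
```

With equal weights the estimate collapses to s^2 / n, the usual SRS variance of a mean without the finite population correction, which is a quick sanity check on the formula.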
Variance Estimation (Jackknife and Balanced Repeated Replication)
- Jackknife and balanced repeated replication methods rely on empirical methods
- Basically, resample from the data c times
- Calculate the overall mean as
  \[ \bar{y} = \frac{1}{c} \sum_{\gamma=1}^{c} \bar{y}_\gamma \]
  and then estimate the variance as
  \[ v(\bar{y}) = \frac{1}{c(c-1)} \sum_{\gamma=1}^{c} \left(\bar{y}_\gamma - \bar{y}\right)^2 \]

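The two formulas can be sketched directly, taking the c replicate means as given. How the replicates are formed depends on the method and is outside this sketch; the function name `replicate_variance` and the four replicate means are made up for illustration.

```python
import statistics


def replicate_variance(replicate_means):
    """Return (overall mean, variance estimate) from c replicate means."""
    c = len(replicate_means)
    if c < 2:
        raise ValueError("need at least two replicates")
    overall = statistics.fmean(replicate_means)
    squared_devs = sum((m - overall) ** 2 for m in replicate_means)
    v = squared_devs / (c * (c - 1))
    return overall, v


# Hypothetical means from c = 4 replicates.
overall, v = replicate_variance([10, 12, 11, 13])
print(overall, round(v, 4))
```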
What We Have Covered
- Defined systematic sampling
  - Examples
  - Estimators (assuming SRS equivalence)
- Discussed examples of complex sampling designs
  - Explained the Kish grid
- Introduced variance estimation under complex designs