Elementary Statistics and Inference
22S:025 or 7P:025
Lecture 27, Chapter 20

D. The Correction Factor - (page 367)

1992 Presidential Campaign
Texas: $12.5 \times 10^6$ voters
New Mexico: $1.2 \times 10^6$ voters

A random sample of 2,500 voters is selected in New Mexico, and another 2,500 voters in Texas.
New Mexico samples 1 out of 500.
Texas samples 1 out of 5,000.
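It would seem that the accuracy in New Mexico would be better, because a greater percentage of the population is sampled. NOT SO. The SE(%) is based on the size of the sample, not the size of the population.

$$SE(\%) = \sqrt{\frac{p(1-p)}{n}} \times 100, \qquad \text{where } n = \text{sample size}$$

This result for SE(%) holds when sampling with replacement, or when the population is large.

SE(without replacement) = correction factor × SE(with replacement)

$$SE(\text{without replacement}) = \sqrt{\frac{\text{number of tickets in box} - \text{number of draws}}{\text{number of tickets in box} - 1}} \times SE(\text{with replacement})$$

$$SE(\text{without replacement}) = \sqrt{\frac{N-n}{N-1}} \times \sqrt{n\,p(1-p)}$$

where N = size of population, n = sample size, p = sample proportion.

Example: Suppose N = 100,000, n = 1,000, p = .50.

$$SE(\text{without replacement}) = \sqrt{\frac{100{,}000 - 1{,}000}{100{,}000 - 1}} \times \sqrt{1000(.5)(.5)} = \sqrt{\frac{99{,}000}{99{,}999}} \times \sqrt{1000(.5)(.5)} = \sqrt{.99001} \times \sqrt{1000(.5)(.5)} = (.994992) \times SE(\text{with replacement})$$

The larger the population, the closer the correction factor is to one.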
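As a rough check on the correction factor, the following sketch evaluates $\sqrt{(N-n)/(N-1)}$ for the example above and for the two state samples. The function names are illustrative choices, not part of the lecture, and the state population sizes are the rounded figures quoted on the slides.

```python
import math

def correction_factor(population_size, sample_size):
    """Correction factor sqrt((N - n) / (N - 1)) for draws without replacement."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def se_count_with_replacement(sample_size, p):
    """SE for the number of 1s drawn with replacement from a 0-1 box."""
    return math.sqrt(sample_size * p * (1 - p))

# Worked example from the lecture: N = 100,000, n = 1,000, p = .50
cf = correction_factor(100_000, 1_000)
se = cf * se_count_with_replacement(1_000, 0.50)
print(f"example: correction factor = {cf:.6f}, SE without replacement = {se:.2f}")

# Samples of 2,500 from each state (rounded population sizes)
for state, voters in (("New Mexico", 1_200_000), ("Texas", 12_500_000)):
    print(f"{state}: correction factor = {correction_factor(voters, 2_500):.5f}")
```

Both state correction factors come out very close to 1, which is why the tenfold difference in population size barely affects the SE.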
Note: the correction factor approaches 1 as the number of tickets in the box (the population) increases.

In the New Mexico & Texas samples:

NM box: 46% of tickets are 1 (Democrat), 54% are 0
TX box: 37% of tickets are 1 (Democrat), 63% are 0

NM: $SD = \sqrt{.46 \times .54} \approx .50$, so $SE = \sqrt{n}\,(.50)$
TX: $SD = \sqrt{.37 \times .63} \approx .48$, so $SE = \sqrt{n}\,(.48)$

With n = 2,500, these are SEs of 25 and 24 voters. Because the SE is nearly the same in both states, the percentage of Democrats estimated from a random sample of 2,500 voters is about equally accurate in each state: both estimates are off by about the same give-or-take percentage.

Exercise Set C (page 370) #2, 3, 4, 5

#3. A survey organization wants to take a simple random sample in order to estimate the percentage of people, p, who have seen a certain television program. To keep the costs down, they want to take as small a sample as possible. But their client will only tolerate chance errors of 1 percentage point or so in the estimate. Should they use a sample of size 100, 2,500, or 10,000? You may assume the population to be very large; past experience suggests the population percentage will be in the range 20%-40%.
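Box: 40% of tickets are 1, 60% are 0.
avg = p = .40
$SD = \sqrt{.40 \times .60} \approx .49 \approx .50$

Want SE(%) ≈ 1.00:

$$1.00 = \sqrt{\frac{p(1-p)}{n}} \times 100$$

$$1.00 = \sqrt{\frac{.40 \times .60}{n}} \times 100$$

Squaring both sides:

$$1.00 = \frac{.40 \times .60}{n} \times (10{,}000)$$

$$n = (.24)(10{,}000) = 2{,}400$$

Use a sample of about 2,500.

#5. Box: 2 R, 8 B, so p(R) = .20. Draw 4 marbles with replacement, and without replacement. Find the SE(%) of red marbles drawn.

(a) With replacement:

$$SE(\%) = \sqrt{\frac{.2 \times .8}{4}} \times 100 = 20\%$$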
(b) Without replacement, use the correction factor:

$$SE(\%)_{\text{without replacement}} = \sqrt{\frac{N-n}{N-1}} \times SE(\%)_{\text{with replacement}}$$

$$SE(\%) = \sqrt{\frac{10-4}{10-1}} \times (20) = (.816)(20) = 16.33\%$$

Note: there is a big difference in SE(%) when the size of the population is small.

E. The Gallup Poll

They sample several thousand voters out of over 200 million. The reason? The size of the chance error (SE) in percent depends mainly on the absolute size of the sample, and hardly at all on the size of the population from which the sample was selected.

For a sample of size 2,500, when the sample percent is between .40 and .60:

$$SE(\%) = \sqrt{\frac{p(1-p)}{n}} \times 100$$

$$SE(\%) = \sqrt{\frac{.5 \times .5}{2{,}500}} \times 100 = 1.00\%$$

The chance error for a sample of size 2,500 from a population of over 200 million would have an SE of around 1 percent. If the sample percent is 50%, we would conclude the result is accurate, give or take 1%.
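The contrast between the marble box and the Gallup Poll can be sketched in a few lines of Python. The helper `se_percent` and its optional `population_size` argument are illustrative assumptions, not notation from the lecture; when a population size is supplied, the correction factor is applied.

```python
import math

def se_percent(sample_size, p, population_size=None):
    """SE for a sample percentage drawn from a 0-1 box.

    Without population_size: draws with replacement.
    With population_size: draws without replacement (correction factor applied).
    """
    se = 100 * math.sqrt(p * (1 - p) / sample_size)
    if population_size is not None:
        se *= math.sqrt((population_size - sample_size) / (population_size - 1))
    return se

# Exercise #5: 4 draws from a box of 10 marbles, 20% red
print(f"marbles, with replacement:    {se_percent(4, 0.2):.2f}%")
print(f"marbles, without replacement: {se_percent(4, 0.2, population_size=10):.2f}%")

# Gallup-style sample: n = 2,500, p = .5, very different population sizes
for N in (1_200_000, 12_500_000, 200_000_000):
    print(f"N = {N:>11,}: SE(%) = {se_percent(2_500, 0.5, population_size=N):.4f}")
```

For the small box the correction matters a great deal; for populations in the millions, the SE(%) stays at about 1% whatever the population size.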
F. Review Exercises (pp. 371-373) #1, 2, 3, 5, 11, 12

n            E(H)      SE(H)   E(% Hds)   SE(% Hds)
2,500        1,250     25      50         1.0%
10,000       5,000     50      50         .5%
1,000,000    500,000   500     50         .05%
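The table entries can be reproduced with a short sketch, assuming (as the "Hds" column labels suggest) that n counts tosses of a fair coin and H is the number of heads. The function name is an illustrative choice. Under that assumption, SE(H) is $\sqrt{n} \times .5$, so for n = 10,000 it is 100 × .5 = 50.

```python
import math

def coin_toss_summary(n, p=0.5):
    """Expected value and SE for the number and percentage of heads in n tosses.

    Assumes independent tosses with chance p of heads (fair coin by default).
    """
    sd_box = math.sqrt(p * (1 - p))     # SD of the 0-1 box
    expected_heads = n * p
    se_heads = math.sqrt(n) * sd_box    # SE for the number of heads
    expected_pct = 100 * p
    se_pct = 100 * se_heads / n         # SE for the percentage of heads
    return expected_heads, se_heads, expected_pct, se_pct

for n in (2_500, 10_000, 1_000_000):
    e_h, se_h, e_pct, se_pct = coin_toss_summary(n)
    print(f"n={n:>9,}  E(H)={e_h:>9,.0f}  SE(H)={se_h:>5.0f}  "
          f"E(%)={e_pct:.0f}  SE(%)={se_pct:.2f}%")
```

Multiplying n by 4 doubles the SE for the number of heads but halves the SE for the percentage.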
3. A group of 50,000 tax forms has an average gross income of $37,000 with an SD of $20,000. Furthermore, 20% of the forms have a gross income over $50,000. A group of 900 forms is chosen at random for audit. To estimate the chance that between 19% and 21% of the forms chosen for audit have gross incomes over $50,000, a box model is needed.

a) Should the number of tickets in the box be 900 or 50,000?
b) Each ticket in the box shows
   a zero or a one
   a gross income
c) True or false: The SD of the box is $20,000.
d) True or false: The number of draws is 900.
e) Find the chance (approximately) that between 19% and 21% of the forms chosen for audit have gross incomes over $50,000.

Box: 20% of tickets are over $50k, 80% are not over $50k.
n = 900
E(% over 50k) = 20%

$$SE(\%\text{ over 50k}) = \sqrt{\frac{.2 \times .8}{900}} \times 100 = 1.33\%$$
SE = 1.33%, M = 20%

$$Z = \frac{X - M}{SE}$$

$$Z = \frac{19 - 20}{1.33} = -.75 \qquad Z = \frac{21 - 20}{1.33} = +.75$$

Use the Normal Curve table to find the percentage of scores between Z = ±.75.

$$P(-.75 \le Z \le .75) = .5467, \text{ or } 54.67\% \approx 55\%$$

#11. (page 373) A university has 25,000 students, of whom 17,000 are undergraduates. The housing office takes a simple random sample of 500 students; 357 out of the 500 are undergraduates.

Box: 17,000 tickets are 1, 8,000 are 0.

$$p = \frac{17{,}000}{25{,}000} = .68$$

E(number) = 340
E(% of undergraduates) = 68%

$$SE(\%\text{ of undergraduates}) = \sqrt{\frac{.68 \times .32}{500}} \times 100 = 2.09\%$$

Determine the likelihood of obtaining 357 or more undergraduates (71%) in a sample of 500 students.

SE = 2.09%, M = 68%

$$Z = \frac{71 - 68}{2.09} = 1.44 \approx 1.45$$
[Figure: normal curve with 85.29% of the area between Z = -1.45 and Z = 1.45]

About 85.29% of scores in the Normal Distribution are between Z = ±1.45.
About 14.71% of scores are either less than Z = -1.45 or greater than Z = 1.45.
About 7.35% of scores are greater than Z = 1.45.

Result: The probability (chance) of obtaining 357 undergrads or more in a sample of 500 students is about 7%.
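Both review exercises use the same normal approximation, which can be sketched without a printed table by using the error function from Python's standard library. The helper names are illustrative; the sketch follows the slides in rounding 357 out of 500 to 71%, and it ignores the correction factor and any continuity correction, so it is an approximation under those assumptions rather than an exact answer.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def se_percent(n, p):
    """SE for a sample percentage from a 0-1 box (draws with replacement)."""
    return 100 * math.sqrt(p * (1 - p) / n)

def chance_between(low, high, mean, se):
    """Approximate chance that the sample percentage falls in [low, high]."""
    return normal_cdf((high - mean) / se) - normal_cdf((low - mean) / se)

def chance_at_least(value, mean, se):
    """Approximate chance that the sample percentage is at least `value`."""
    return 1 - normal_cdf((value - mean) / se)

# Exercise 3: 900 forms, 20% over $50,000, chance of 19% to 21%
se_tax = se_percent(900, 0.20)
print(f"tax forms: SE = {se_tax:.2f}%, chance = {chance_between(19, 21, 20, se_tax):.4f}")

# Exercise 11: 500 students, 68% undergraduates, chance of 71% or more
se_ug = se_percent(500, 0.68)
print(f"undergrads: SE = {se_ug:.2f}%, chance = {chance_at_least(71, 68, se_ug):.4f}")
```

The second result differs slightly from the 7.35% read from the table because the slides round Z up to 1.45 before looking it up; either way the chance is about 7%.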