Lecture 5: Sampling Distribution

Readings: Sections 5.5, 5.6

Introduction

Parameter: describes the population.
Statistic: describes the sample; it is subject to sampling variability.

Sampling distribution of a statistic:
A probability distribution that characterizes the sampling variability.
The distribution of values taken by a sample statistic (e.g., sample mean, sample proportion) across all possible samples of the same size from the population.
Imagine taking repeated samples of size $n$ from the population and calculating the sample statistic for each of them.

Example 1: The lifetime of a certain battery follows an exponential distribution with a mean value of 10 hours, i.e. $f(x) = 0.1e^{-0.1x}$ $(x > 0)$.

Sample         1                    2                     3                    ...
Sample mean    $\bar{x}_1 = 8.2$    $\bar{x}_2 = 16.4$    $\bar{x}_3 = 7.1$    ...

Example 2: To check what percent of the resistors manufactured by a certain firm have resistances that exceed 10 ohms ($p$), we take repeated random samples of 100 resistors and compute the proportion of resistors with resistances exceeding 10 ohms ($\hat{p}$).

Sample               1                     2                     3                     ...
Sample proportion    $\hat{p}_1 = 0.03$    $\hat{p}_2 = 0.02$    $\hat{p}_3 = 0.05$    ...

How are these sample statistics distributed?

General Properties of Sampling Distribution

1. The sampling distribution of the statistic is centered at the population parameter estimated by the statistic.
2. The spread of the sampling distribution of the statistic decreases as the sample size increases.
3. As the sample size increases, the sampling distribution becomes increasingly Normally distributed.

Sampling Distribution of a Sample Mean

Sample mean: $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$
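The first two general properties can be checked by simulation for the battery example (exponential lifetimes with mean 10 hours). The following is a minimal sketch using only Python's standard library; the function name `simulate_sample_means`, the seed, the sample sizes, and the number of replications are illustrative choices, not part of the lecture.

```python
import random
import statistics


def simulate_sample_means(n, reps, rate=0.1, seed=0):
    """Draw `reps` samples of size `n` from Exponential(rate); return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(reps)
    ]


population_mean = 10.0  # 1 / rate
population_sd = 10.0    # for an exponential distribution the SD equals the mean

for n in (1, 4, 25):
    means = simulate_sample_means(n, reps=5000)
    center = statistics.fmean(means)       # should sit near the population mean
    spread = statistics.stdev(means)       # should shrink as n grows
    theory = population_sd / n ** 0.5      # sigma / sqrt(n)
    print(f"n={n:>2}  mean of means={center:.2f}  SD of means={spread:.2f}  sigma/sqrt(n)={theory:.2f}")
```

With a few thousand replications, the average of the simulated sample means should stay close to 10 for every $n$, while their standard deviation should track $\sigma/\sqrt{n}$.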
If the population distribution has mean $\mu$ and standard deviation $\sigma$, then the mean and standard deviation of $\bar{X}$ are

$E(\bar{X}) = \mu_{\bar{X}} = \mu, \qquad SD(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

If the population distribution is a normal distribution, then the sampling distribution of $\bar{X}$ is also a normal distribution for any sample size $n$.

What if the population distribution is not normal?

Central Limit Theorem: The sampling distribution of $\bar{X}$ can be approximated by a normal distribution, $N(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}} = \sigma/\sqrt{n})$, when the sample size $n$ is sufficiently large, irrespective of the shape of the population distribution. The larger the sample size, the better the approximation.

[Figure: two rows of panels, each labeled n = 1, n = 4, n = 20]

Example 3: In a certain population of fish, the lengths of individual fish follow a normal distribution with mean 54 mm and standard deviation 4.5 mm.
a. What is the probability that a randomly chosen fish is between 51 mm and 60 mm long?
b. What is the probability that the mean length of 4 randomly chosen fish is between 51 mm and 60 mm?
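Example 3 can be worked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library; the variable names are illustrative. Part (b) relies on the fact stated above that $\bar{X}$ is exactly normal with standard deviation $\sigma/\sqrt{n}$ when the population is normal.

```python
from statistics import NormalDist

mu, sigma = 54.0, 4.5   # population mean and SD of fish length, in mm
lo, hi = 51.0, 60.0     # interval of interest, in mm

# (a) A single fish: X ~ N(mu, sigma)
single = NormalDist(mu, sigma)
p_single = single.cdf(hi) - single.cdf(lo)

# (b) Mean of n = 4 fish: X-bar ~ N(mu, sigma / sqrt(n))
n = 4
mean_dist = NormalDist(mu, sigma / n ** 0.5)
p_mean = mean_dist.cdf(hi) - mean_dist.cdf(lo)

print(f"(a) P(51 < X < 60)     = {p_single:.4f}")
print(f"(b) P(51 < X-bar < 60) = {p_mean:.4f}")
```

The probability in (b) is larger than in (a) because the sampling distribution of the mean is more tightly concentrated around 54 mm than the distribution of a single length.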
Example 4: The lifetime of a certain battery follows an exponential distribution with a mean value of 10 hours, i.e. $f(x) = 0.1e^{-0.1x}$ $(x > 0)$. You take a random sample of 100 batteries. What is the probability that the average lifetime would be between 9 and 11 hours?

Example 5: A surveying instrument makes an error of -2, -1, 0, 1, or 2 feet with equal probabilities when measuring the height of a 200-foot tower. Find the probability that, in 18 independent measurements of the tower, the average of the measurements is between 199 and 201 feet.
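Both examples can be approached with the Central Limit Theorem, since neither population is normal. The sketch below is one way to set up the calculation, assuming the normal approximation is adequate (for Example 5, $n = 18$ is only moderately large, and no continuity correction is applied); the helper name `clt_interval_prob` is hypothetical.

```python
from statistics import NormalDist


def clt_interval_prob(mu, sigma, n, lo, hi):
    """Approximate P(lo < X-bar < hi) using X-bar ~ N(mu, sigma / sqrt(n))."""
    dist = NormalDist(mu, sigma / n ** 0.5)
    return dist.cdf(hi) - dist.cdf(lo)


# Example 4: exponential lifetimes with mean 10, so the SD is also 10.
p_battery = clt_interval_prob(mu=10.0, sigma=10.0, n=100, lo=9.0, hi=11.0)

# Example 5: errors are equally likely on {-2, -1, 0, 1, 2}.
errors = [-2, -1, 0, 1, 2]
err_mean = sum(errors) / len(errors)                               # 0
err_var = sum((e - err_mean) ** 2 for e in errors) / len(errors)   # 2
p_tower = clt_interval_prob(mu=200 + err_mean, sigma=err_var ** 0.5,
                            n=18, lo=199.0, hi=201.0)

print(f"Example 4: {p_battery:.4f}")
print(f"Example 5: {p_tower:.4f}")
```

In Example 5 the standard deviation of the average works out to $\sqrt{2/18} = 1/3$ foot, so the interval from 199 to 201 spans three standard deviations on either side of 200.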
Sampling Distribution of a Sample Proportion

Recall that in the binomial experiment, $X$ = the number of successes in $n$ trials and $X \sim \text{binomial}(n, p)$, where $p$ is the population proportion of successes.

Consider the sample proportion $\hat{p} = X/n$ as an estimate of the population proportion $p$.

The sampling distribution of $\hat{p}$ has the same shape as the distribution of $X$, but on a non-integer scale, since for any possible value $p_0$ of $\hat{p}$,

$P(\hat{p} = p_0) = P(X/n = p_0) = P(X = np_0)$.

Example 6: The manufacturer admits that St. John's Wort Capsules have a 2% probability of being defective (containing the wrong amount of active ingredient). Select a random sample of 40 capsules. Let $Y$ be the number of defective capsules in the sample, $Y \sim \text{binomial}(n = 40, p = 0.02)$. Let $\hat{p} = Y/40$ be the fraction of defective capsules.

Y    $\hat{p}$    Probability
0    0.000        0.446
1    0.025        0.364
2    0.050        0.145
3    0.075        0.037
4    0.100        0.007
5    0.125        0.001
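The probabilities in the Example 6 table come directly from the binomial formula $P(Y = y) = \binom{n}{y} p^y (1-p)^{n-y}$. A short sketch that regenerates the table, using `math.comb` from the standard library (the helper name `binom_pmf` is an illustrative choice):

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 40, 0.02

# Each row: (count y, sample proportion y/n, probability of that count)
table = [(y, y / n, binom_pmf(y, n, p)) for y in range(6)]

print(" Y  p-hat  Probability")
for y, p_hat, prob in table:
    print(f"{y:>2}  {p_hat:.3f}  {prob:.3f}")
```

Each value of $\hat{p}$ inherits the probability of the corresponding count $Y$, which is why the two distributions have the same shape.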
a. What is the probability that $\hat{p}$ is within 0.02 of the true value?
b. If we select a random sample of 100 capsules, what is the probability that $\hat{p}$ is within 0.02 of the true value?

We have learned that binomial random variables can be approximated using normal variables: if $X \sim \text{Binomial}(n, p)$, then

$X \overset{\text{approx.}}{\sim} N(\mu = np,\ \sigma = \sqrt{np(1-p)})$

when $n$ is large ($np > 5$ and $n(1-p) > 5$).

What about $\hat{p}$ when $n$ is large?

The mean and standard deviation of the sampling distribution of $\hat{p}$ are

$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

For a sufficiently large sample size, the sampling distribution of $\hat{p}$ is approximately normal. That is,

$\hat{p} \overset{\text{approx.}}{\sim} N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\right)$

Rule of thumb: use the normal approximation when $np > 5$ and $n(1-p) > 5$.
Continuity correction: add or subtract 0.5 (on the count scale of $X$) to improve accuracy.

Note that the mean and standard deviation of $\hat{p}$ depend on the true population proportion $p$, which we often don't know.
Use a value based on theory.
Use $p = 0.5$ in the formula.
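The capsule questions (a) and (b) show why the rule of thumb matters. The sketch below computes the exact binomial probability that $\hat{p}$ is within 0.02 of $p = 0.02$ and compares it with the normal approximation for $n = 40$ and $n = 100$. It assumes "within" includes the endpoints, and uses `fractions.Fraction` so that boundary values such as $\hat{p} = 0$ are compared exactly; the function names are illustrative.

```python
from fractions import Fraction
from math import comb
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def exact_within(n, p, d):
    """Exact P(|p_hat - p| <= d), with p and d given as Fractions."""
    return sum(
        binom_pmf(k, n, float(p))
        for k in range(n + 1)
        if abs(Fraction(k, n) - p) <= d
    )


def normal_within(n, p, d):
    """Normal approximation to P(|p_hat - p| <= d), no continuity correction."""
    p, d = float(p), float(d)
    dist = NormalDist(p, (p * (1 - p) / n) ** 0.5)
    return dist.cdf(p + d) - dist.cdf(p - d)


p = Fraction(2, 100)   # true proportion of defective capsules
d = Fraction(2, 100)   # allowed distance from the true value

for n in (40, 100):
    rule_ok = n * p > 5 and n * (1 - p) > 5
    print(f"n={n:>3}  rule of thumb met: {rule_ok}  "
          f"exact={exact_within(n, p, d):.4f}  normal={normal_within(n, p, d):.4f}")
```

Here $np$ is 0.8 and 2 respectively, so the rule of thumb fails for both sample sizes and the exact binomial calculation is the one to trust.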
Example 7: 8% of the population are left-handed. If 100 people are randomly selected, what is the probability that fewer than 5% of the people sampled will be left-handed?
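For Example 7, $np = 8$ and $n(1-p) = 92$, so the rule of thumb allows the normal approximation. The sketch below compares three calculations: the plain normal approximation for $\hat{p}$, the same approximation with a continuity correction on the count scale, and the exact binomial sum. It reads "fewer than 5%" as $X \le 4$ left-handed people out of 100, which is an interpretation rather than something the lecture spells out.

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.08
rule_ok = n * p > 5 and n * (1 - p) > 5

# Normal approximation on the proportion scale: p_hat ~ N(p, sqrt(p(1-p)/n))
sd_p_hat = (p * (1 - p) / n) ** 0.5
approx = NormalDist(p, sd_p_hat).cdf(0.05)

# Continuity correction on the count scale: P(X < 5) = P(X <= 4) ~ P(Normal <= 4.5)
sd_count = (n * p * (1 - p)) ** 0.5
approx_cc = NormalDist(n * p, sd_count).cdf(4.5)

# Exact binomial probability P(X <= 4)
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(5))

print(f"rule of thumb met:         {rule_ok}")
print(f"normal approximation:      {approx:.4f}")
print(f"with continuity correction: {approx_cc:.4f}")
print(f"exact binomial:            {exact:.4f}")
```

The continuity-corrected value lands noticeably closer to the exact answer than the uncorrected one, which illustrates the remark that adding or subtracting 0.5 improves accuracy.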
Example 8: Assume that 10% of a certain product manufactured by a firm is defective. How large a sample is needed to be at least 80% certain that the proportion of defective products is between 7% and 13%?
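Example 8 asks for the smallest $n$ with $P(0.07 < \hat{p} < 0.13) \ge 0.80$. Under the normal approximation, this requires $0.03 / \sqrt{p(1-p)/n} \ge z$, where $z$ is the 0.90 quantile of the standard normal, so $n \ge (z/0.03)^2\, p(1-p)$. A minimal sketch of that calculation, assuming the normal approximation without continuity correction (variable names are illustrative):

```python
from math import ceil
from statistics import NormalDist

p = 0.10            # true proportion of defective products
margin = 0.03       # half-width of the interval 7% to 13%
confidence = 0.80   # required probability

# Two-sided interval: 80% in the middle leaves 10% in each tail.
z = NormalDist().inv_cdf((1 + confidence) / 2)

# Smallest integer n satisfying n >= (z / margin)^2 * p * (1 - p)
n_required = ceil((z / margin) ** 2 * p * (1 - p))


def coverage(n):
    """Normal-approximation probability that p_hat falls within `margin` of p."""
    dist = NormalDist(p, (p * (1 - p) / n) ** 0.5)
    return dist.cdf(p + margin) - dist.cdf(p - margin)


print(f"z = {z:.4f}")
print(f"required n = {n_required}")
print(f"coverage at n - 1: {coverage(n_required - 1):.4f}")
print(f"coverage at n:     {coverage(n_required):.4f}")
```

Printing the coverage just below and at the computed sample size is a quick check that the rounding went the right way.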