STOR 155 Introductory Statistics (Chap 5) Lecture 14: Sampling Distributions for Counts and Proportions

The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL 5/31/11 Lecture 14

Statistic & Its Sampling Distribution
A statistic is any numeric measure calculated from data. It is a random variable: its value varies from sample to sample.
- Count/proportion. Ex: the number/proportion of free throws made by a Tar Heel player who shoots 20 free throws in a practice.
- Sample mean. Ex: the average SAT score of a group of 10 students randomly selected from STAT 155.
The probability distribution of a statistic is called its sampling distribution. It depends on the population distribution and the sample size.
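To see what "varies from sample to sample" means, the free throw example can be simulated. This is only a sketch: the 70% success probability, the number of simulated practices, and the function name `simulate_counts` are illustrative assumptions, not values from the lecture.

```python
import random
from collections import Counter


def simulate_counts(n_shots, p_make, n_samples, seed=0):
    """Simulate the number of made shots in each of n_samples practices."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = []
    for _ in range(n_samples):
        made = sum(1 for _ in range(n_shots) if rng.random() < p_make)
        counts.append(made)
    return counts


# Assumed values: 20 shots per practice, a hypothetical 70% shooter.
counts = simulate_counts(n_shots=20, p_make=0.7, n_samples=10_000)
freq = Counter(counts)

# The relative frequencies approximate the sampling distribution
# of the count (and of the proportion = count / 20).
for k in sorted(freq):
    print(f"count={k:2d}  proportion={k / 20:.2f}  "
          f"relative frequency={freq[k] / len(counts):.4f}")
```

Changing `n_shots` or `p_make` changes the shape of the printed distribution, which is the sense in which a sampling distribution depends on the sample size and the population.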

Binomial Experiment
- There are n trials (with n fixed in advance).
- Each trial has two possible outcomes, success (S) and failure (F).
- The probability of success, p, remains the same from one trial to the next.
- The trials are independent, i.e., the outcome of each trial does not affect the outcomes of other trials.

Example
The experiment: randomly draw n balls with replacement from an urn containing 10 red balls and 20 black balls. Let S represent {drawing a red ball} and F represent {drawing a black ball}. Then this is a binomial experiment with p = 10/30 = 1/3.
Q: Would it still be a binomial experiment if the balls were drawn without replacement?
A: No. Without replacement the trials are no longer independent: the chance of red on a later draw depends on the earlier draws (for example, 9/29 right after one red ball has been removed).

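Binomial Distribution
If X is the number of successes in a binomial experiment with n trials and success probability p, then X has the binomial distribution with parameters n and p, written X ~ B(n, p).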

Do they follow binomial distributions (approximately)?
- X = number of stocks on the NY stock exchange whose prices increase today.
- X = number of games the Tar Heels will win next season.
- A couple decides to have children until they have a girl. X = number of boys the couple will have.
Answer: NO in all 3 cases. Why?

Binomial Distribution
If X ~ B(n, p), then $\mu_X = np$ and $\sigma_X^2 = np(1-p)$.
P(X = x) depends on n and p, and can be calculated using
- software,
- Table C (for some n and p), or
- the binomial formula (page 329); a simple argument is given in class.
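The slide does not reproduce the formula from page 329. The standard binomial formula, which is presumably the one meant, is $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for k = 0, 1, ..., n. A minimal Python sketch of it follows, with a numerical check of the mean and variance. The helper names and the choice n = 10 are illustrative assumptions; p = 1/3 is borrowed from the urn example.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), using the standard binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_mean_var(n, p):
    """Mean and variance computed directly from the probabilities."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var


# Assumed illustration: n = 10 draws with p = 1/3.
n, p = 10, 1 / 3
mean, var = binom_mean_var(n, p)
print("mean:", mean, "vs np =", n * p)
print("variance:", var, "vs np(1-p) =", n * p * (1 - p))
```

The two printed pairs should agree up to floating-point rounding, which matches the formulas for the mean and variance stated on the slide.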

Binomial Table
Table C (page T-6) covers n ≤ 20 and certain values of p.

Credit Card Example
Records show that 5% of the customers in a shoe store make their payments using a credit card. This morning 8 customers purchased shoes. Use the binomial table to answer the following questions.
1. Find the probability that exactly 6 customers did not use a credit card.
   X: number of customers who did not use a credit card. Then X ~ B(8, 0.95), which is not in the table.
   Y: number of customers who did use a credit card. Then Y ~ B(8, 0.05), which is in the table.
   Since X = 8 - Y, P(X = 6) = P(Y = 2) = .0515.
2. What is the probability that at least 3 customers used a credit card? (See the board.)

Credit Card Example (continued)
3. What is the expected number of customers who used a credit card?
   $\mu_Y = np = 8(.05) = 0.4$.
4. What is the standard deviation of the number of customers who used a credit card?
   $\sigma_Y^2 = np(1-p) = 8(.05)(.95) = 0.38$, so the standard deviation is $\sigma_Y = \sqrt{0.38} \approx 0.62$.
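The four credit card questions can also be checked with software instead of Table C. The sketch below uses the standard binomial formula with n = 8 and p = 0.05 from the example; the variable and function names are illustrative.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 8, 0.05  # Y = number of customers who used a credit card

# Question 1: P(X = 6) = P(Y = 2)
p_exactly_two = binom_pmf(2, n, p)

# Question 2: P(Y >= 3) = 1 - P(Y <= 2)
p_at_least_three = 1 - sum(binom_pmf(k, n, p) for k in range(3))

# Questions 3 and 4: mean and standard deviation of Y
mean_y = n * p
sd_y = sqrt(n * p * (1 - p))

print(f"P(Y = 2)  = {p_exactly_two:.4f}")     # about 0.0515
print(f"P(Y >= 3) = {p_at_least_three:.4f}")  # about 0.0058
print(f"mean = {mean_y:.2f}, sd = {sd_y:.2f}")
```

Question 2 uses the complement rule, because adding three terms is less work than adding the six terms from Y = 3 to Y = 8.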

Parking Example (bad impact?)
Sarah drives to work every day but does not own a parking permit. She decides to take her chances and risk getting a parking ticket each day. Suppose:
- A parking permit for a week (5 days) costs $30.
- A parking fine costs $50.
- The probability of getting a parking ticket each day is 0.1.
- Her chance of getting a ticket on a given day is independent of other days.
- She can get only 1 ticket per day.
Questions:
- What is her probability of getting at least 1 parking ticket in one week (5 days)?
- What is the expected number of parking tickets that Sarah will get per week?
- Is she better off paying for the parking permit in the long run?
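One way to work through the three parking questions is to treat the weekly number of tickets as B(5, 0.1). The sketch below does this. It compares only expected weekly cost, which is a simplifying assumption: it ignores, for example, how much Sarah dislikes risk.

```python
n_days, p_ticket = 5, 0.1   # tickets per week ~ B(5, 0.1)
fine, permit = 50, 30       # dollars

# P(at least one ticket) = 1 - P(no ticket on any of the 5 days)
p_at_least_one = 1 - (1 - p_ticket) ** n_days

# Expected number of tickets per week: np
expected_tickets = n_days * p_ticket

# Expected weekly cost of fines, to compare with the permit price
expected_weekly_fines = fine * expected_tickets

print(f"P(at least 1 ticket) = {p_at_least_one:.4f}")
print(f"expected tickets per week = {expected_tickets:.1f}")
print(f"expected fines per week = ${expected_weekly_fines:.2f} "
      f"vs permit ${permit}")
```

Under these assumptions the expected weekly fines come to $25, less than the $30 permit, so on average the permit costs more. That is presumably why the slide title asks about "bad impact".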

Sample Proportion
If X ~ B(n, p), the sample proportion is defined as
$\hat{p} = \dfrac{X}{n} = \dfrac{\text{\# of successes}}{\text{sample size}}$.
Mean and variance of a sample proportion: $\mu_{\hat{p}} = p$, $\sigma_{\hat{p}}^2 = p(1-p)/n$.

Example: Clinton's vote
43% of the population voted for Clinton in 1992. Suppose we survey a sample of 2300 people and record whether each voted for Clinton in 1992. We are interested in the sampling distribution of the sample proportion $\hat{p}$ for samples of size 2300.
What are the mean and variance of $\hat{p}$?
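The question can be answered by plugging p = 0.43 and n = 2300 into the mean and variance formulas for a sample proportion. A short sketch, with an illustrative function name:

```python
from math import sqrt


def proportion_mean_var(p, n):
    """Mean and variance of the sample proportion when X ~ B(n, p)."""
    return p, p * (1 - p) / n


mean_phat, var_phat = proportion_mean_var(p=0.43, n=2300)
print(f"mean     = {mean_phat}")
print(f"variance = {var_phat:.7f}")
print(f"std dev  = {sqrt(var_phat):.4f}")  # about 0.0103
```

The standard deviation of about one percentage point shows how tightly the sample proportion concentrates around 0.43 at this sample size.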

Count & Proportion of Success
A Tar Heel basketball player is a 95% free throw shooter. Suppose he shoots 5 free throws during each practice.
X: number of free throws he makes in a practice.
$\hat{p}$: proportion of free throws made during a practice.
$P(X = 3) = P(\hat{p} = 0.6)$. Why?
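A small sketch may help with the "Why?". Because the proportion is the count divided by 5, each value of the count corresponds to exactly one value of the proportion, so the two carry the same probabilities. Exact fractions are used as dictionary keys here only to avoid floating-point key problems; that is a choice made for this sketch.

```python
from fractions import Fraction
from math import comb

n, p = 5, 0.95  # 5 free throws, 95% shooter

# Distribution of the count X ~ B(5, 0.95)
dist_count = {k: comb(n, k) * p**k * (1 - p) ** (n - k)
              for k in range(n + 1)}

# Distribution of the proportion X / n: same probabilities, relabeled values
dist_prop = {Fraction(k, n): prob for k, prob in dist_count.items()}

print("P(X = 3)       =", dist_count[3])
print("P(p-hat = 3/5) =", dist_prop[Fraction(3, 5)])
```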


Normal Approximation for Counts and Proportions
Let X ~ B(n, p) and $\hat{p} = X/n$. If n is large, then
- X is approx. $N(np,\ np(1-p))$,
- $\hat{p}$ is approx. $N(p,\ p(1-p)/n)$,
where the second parameter is the variance.
Rule of Thumb: $np \ge 10$ and $n(1-p) \ge 10$.

Switches Inspection
A quality engineer selects an SRS of 100 switches from a large shipment for detailed inspection. Unknown to the engineer, 10% of the switches in the shipment fail to meet the specifications. Software tells us that the actual probability that no more than 9 of the switches in the sample fail inspection is $P(X \le 9) = 0.4513$.
What will the normal approximation say?


Switches Inspection
The normal approximation to the probability of no more than 9 bad switches is the area to the left of X = 9 under the normal curve with
$\mu_X = np = (100)(.1) = 10$, $\sigma_X = \sqrt{np(1-p)} = \sqrt{100(.1)(.9)} = 3$.
Using Table A, we have
$P(X \le 9) = P\left(\dfrac{X - 10}{3} \le \dfrac{9 - 10}{3}\right) \approx P(Z \le -.33) = .3707$.
The approximation .3707 to the binomial probability .4513 is not very accurate. In this case np = 10, which only just meets the rule of thumb.


Continuity Correction
The normal approximation is more accurate if we consider X = 9 to extend from 8.5 to 9.5, X = 10 to extend from 9.5 to 10.5, and so on.
Example (cont.):
$P(X \le 9) = P(X \le 9.5) = P\left(\dfrac{X - 10}{3} \le \dfrac{9.5 - 10}{3}\right) \approx P(Z \le -.17) = .4325$.
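The three numbers in the switches example (exact binomial, plain normal approximation, continuity-corrected approximation) can be reproduced with Python's standard library. This is a sketch. Because it does not round z to two decimals the way Table A does, the normal values differ from the slide's in the third decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


n, p = 100, 0.1
mu = n * p                      # 10
sigma = sqrt(n * p * (1 - p))   # 3
normal = NormalDist(mu, sigma)

exact = binom_cdf(9, n, p)      # binomial probability P(X <= 9)
plain = normal.cdf(9)           # area to the left of 9
corrected = normal.cdf(9.5)     # area to the left of 9.5

print(f"exact binomial:        {exact:.4f}")
print(f"normal, no correction: {plain:.4f}")
print(f"normal, corrected:     {corrected:.4f}")
```

The corrected value lands much closer to the exact probability, which is the point of extending X = 9 to the interval from 8.5 to 9.5.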

Continuity Correction
- $P(X \le 8)$ is replaced by $P(X < 8.5)$.
- $P(X \ge 14)$ is replaced by $P(X > 13.5)$.
- $P(X < 8) = P(X \le 7)$ is replaced by $P(X < 7.5)$.
For large n the effect of the continuity correction is very small, and the correction will be omitted.

Coin Tossing Example
Toss a fair coin 200 times. What is the probability that the total number of heads is between 90 and 110?
X = the total number of heads, so X ~ B(200, 0.5). Want: $P(90 \le X \le 110)$.
$\mu_X = 200 \times .5 = 100$, $\sigma_X = (200 \times .5 \times .5)^{1/2} = 7.07$.
With continuity correction:
$P(90 \le X \le 110) = P(X \le 110) - P(X \le 89)$
$\approx P(Z \le (110 + .5 - 100)/7.07) - P(Z \le (89 + .5 - 100)/7.07)$
$= P(Z \le 1.48) - P(Z \le -1.48) = .8611$.
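As a check on the coin tossing calculation, the sketch below compares the continuity-corrected normal approximation with the exact binomial sum. The exact sum is not part of the slide and is included only for comparison. The normal value differs slightly from .8611 because z is not rounded here.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.5
mu = n * p                      # 100
sigma = sqrt(n * p * (1 - p))   # about 7.07
normal = NormalDist(mu, sigma)

# Exact P(90 <= X <= 110) from the binomial formula
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(90, 111))

# Continuity correction: P(X <= 110.5) - P(X <= 89.5)
approx = normal.cdf(110.5) - normal.cdf(89.5)

print(f"exact binomial:           {exact:.4f}")
print(f"normal with correction:   {approx:.4f}")
```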

Normal Approximation for Sample Proportions
Let X ~ B(n, p) and $\hat{p} = X/n$. If n is large, then $\hat{p}$ is approx. $N(p,\ p(1-p)/n)$, where the second parameter is the variance.
Rule of Thumb: $np \ge 10$ and $n(1-p) \ge 10$.

Example
The Laurier company's brand has a market share of 30%. In a survey, 1000 consumers were asked which brand they prefer. What is the probability that more than 32% of the respondents say they prefer the Laurier brand?
Solution: The number of respondents who prefer Laurier is binomial with n = 1000 and p = .30. Also, np = 1000(.3) > 10 and n(1-p) = 1000(1-.3) > 10, so the normal approximation applies.
$P(\hat{p} > .32) = P\left(\dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \dfrac{.32 - .30}{.01449}\right) \approx P(Z > 1.38) = .0838$.
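The Laurier calculation can be retraced step by step in code: check the rule of thumb, compute the standard deviation of the sample proportion, standardize, and take the upper tail. This sketch uses no continuity correction, matching the worked solution, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.30
threshold = 0.32

# Rule of thumb for the normal approximation
rule_ok = n * p >= 10 and n * (1 - p) >= 10

# Standard deviation of the sample proportion
sd_phat = sqrt(p * (1 - p) / n)

# Standardize and take the upper-tail area
z = (threshold - p) / sd_phat
prob = 1 - NormalDist().cdf(z)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"sd of p-hat = {sd_phat:.5f}")       # about 0.01449
print(f"z = {z:.2f}")                        # about 1.38
print(f"P(p-hat > 0.32) = {prob:.4f}")       # about 0.0838
```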

Take Home Message
- Sampling distribution
- Binomial experiments
- Binomial distribution
- Binomial formula
- How to use the Binomial Table
- Sample proportion
- Normal approximation
- Continuity correction