Section 6.3b The Binomial Distribution

We have seen in the previous investigation that binomial distributions can have different shapes. The distributions can range from approximately normal to skewed left or skewed right. Remember that when we describe a distribution, two of the characteristics we are interested in are center and spread, measured for example by the mean and the standard deviation. Since a binomial random variable is discrete, we can calculate its mean and standard deviation using the known formulas for discrete random variables.

1. Recall the table for the binomial distribution from the previous investigation for the number of girls in a family of four.

Number of Girls (x):            0       1       2       3       4
Theoretical Probability P(x):   0.0677  0.2600  0.3747  0.2400  0.0576

Using the formulas for the mean and variance of a discrete probability distribution (and your graphing calculator), find the mean and standard deviation for this binomial distribution.

$\mu_x =$ ______    $\sigma_x =$ ______

The formulas used to find the mean and standard deviation for a discrete random variable work fine but can get tedious, especially if n is large. For binomial distributions the formulas for the mean and variance turn out to be quite simple. We will derive these formulas in the next activity.

2. We will start with a simple binomial setting where the random variable x has two outcomes, 0 or 1. We will consider 1 a success and 0 a failure. Let $\pi$ be the probability of success and q the probability of failure. (Note that $q = 1 - \pi$.) The probability distribution for this variable is given in the table below.

x:      0          1
P(x):   $1 - \pi$  $\pi$

a. Calculate the mean, variance, and standard deviation for this distribution.

$\mu_x =$ ______    $\sigma_x^2 =$ ______    $\sigma_x =$ ______
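As an illustration of the discrete-distribution formulas used in exercise 1, here is a minimal Python sketch. The function name `discrete_mean_sd` and the dictionary layout are assumptions made for this example; only the probabilities come from the table above.

```python
from math import sqrt

# Probability table for the number of girls in a family of four (from the text).
table = {0: 0.0677, 1: 0.2600, 2: 0.3747, 3: 0.2400, 4: 0.0576}

def discrete_mean_sd(dist):
    """Return (mean, standard deviation) of a discrete distribution {x: P(x)}."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, sqrt(variance)

mean, sd = discrete_mean_sd(table)
print(f"mean = {mean:.4f}, sd = {sd:.4f}")
```

Because the table entries are rounded to four decimal places, the results differ very slightly from the values the binomial shortcut formulas give.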

b. Now define a new random variable y, where y is the count of the number of successes in n independent observations. That is, $y = x_1 + x_2 + x_3 + \dots + x_n$, where each $x_i$ is a binomial random variable with the same distribution given in the previous table. Using the formulas for the means and variances of linear combinations of independent random variables, find the mean, variance, and standard deviation for y.

$\mu_y =$ ______    $\sigma_y^2 =$ ______    $\sigma_y =$ ______
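One way the derivation in exercise 2 might be written out is sketched below. It assumes the success probability is denoted $\pi$ and uses only the standard rules for a sum of independent random variables: means add, and variances add.

```latex
\begin{align*}
\mu_x &= 0\,(1-\pi) + 1\cdot\pi = \pi \\
\sigma_x^2 &= (0-\pi)^2(1-\pi) + (1-\pi)^2\,\pi = \pi(1-\pi) \\
\mu_y &= \mu_{x_1} + \mu_{x_2} + \dots + \mu_{x_n} = n\pi \\
\sigma_y^2 &= \sigma_{x_1}^2 + \sigma_{x_2}^2 + \dots + \sigma_{x_n}^2 = n\pi(1-\pi) \\
\sigma_y &= \sqrt{n\pi(1-\pi)}
\end{align*}
```

The variance step relies on the independence of the n observations; without independence the variances would not simply add.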

Mean and Standard Deviation of a Binomial Random Variable

If a count X has the binomial distribution with n trials and probability of success $\pi$, then the mean and the standard deviation of X are

$\mu_x = n\pi$    and    $\sigma_x = \sqrt{n\pi(1-\pi)}$

3. Use the formulas for the mean and standard deviation of a binomial random variable to calculate the mean and standard deviation for the distribution of the count of the number of girls in a family of four children where the probability of a girl is 0.49. Show work.

4. Recall the TV purchasing problem from the previous investigation. Eighty percent of all TVs sold by a large retailer are plasma high definition (HDTV) and twenty percent are HD light emitting diode LCD TVs. The type of TV purchased by each of the next 12 customers will be noted. Let x be the random variable the number of plasma HDTVs purchased by these 12 customers. Find the mean and the standard deviation for this distribution. Show work.
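The shortcut formulas can be checked numerically. The sketch below is an illustration only; the helper name `binomial_mean_sd` and the use of `p` for the success probability are assumptions of this example.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*(1-p)) of a binomial count."""
    return n * p, sqrt(n * p * (1 - p))

# Exercise 3: girls in a family of four, success probability 0.49.
girls_mean, girls_sd = binomial_mean_sd(4, 0.49)

# Exercise 4: plasma HDTVs among the next 12 customers, success probability 0.8.
tv_mean, tv_sd = binomial_mean_sd(12, 0.8)

print(f"girls: mean = {girls_mean:.4f}, sd = {girls_sd:.4f}")
print(f"TVs:   mean = {tv_mean:.4f}, sd = {tv_sd:.4f}")
```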

Binomial Distributions in Statistical Sampling

5. In an attempt to increase sales, a breakfast cereal company decides to offer a NASCAR promotion. Each box of cereal will contain a collectible card featuring one of these NASCAR drivers: Jeff Gordon, Dale Earnhardt, Jr., Tony Stewart, Danica Patrick, or Jimmie Johnson. The company says that each of the 5 cards is equally likely to appear in any box of cereal. Suppose that the company printed 20,000 of each of the 5 cards so that there are 100,000 cereal boxes on the market that contain a card. A NASCAR fan buys 6 boxes of cereal at random. What is the probability of not getting the Danica Patrick card? Let's explore how we might use a binomial distribution to answer this question.

a. What is considered a success in this situation? Define a random variable X to represent the number of successes.

b. What is the probability of success? Is this a binomial setting? Why or why not?

Since we are sampling without replacement (we didn't put the cereal box back on the shelf), the trials (buying cereal boxes) are not independent, so the distribution is not technically binomial. But it is close. Consider that the probability of success for the first trial (not getting the Danica Patrick card) is $80{,}000/100{,}000 = 0.8$. Since we are not replacing, the probability of success on the second trial, given a success on the first, changes to $79{,}999/99{,}999 \approx 0.79999$, still very close to 0.8. This is due to the large population that we are drawing from. So over all 6 trials the probability of success does change, but not significantly. Thus we can treat this as close to a binomial setting and use the binomial rules to calculate the probability of 6 successes in 6 trials.

c. Assuming that X is a binomial random variable, find $P(X = 6)$.
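To see how little the lack of replacement matters here, the following Python sketch compares the binomial value with the exact product of conditional probabilities. The function names are hypothetical; the counts (80,000 non-Patrick cards among 100,000 boxes, 6 boxes bought) come from the problem statement.

```python
def all_success_binomial(p, n):
    """P(all n trials succeed) when trials are independent with probability p."""
    return p ** n

def all_success_without_replacement(successes, population, n):
    """P(all n draws succeed) when drawing without replacement.

    Uses the general multiplication rule: each factor is the conditional
    probability of a success given that all earlier draws were successes.
    """
    prob = 1.0
    for k in range(n):
        prob *= (successes - k) / (population - k)
    return prob

binomial = all_success_binomial(0.8, 6)
exact = all_success_without_replacement(80_000, 100_000, 6)
print(f"binomial: {binomial:.6f}  exact: {exact:.6f}")
```

The two values agree to about four decimal places, which is the sense in which the setting is "close" to binomial.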

d. Calculate P(no Danica Patrick card) using the general multiplication rule (since the events are not independent) and compare this to the result you got in part c.

Let's summarize when we can use a binomial distribution to calculate probabilities.

Sampling Without Replacement Condition

When taking an SRS of size n from a population of size N, we can use a binomial distribution to model the count of successes (X) in the sample as long as $n \le \frac{1}{10}N$, i.e., the population is at least ten times larger than the sample.

6. Hiring Discrimination: It Just Won't Fly! (Sampling without replacement)

An airline has just finished training 25 first officers (15 male and 10 female) to become captains. Unfortunately, only eight captain positions are available. Airline managers decide to use a lottery to determine which pilots will fill the available positions. Of the 8 captains chosen, 5 are female and 3 are male.

a. Explain why the probability that 5 female pilots are chosen in a fair lottery is NOT

$P(X = 5) = \binom{8}{5}(0.40)^5(0.6)^3 = 0.124$

b. What is the correct probability that there are 5 female pilots selected?
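For exercise 6 the sample (8) is far more than one tenth of the population (25), so the condition above fails. One possible way to compute the correct probability is to count equally likely lottery outcomes, as in this sketch; the variable names are assumptions of the example, and the counting approach is a standard one rather than a method stated in the worksheet.

```python
from math import comb

females, males, chosen, female_captains = 10, 15, 8, 5

# Counting approach: choose 5 of the 10 women and 3 of the 15 men,
# out of all ways to choose 8 of the 25 pilots.
exact = (comb(females, female_captains)
         * comb(males, chosen - female_captains)
         / comb(females + males, chosen))

# The (inappropriate) binomial calculation quoted in part a, for comparison.
binomial = comb(chosen, female_captains) * 0.40 ** 5 * 0.60 ** 3

print(f"exact: {exact:.4f}  binomial: {binomial:.4f}")
```

The gap between the two values shows why the binomial model should not be used when the sample is a large fraction of the population.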

7. Are attitudes toward shopping changing? Suppose that 60% of all adult U.S. residents shop for gifts on the internet. Determine the probability that 1520 or more of a random sample of 2500 adult U.S. residents shop for gifts online.

a. Justify that this situation represents a binomial setting.

b. Define a random variable for this problem and indicate the value for $\pi$.

c. In terms of your random variable, express the probability that 1520 or more of the sample shopped online using appropriate statistical notation.

d. Find the probability in part c.

e. Find $P(X = 1520)$ using the formula $P(X = 1520) = \binom{2500}{1520}(0.6)^{1520}(0.4)^{980}$. Did you encounter any difficulties? Find this probability using the built-in functions for binomial distributions on your calculator.

f. Find $P(X \ge 1520)$ using binomcdf if you haven't already.
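The difficulty hinted at in part e also shows up in software: the powers are too small and the binomial coefficient too large for ordinary floating point. The sketch below is one way around that, using Python's exact integers with the success probability written as 3/5; this approach and the variable names are assumptions of the example, not the calculator's method.

```python
from math import comb

n, k_min = 2500, 1520

# Naive floating point runs into trouble: 0.6**1520 is smaller than the
# smallest positive float, so it underflows to 0.0.
naive_power = 0.6 ** 1520

# Exact integer arithmetic: with p = 3/5 and q = 2/5,
# P(X = k) = C(n, k) * 3**k * 2**(n - k) / 5**n.
numerator = sum(comb(n, k) * 3**k * 2**(n - k) for k in range(k_min, n + 1))
tail_prob = numerator / 5**n  # P(X >= 1520)

print(f"0.6**1520 as a float: {naive_power}")
print(f"P(X >= 1520) = {tail_prob:.4f}")
```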

g. By using binomcdf you found that $P(X \ge 1520) \approx 0.2131$. We have seen that binomial distributions can take on various shapes. Sometimes they are approximately normal. Suppose that the histogram below represents the results of the sample survey. Notice that this binomial distribution is approximately normal. Find the mean and standard deviation for this binomial distribution and then approximate $P(X \ge 1520)$ using a normal distribution, assuming it has the same mean and standard deviation as the binomial distribution. Compare this result with the value from part f.
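A minimal sketch of the normal approximation in part g, using `statistics.NormalDist` from the Python standard library. The variable names are assumptions of this example; n = 2500 and the success probability 0.6 come from exercise 7.

```python
from math import sqrt
from statistics import NormalDist

n, p = 2500, 0.6

# Match the normal distribution to the binomial mean and standard deviation.
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Upper-tail area of the matching normal distribution at 1520.
approx = 1 - NormalDist(mu, sigma).cdf(1520)
print(f"mu = {mu:.1f}, sigma = {sigma:.3f}, P(X >= 1520) is about {approx:.4f}")
```

The approximation lands within about 0.01 of the binomcdf value quoted in part g.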

8. From the previous exercise we see that if the binomial distribution is approximately normal, we can use normal distribution calculations to approximate binomial distribution calculations. But when is a binomial distribution normal enough to do this? Recall the TV problem where eighty percent of all TVs sold by a large retailer are plasma high definition (HDTV) and twenty percent are HD light emitting diode LCD TVs. The type of TV purchased by each of the next 12 customers was noted, and x was the random variable the number of plasma HDTVs purchased by these 12 customers. Recall that for this problem situation $\pi = 0.8$. The probability distribution table and histogram for x are shown below. Clearly this distribution is skewed left, so a normal distribution approximation should not be used. But since n = 12 is a small number of observations, we have no difficulty calculating probabilities using the formulas for a binomial distribution or by using binomial calculations on your calculator.

[Histogram: HDTV Purchase]

x     P(x)
0     4.096e-09
1     1.96608e-07
2     4.32538e-06
3     5.76717e-05
4     0.000519045
5     0.00332189
6     0.0155021
7     0.0531502
8     0.132876
9     0.236223
10    0.283468
11    0.206158
12    0.0687195

As you have seen, when a binomial distribution is approximately normal it is reasonable to use a normal distribution to approximate desired probabilities. Since calling a distribution approximately normal is a judgment call on the part of the observer, we will need criteria to establish when it is appropriate to use a normal distribution to approximate probabilities for a binomial distribution.

9. Open the Fathom file Binomial vs Normal. You should see a screen similar to the one shown below.
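The probability table for the TV problem can be reproduced directly from the binomial formula. This is a short sketch; the helper name `binomial_pmf` is an assumption of the example.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P(X = k) for k = 0..n for a binomial count with parameters n, p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

tv_pmf = binomial_pmf(12, 0.8)
for k, prob in enumerate(tv_pmf):
    print(f"{k:2d}  {prob:.6g}")
```

The largest probabilities sit at 9, 10, and 11 with a long tail toward 0, which is the left skew described in the text.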

The histogram is the binomial distribution B(10, 0.932). This distribution is clearly skewed left, and thus a normal distribution is not an appropriate model. The curve drawn in the figure is a normal distribution, and as you can see this curve does not fit the histogram well.

a. Draw a more appropriate curve for this histogram in the figure above.

b. Using the slider bars, change the value of $\pi$ and note when the distribution histogram appears most normal.

c. Now change $\pi$ back to 0.932 and move the slider bar for n to increase its value. What happens to the shape of the binomial distribution as n increases in value?

Based on the results in activity 7, it is clear that the shape of a binomial distribution is dependent on $\pi$ and n. As a rule of thumb, we can use a normal distribution to approximate a binomial distribution when $n\pi \ge 10$ and $n(1-\pi) \ge 10$. When these conditions are met, we may then safely use $N\left(n\pi, \sqrt{n\pi(1-\pi)}\right)$ as the normal approximation for the binomial distribution.

10. a. Do the conditions for the TV purchasing problem from activity 8 allow you to use a normal distribution? Justify your response.

b. Do the conditions for the shopping problem in activity 7 allow you to use a normal distribution? Justify your response.
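The rule of thumb is easy to express as a check. The function below is a hypothetical helper written for this illustration; the three cases use the values of n and the success probability given in the activities.

```python
def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb: both expected successes and expected failures >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# TV purchasing problem: n = 12, success probability 0.8
tv_ok = normal_approx_ok(12, 0.8)

# Shopping problem: n = 2500, success probability 0.6
shopping_ok = normal_approx_ok(2500, 0.6)

# Fathom example: B(10, 0.932)
fathom_ok = normal_approx_ok(10, 0.932)

print(tv_ok, shopping_ok, fathom_ok)
```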

11. One way of checking the effect of undercoverage, nonresponse, and other sources of error in a sample survey is to compare the sample with known facts about the population. About 12% of American adults identify themselves as black. Suppose we take an SRS of 1500 American adults and let X be the number of blacks in the sample.

a. Justify that this is a binomial setting.

b. Check the conditions for using a normal approximation in this setting.

c. Calculate $P(165 \le X \le 195)$ using the normal approximation and interpret it in the context of this problem situation.

d. Calculate $P(165 \le X \le 195)$ using binomcdf on your calculator and compare it to your answer in part c.
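Parts c and d can be compared side by side in software. The sketch below assumes the interval is inclusive at both ends (165 ≤ X ≤ 195) and writes the success probability 0.12 as the fraction 3/25 so the binomial sum can be done in exact integers; both choices, and all variable names, are assumptions of this example.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1500, 0.12

# Conditions for the normal approximation: n*p and n*(1-p) both at least 10.
conditions_met = n * p >= 10 and n * (1 - p) >= 10

# Part c: normal approximation with the binomial mean and standard deviation.
mu, sigma = n * p, sqrt(n * p * (1 - p))
normal = NormalDist(mu, sigma)
approx = normal.cdf(195) - normal.cdf(165)

# Part d: exact binomial probability, using p = 3/25 and 1 - p = 22/25.
numerator = sum(comb(n, k) * 3**k * 22 ** (n - k) for k in range(165, 196))
exact = numerator / 25**n

print(f"normal approximation: {approx:.4f}  exact binomial: {exact:.4f}")
```

The two values differ by roughly 0.01 to 0.02, with the normal approximation the smaller of the two.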
