Probability Models. Grab a copy of the notes on the table by the door


Grab a copy of the notes on the table by the door

Bernoulli Trials
Suppose a cereal manufacturer puts pictures of famous athletes in boxes of cereal, in the hope of increasing sales. The manufacturer announces that 20% of the boxes contain a picture of LeBron James, 30% a picture of Danica Patrick, and the rest a picture of Serena Williams. It's possible to simulate the number of boxes we'd need to open to get one of each card. That's a fairly complex question and one well suited for simulation. But many important questions can be answered more directly by using simple probability models. (Stats: Modeling the World, 4e)

Bernoulli Trials: Searching for LeBron
Suppose you're a huge LeBron James fan. You don't care about completing the whole sports card collection, but you've just got to have LeBron's picture. How many boxes do you expect you'll have to open before you find him? This isn't the same question we asked before, but this situation is simple enough for a probability model.

Bernoulli Trials: Searching for LeBron
We'll keep the assumption that pictures are distributed at random and we'll trust the manufacturer's claim that 20% of the cards are LeBron. So, when you open the box, the probability that you succeed in finding LeBron is 0.20.

Bernoulli Trials: Searching for LeBron
Now we'll call the act of opening each box a trial, and note that:
1. There are only two possible outcomes (called success and failure) on each trial. What are those outcomes for this example? Getting LeBron (success) or not (failure).
2. In advance, the probability of success, denoted p, is the same on every trial. Here p = 0.20 for each box.
3. Finding LeBron in the first box does not change what might happen when you reach the next box. What do we call these kinds of events? The trials are independent.

Bernoulli Trials: Searching for LeBron
Situations like this occur often and are called Bernoulli trials. Common examples of Bernoulli trials include tossing a coin, looking for defective products rolling off an assembly line, or even shooting free throws in basketball games. Just as we found equally likely random digits to be the building blocks for our simulation, we can use Bernoulli trials to build a wide variety of useful probability models.

Bernoulli Trials: Searching for LeBron
Back to finding LeBron. We want to know how many boxes we'll need to open to find his card. Let's call this random variable Y = # boxes, and build a probability model for it. What's the probability you find his picture in the first box of cereal? It's 20%, of course. We could write P(Y = 1) = 0.20. Why do you suppose Y = 1? Discuss with your partners.

Bernoulli Trials: Searching for LeBron
How about the probability you don't find LeBron until the second box? What does this mean? Well, that means you fail on the first trial and then succeed on the second. With the probability of success 20%, the probability of failure, denoted q, is $1 - 0.2 = 0.8$, or 80%. Since the trials are independent, the probability of getting your first success on the second trial is $P(Y = 2) = (0.8)(0.2) = 0.16$.

Bernoulli Trials: Searching for LeBron
[Comic strip: Calvin and Hobbes, 1993, Bill Watterson, courtesy of http://www.gocomics.com/calvinandhobbes/2013/04/16]

Bernoulli Trials: Searching for LeBron
Of course, you could have a run of bad luck. Maybe you won't find LeBron until the fifth box of cereal. What are the chances of that? You'd have to fail 4 straight times and then succeed, so $P(Y = 5) = (0.8)^4(0.2) = 0.08192$.

Bernoulli Trials: Searching for LeBron
How many boxes might you expect to have to open? We could reason that since LeBron's picture is in 20% of the boxes, or 1 in 5, we expect to find his picture, on average, in the fifth box; that is, $E(Y) = \frac{1}{0.2} = 5$ boxes. That's correct, but not easy to prove.
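The expected value can be checked informally by simulation. The sketch below is only an illustration, not part of the original notes: the function names `boxes_until_lebron` and `simulate`, the seed, and the number of repetitions are all assumptions chosen for the example. It opens imaginary boxes until the first success, many times over, and reports the average wait along with how often the first success lands on the fifth box.

```python
import random

def boxes_until_lebron(p, rng):
    """Open boxes until the first success; return how many boxes were opened."""
    count = 1
    while rng.random() >= p:  # a failure happens with probability 1 - p
        count += 1
    return count

def simulate(p=0.2, trials=100_000, seed=1):
    """Repeat the search many times; return (average wait, share of waits equal to 5)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = [boxes_until_lebron(p, rng) for _ in range(trials)]
    mean = sum(results) / trials
    share_fifth = results.count(5) / trials
    return mean, share_fifth

mean, share_fifth = simulate()
print(f"average boxes opened: {mean:.2f}")                      # should land near 5
print(f"share of searches ending at box 5: {share_fifth:.4f}")  # should land near 0.08
```

A simulation only suggests the answer; it does not prove that the mean is exactly 5.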

10% Rule
One of the important requirements for Bernoulli trials is that the trials be independent. Sometimes that's a reasonable assumption: when tossing a coin or rolling a die, for example. But that becomes a problem when (often!) we're looking at situations involving samples chosen without replacement. We said that whether we find a LeBron James card in one box has no effect on the probabilities in other boxes. This is almost true.

10% Rule
Technically, if exactly 20% of the boxes had LeBron James cards, then when you find one, you've reduced the number of remaining LeBron cards. With a few million boxes of cereal, though, the difference is hardly worth mentioning. But if you knew there were 2 LeBron James cards hiding in the 10 boxes of cereal on the market shelf, then finding one in the first box would clearly change your chances of finding his picture in the next box.

10% Rule
If we had an infinite number of boxes, there wouldn't be a problem. It's selecting from a finite population that causes the probabilities to change, making the trials not independent. Obviously, taking 2 out of 10 boxes changes the probability. Taking even a few hundred out of millions, though, makes very little difference. Fortunately, it turns out that if we look at less than 10% of the population, we can pretend that the trials are independent and still calculate probabilities that are quite accurate.

10% Rule
That's our 10% rule of thumb:
The 10% Condition: Bernoulli trials must be independent. If that assumption is violated, it is still okay to proceed as long as we randomly sample fewer than 10% of the population.
There is a formula that can adjust for even larger samples, called the finite population correction, but it's beyond the scope of this course.
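To see how quickly the dependence fades as the population grows, here is a small sketch. The function name `p_next_after_find` and the population sizes are assumptions made up for this illustration. It computes the chance that the second box holds LeBron given that the first one did, when boxes are drawn without replacement from a shelf where exactly 20% of the boxes are LeBron boxes.

```python
def p_next_after_find(population, share=0.2):
    """P(second box has LeBron | first box had LeBron), drawing without replacement."""
    lebron_boxes = round(population * share)  # boxes that start out with LeBron
    # One LeBron box and one box overall have been used up by the first draw.
    return (lebron_boxes - 1) / (population - 1)

for population in (10, 1_000, 1_000_000):
    print(population, round(p_next_after_find(population), 6))
```

With 10 boxes the conditional probability drops to 1/9, well below 0.20; with a million boxes it is indistinguishable from 0.20 for practical purposes.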

1. Have your calculator ready
2. Grab a copy of the notes on the table by the door
3. Yesterday's notes are on the table with the sharpener

Bernoulli Trials & 10% Quick Review
A trial is Bernoulli if
1. There are two possible outcomes
2. The probability of success is constant
3. The trials are independent*
*What if the trials are not truly independent? We can still proceed so long as the random sample is smaller than 10% of the total population.

The Geometric Model: Waiting for Success
We want to model how long it will take to achieve the first success in a series of Bernoulli trials. The model that tells us this probability is called the Geometric probability model. Geometric models are completely specified by one parameter, p, the probability of success, and are denoted by Geom(p). Since achieving the first success on trial number x requires first experiencing $x - 1$ failures, the probabilities are expressed by a simple formula.

The Geometric Model: Waiting for Success
Geometric Probability Model for Bernoulli Trials: Geom(p)
p = probability of success (and q = 1 - p = probability of failure)
X = number of trials until the first success occurs
$P(X = x) = q^{x-1} p$
Expected value: $E(X) = \mu = \frac{1}{p}$
Standard deviation: $\sigma = \sqrt{\frac{q}{p^2}}$
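For readers who want to see where the expected value comes from, one standard derivation (not given in the original slides, and relying on differentiating a geometric series term by term) runs as follows, using the same symbols p and q = 1 - p:

```latex
\begin{aligned}
E(X) &= \sum_{x=1}^{\infty} x\, q^{x-1} p
      = p \,\frac{d}{dq} \sum_{x=0}^{\infty} q^{x}
      = p \,\frac{d}{dq}\!\left(\frac{1}{1-q}\right) \\
     &= \frac{p}{(1-q)^{2}}
      = \frac{p}{p^{2}}
      = \frac{1}{p}
\end{aligned}
```

With p = 0.2 this gives 1/0.2 = 5, matching the cereal-box reasoning.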

The Geometric Model: Waiting for Success
Example: Spam and the Geometric Model
Postini is a global company specializing in communications security. The company monitors over 1 billion Internet messages per day and recently reported that 91% of e-mails are spam! Let's assume that your e-mail is typical: 91% spam. We'll also assume you aren't using a spam filter, so every message gets dumped in your inbox (ugh!). And, since spam comes from many different sources, we'll consider your messages to be independent.
Question: Overnight your inbox collects email. When you first check your email in the morning, about how many emails should you expect to have to wade through before you find a real message? What's the probability that the 4th message in your inbox is the first one that isn't spam?

The Geometric Model: Waiting for Success
Example: Spam and the Geometric Model
Answer: When I check my emails one-by-one:
1. There are two possible outcomes each time: a real message (success) or spam (failure).
2. Since 91% of all emails are spam, the probability of success is $p = 1 - 0.91 = 0.09$.
3. My messages arrive in random order from many different sources and are far fewer than 10% of all email messages. I can treat them as independent.

The Geometric Model: Waiting for Success
Example: Spam and the Geometric Model
Answer: Let X = the number of emails I'll check until I find a real message. I can use the model Geom(0.09).
$E(X) = \frac{1}{p} = \frac{1}{0.09} \approx 11.1$
$P(X = 4) = (0.91)^3(0.09) \approx 0.0678$
On average, I expect to have to check just over 11 emails before I find a real message. There's slightly less than a 7% chance that my first real message will be the 4th one I check.
Note that this probability calculation isn't new. It's simply the Multiplication Rule used to find P(spam, then spam, then spam, then real).

The Geometric Model: Waiting for Success
Step-by-Step: Working with a Geometric Model
People with O-negative blood are called universal donors because O-negative blood can be given to anyone else, regardless of the recipient's blood type. Only about 6% of people have O-negative blood.
Questions:
1. If donors line up at random for a blood drive, how many do you expect to examine before you find someone who has O-negative blood?
2. What's the probability that the first O-negative donor found is one of the first four people in line?

The Geometric Model: Waiting for Success
Think
Plan (state the questions): I want to estimate how many people I'll need to check to find an O-negative donor, and the probability that 1 of the first 4 people is O-negative.
Check to see that these are Bernoulli trials:
1. There are two outcomes: success = O-negative and failure = other blood types.
2. The probability of success for each person is p = 0.06, because they lined up randomly.
3. 10% Condition: Trials aren't independent because the population is finite, but the donors lined up are fewer than 10% of all possible donors.
Variable (define the random variable): Let X = number of donors until one is O-negative.
Model (specify the model): I can model X with Geom(0.06).

The Geometric Model: Waiting for Success
Show
Mechanics: Find the mean.
$E(X) = \frac{1}{0.06} \approx 16.7$
Calculate the probability of success on one of the first four trials. That's the probability that X = 1, 2, 3, or 4.
$P(X \le 4) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = 0.06 + (0.94)(0.06) + (0.94)^2(0.06) + (0.94)^3(0.06) \approx 0.2193$

The Geometric Model: Waiting for Success
Tell
Conclusion (interpret your results in context): Blood drives such as this one can expect to examine an average of 16.7 people to find a universal donor. About 22% of the time there will be one within the first 4 people in line.

The Geometric Model: Waiting for Success
TI Tips: Finding Geometric Probabilities
Your TI knows the geometric model. The commands to calculate probability distributions are found in the 2nd DISTR menu. Have a look. After many others (Yes, there's still more to learn!) you'll see two Geometric probability functions at the bottom of the list.

The Geometric Model: Waiting for Success
TI Tips: Finding Geometric Probabilities
geometpdf(
The pdf stands for probability density function. This command allows you to find the probability of an individual outcome. You need only specify p, which defines the Geometric model, and x, which indicates the number of trials until you get a success. The format is geometpdf(p,x).
For example, suppose we want to know the probability that we find our first LeBron James picture in the fifth box of cereal. Since LeBron is in 20% of the boxes, we enter geometpdf(0.2,5) and hit ENTER. What does the calculator give us? 0.08192

The Geometric Model: Waiting for Success
TI Tips: Finding Geometric Probabilities
geometcdf(
The cdf stands for cumulative density function, meaning that it finds the sum of the probabilities of several possible outcomes. In general, the command geometcdf(p,x) calculates the probability of finding the first success on or before the xth trial.
Let's find the probability of getting a LeBron James picture by the time we open the fourth box of cereal; in other words, the probability our first success comes on the first box, or the second, or the third, or the fourth. Again we specify p = 0.2, and now x = 4. So, using geometcdf(.2,4) gives us a probability of 0.5904.
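For anyone without a TI at hand, the two commands can be imitated in a few lines of Python. This is only a sketch: the function names are borrowed from the calculator for readability, but these are hypothetical stand-ins written for this example, not the calculator's own routines.

```python
def geometpdf(p, x):
    """P(first success occurs exactly on trial x): x - 1 failures, then a success."""
    return (1 - p) ** (x - 1) * p

def geometcdf(p, x):
    """P(first success occurs on or before trial x): add up trials 1 through x."""
    return sum(geometpdf(p, k) for k in range(1, x + 1))

print(round(geometpdf(0.2, 5), 5))   # first LeBron in the fifth box
print(round(geometcdf(0.2, 4), 4))   # first LeBron within four boxes
print(round(1 - 0.8 ** 4, 4))        # shortcut: 1 - P(four straight failures)
```

The last line uses the complement: getting a success within x trials is the same as not failing x times in a row, so the cumulative probability also equals 1 - q^x.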

Geometric Model Quick Review
A Geometric model is used for what purpose? Calculating the number of Bernoulli trials that need to be performed until the first success.
What are the two Geometric probability functions our calculators perform and what do they find?
1. geometpdf (probability density function): calculates the probability that the first success occurs on a particular trial.
2. geometcdf (cumulative density function): calculates the probability that the first success occurs anywhere in the first x trials.

The Binomial Model: Counting Success
We can use the Bernoulli trials to answer other common questions. Suppose you buy 5 boxes of cereal. What's the probability you get exactly 2 pictures of LeBron James? Before, we asked how long it would take until our first success. Now, we want to find the probability of getting 2 successes among the 5 trials. We are still talking about Bernoulli trials, but we're asking a different question.

The Binomial Model: Counting Success
This time we're interested in the number of successes in the 5 trials, so we'll call it X = number of successes. We want to find P(X = 2). This is an example of a Binomial probability. It takes two parameters to define this Binomial model: the number of trials, n, and the probability of success, p. We denote this model Binom(n, p). In this example, n = 5 trials, and p = 0.2, the probability of finding a LeBron James card in any trial.

The Binomial Model: Counting Success
Exactly 2 successes in 5 trials means 2 successes and 3 failures. How do you suppose we might find this with what we already know? It seems logical that the probability should be $(0.2)^2(0.8)^3$. Too bad: it's not that easy. What does that calculation give you? That calculation would give you the probability of finding LeBron in the first 2 boxes and not in the next 3, in that order.

The Binomial Model: Counting Success
Couldn't you also find LeBron in the third and fifth boxes and still have 2 successes? What would this probability calculation look like? $(0.8)(0.8)(0.2)(0.8)(0.2)$, or $(0.2)^2(0.8)^3$. Hmmm, the same result from two seemingly different situations. In fact, the probability will always be the same, no matter what order the successes and failures occur in. Anytime we get 2 successes in 5 trials, in any one particular order, the probability will be $(0.2)^2(0.8)^3$.

The Binomial Model: Counting Success
What do you suppose we could do to find the answer to our "exactly two LeBron cards" question? We just need to count all the possible orders in which the outcomes can occur.

The Binomial Model: Counting Success
That could potentially be a lot of work. Fortunately, these possible orders are disjoint. (For example, if your two successes came on the first two trials, they couldn't come on the last two.) So we could use the Addition Rule to add up the probabilities, but since they are all the same, we really only need to know how many orders are possible. For small n, we can just make a tree diagram and count the branches.

The Binomial Model: Counting Success
For larger numbers this isn't practical. Fortunately, there's a formula for that. Each different order in which we can have k successes in n trials is called a combination. The total number of ways that can happen is written $\binom{n}{k}$ or ${}_nC_k$, pronounced "n choose k."

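The Binomial Model: Counting Success
${}_nC_k = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, where $n!$ (pronounced "n factorial") $= n \times (n-1) \times (n-2) \times \cdots \times 1$
For 2 successes in 5 trials,
$\binom{5}{2} = \frac{5!}{2!\,(5-2)!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(2 \times 1)(3 \times 2 \times 1)} = \frac{5 \times 4}{2 \times 1} = 10$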
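The count of 10 can be confirmed by brute force. The sketch below (an illustration only; the variable names are made up for this example) lists every way to choose which 2 of the 5 boxes hold LeBron, then compares that count with the factorial formula and with Python's built-in `math.comb`.

```python
from itertools import combinations
from math import comb, factorial

n, k = 5, 2

# Each tuple names the two box positions (1 through 5) where LeBron turns up.
orders = list(combinations(range(1, n + 1), k))
for order in orders:
    print(order)

by_counting = len(orders)
by_formula = factorial(n) // (factorial(k) * factorial(n - k))
print(by_counting, by_formula, comb(n, k))  # all three should agree
```

Listing the orders works only for small n; the formula (or `math.comb`) is what scales.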

The Binomial Model: Counting Success
So, there are 10 ways to get 2 LeBron pictures in 5 boxes, and the probability of each is $(0.2)^2(0.8)^3$. Now we can find what we wanted:
$P(\#\text{successes} = 2) = 10(0.2)^2(0.8)^3 = 0.2048$
In general, the probability of exactly k successes in n trials is $\binom{n}{k} p^k q^{n-k}$.
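The general formula translates directly into code. In this sketch the helper name `binom_pmf` is an assumption chosen for the example; it multiplies the number of orders by the probability of any one order, then tabulates the whole Binom(5, 0.2) model.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(exactly k successes in n trials) = (n choose k) * p^k * q^(n-k)."""
    q = 1 - p
    return comb(n, k) * p ** k * q ** (n - k)

# Probabilities of 0, 1, ..., 5 LeBron pictures in 5 boxes.
probs = [binom_pmf(5, 0.2, k) for k in range(6)]
for k, prob in enumerate(probs):
    print(k, round(prob, 5))

print("total:", round(sum(probs), 10))  # a probability model must sum to 1
```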

The Binomial Model: Counting Success
It's not hard to find the expected value for a binomial random variable. If we have 5 boxes, and LeBron's picture is in 20% of them, then we would expect to have 5(0.2) = 1 success. If we had 100 trials with probability of success 0.2, how many successes would you expect? Can you think of any reason not to say 20? It seems so simple that most people wouldn't even stop to think about it. You just multiply the probability of success by n, or E(X) = np.

The Binomial Model: Counting Success
Not fully convinced? A binomial model simply counts the number of successes in a series of n independent Bernoulli trials. Let $X_i$ be the number of successes (1 or 0) on trial i, so that $E(X_i) = p$, and let
$Y = X_1 + X_2 + X_3 + \cdots + X_n$
$E(Y) = E(X_1 + X_2 + X_3 + \cdots + X_n)$
$E(Y) = E(X_1) + E(X_2) + E(X_3) + \cdots + E(X_n)$
$E(Y) = p + p + p + \cdots + p$ (there are n terms)
So, as we thought, the mean is $E(Y) = np$.

The Binomial Model: Counting Success
The standard deviation is less obvious: you can't just rely on your intuition. Why? Since the trials are independent, the Pythagorean Theorem of Statistics tells us that the variances add:
$Var(Y) = Var(X_1 + X_2 + X_3 + \cdots + X_n)$
$Var(Y) = Var(X_1) + Var(X_2) + Var(X_3) + \cdots + Var(X_n)$
$Var(Y) = pq + pq + pq + \cdots + pq$ (again, n terms)
$Var(Y) = npq$
Voilà! The standard deviation is $SD(Y) = \sqrt{npq}$.
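As a numerical sanity check on np and the square root of npq, the sketch below computes the mean and standard deviation straight from the definition, by summing over every possible count, and compares them with the shortcut formulas. The function name `binom_mean_sd` and the choice of Binom(5, 0.2) are assumptions for this illustration.

```python
from math import comb, sqrt

def binom_mean_sd(n, p):
    """Mean and SD of Binom(n, p), computed by summing over all outcomes 0..n."""
    q = 1 - p
    pmf = [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, sqrt(variance)

n, p = 5, 0.2
mean, sd = binom_mean_sd(n, p)
print("from the definition:", round(mean, 6), round(sd, 6))
print("from the formulas:  ", n * p, round(sqrt(n * p * (1 - p)), 6))
```

The two lines should agree up to floating-point rounding.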

The Binomial Model: Counting Success
For Example: Spam and the Binomial Model
Recap: The communications monitoring company Postini has reported that 91% of email messages are spam. Suppose your inbox contains 25 messages.
Question: What are the mean and standard deviation of the number of real messages you should expect to find in your inbox? What's the probability that you'll find only 1 or 2 real messages?

The Binomial Model: Counting Success
For Example: Spam and the Binomial Model
Answer: I assume that messages arrive independently and at random, with the probability of success (a real message) $p = 1 - 0.91 = 0.09$. Let X = the number of real messages among 25. I can use the Binomial model Binom(25, 0.09).
$E(X) = np = 25(0.09) = 2.25$
$SD(X) = \sqrt{npq} = \sqrt{25(0.09)(0.91)} \approx 1.43$

The Binomial Model: Counting Success
For Example: Spam and the Binomial Model
Answer:
$P(X = 1 \text{ or } 2) = P(X = 1) + P(X = 2) = \binom{25}{1}(0.09)^1(0.91)^{24} + \binom{25}{2}(0.09)^2(0.91)^{23}$
$P(X = 1 \text{ or } 2) \approx 0.2340 + 0.2777 = 0.5117$
Among 25 email messages, I expect to find an average of 2.25 that aren't spam, with a standard deviation of 1.43 messages. There's just over a 50% chance that 1 or 2 of my 25 emails will be real messages.

The Binomial Model: Counting Success
Step-by-Step: Working with a Binomial Model
Suppose 20 donors come to a blood drive. Recall that 6% of people are universal donors.
Questions:
1. What are the mean and standard deviation of the number of universal donors among them?
2. What is the probability that there are 2 or 3 universal donors?

The Binomial Model: Counting Success
Think
Plan (state the questions): I want to know the mean and standard deviation of the number of universal donors among 20 people and the probability that there are 2 or 3 of them.
Check to see that these are Bernoulli trials:
1. There are two outcomes: success = O-negative and failure = other blood types.
2. The probability of success for each person is p = 0.06, because they lined up randomly.
3. 10% Condition: Trials aren't independent because the population is finite, but the donors lined up are fewer than 10% of all possible donors.
Variable (define the random variable): Let X = number of O-negative donors among n = 20 people.
Model (specify the model): I can model X with Binom(20, 0.06).

The Binomial Model: Counting Success
Show
Mechanics: Find the expected value and standard deviation.
$E(X) = np = 20(0.06) = 1.2$
$SD(X) = \sqrt{npq} = \sqrt{20(0.06)(0.94)} \approx 1.06$
Calculate the probability.
$P(X = 2 \text{ or } 3) = P(X = 2) + P(X = 3) = \binom{20}{2}(0.06)^2(0.94)^{18} + \binom{20}{3}(0.06)^3(0.94)^{17} \approx 0.2246 + 0.0860 = 0.3106$

The Binomial Model: Counting Success
Tell
Conclusion (interpret your results in context): In groups of 20 randomly selected blood donors, I expect to find an average of 1.2 universal donors, with a standard deviation of 1.06. About 31% of the time, I'd expect to find 2 or 3 universal donors among the 20 people.

The Binomial Model: Counting Success
TI Tips: Finding Binomial Probabilities
Remember how the calculator handles Geometric probabilities? Well, the commands for finding Binomial probabilities are essentially the same. Again, you'll find them in the 2nd DISTR menu.

The Binomial Model: Counting Success
TI Tips: Finding Binomial Probabilities
binompdf(
This probability density function allows you to find the probability of an individual outcome. You need to define the Binomial model by specifying n and p, and then indicate the desired number of successes, x. The format is binompdf(n,p,x).
For example, recall that LeBron James's picture is in 20% of the cereal boxes. Suppose that we want to know the probability of finding LeBron exactly twice among 5 boxes of cereal. We enter binompdf(5,0.2,2), then press ENTER and get 0.2048. There's about a 20% chance of getting 2 LeBron pictures in 5 boxes of cereal.

The Binomial Model: Counting Success
TI Tips: Finding Binomial Probabilities
binomcdf(
Need to add several Binomial probabilities? To find the total probability of getting x or fewer successes among n trials, use the cumulative Binomial density function binomcdf(n,p,x).
For example, suppose we have ten boxes of cereal and wonder about the probability of finding up to 4 pictures of LeBron. That's the probability of 0, 1, 2, 3, or 4 successes, so using binomcdf(10,0.2,4) in the calculator we get 0.9672. Pretty likely!

The Binomial Model: Counting Success
TI Tips: Finding Binomial Probabilities
Of course "up to 4" allows for the possibility that we end up with none. What's the probability we get at least 4 pictures of LeBron in 10 boxes? Well, "at least 4" means "not 3 or fewer." That's the complement of 0, 1, 2, or 3 successes. What would we type in the calculator to find this probability? 1 - binomcdf(10,0.2,3) ≈ 0.1209. There's about a 12% chance we'll find at least 4 pictures of LeBron in 10 boxes of cereal.
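The binomial commands can be mimicked the same way as the geometric ones. In the sketch below the names `binompdf` and `binomcdf` are borrowed from the calculator for readability, but they are hypothetical Python stand-ins written for this example, not the TI's own code.

```python
from math import comb

def binompdf(n, p, x):
    """P(exactly x successes in n trials)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomcdf(n, p, x):
    """P(x or fewer successes in n trials): add up 0 through x successes."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

print(round(binompdf(5, 0.2, 2), 4))       # exactly 2 LeBrons in 5 boxes
print(round(binomcdf(10, 0.2, 4), 4))      # up to 4 LeBrons in 10 boxes
print(round(1 - binomcdf(10, 0.2, 3), 4))  # at least 4: complement of 3 or fewer
```

Note that "at least 4" uses x = 3 in the cumulative function, because the complement of "4 or more" is "3 or fewer."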

Just Checking
The Pew Research Center reports that they are only able to contact 76% of randomly selected households drawn for telephone surveys. Suppose a pollster has a list of 12 calls to make.
a) Why can these phone calls be considered Bernoulli trials?
b) Find the probability that the fourth call is the first one that makes contact.
c) Find the expected number of successful calls out of the 12.
d) Find the standard deviation of the number of successful calls.
e) Find the probability that exactly 9 of the 12 calls are successful.
f) Find the probability that at least 9 of the calls are successful.

The Normal Model to the Rescue
Suppose the Tennessee Red Cross anticipates the need for at least 1850 units of O-negative blood this year. It estimates that it will collect blood from 32,000 donors. How great is the risk that the Tennessee Red Cross will fall short of meeting its need? We've just learned how to calculate such probabilities. We can use the Binomial model with n = 32,000 and p = 0.06. The probability of getting exactly 1850 units of O-negative blood from 32,000 donors is given by the Binomial formula with k = 1850.

The Normal Model to the Rescue
$\binom{32000}{1850}(0.06)^{1850}(0.94)^{30150}$
No calculator on earth can calculate that first term (it has more than 100,000 digits). And that's just the beginning. The problem said at least 1850, so we have to do it again for 1851, for 1852, and all the way up to 32,000. YIKES!

The Normal Model to the Rescue
When we're dealing with a large number of trials like this, making direct calculations of the probabilities becomes tedious (or outright impossible). Here an old friend, the Normal model, comes to the rescue. YAY! The Binomial model has mean $np = 1920$ and standard deviation $\sqrt{npq} \approx 42.48$. We could try approximating its distribution with a Normal model, using the same mean and standard deviation. Remarkably enough, that turns out to be a very good approximation. We'll see why in the next unit.

The Normal Model to the Rescue
With that approximation, we can find the probability:
$P(X < 1850) = P\left(z < \frac{1850 - 1920}{42.48}\right) \approx P(z < -1.65) \approx 0.05$
There seems to be about a 5% chance that this Red Cross chapter will run short of O-negative blood.
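A computer can get around the overflow problem by working with logarithms, which makes it possible to compare the Normal approximation with the Binomial sum itself. The sketch below is an illustration under stated assumptions: the helper names are invented for this example, `math.lgamma` is used for the log of each factorial, and the Normal probability comes from the error function rather than a z-table.

```python
from math import erf, exp, lgamma, log, sqrt

def log_binom_pmf(n, p, k):
    """Natural log of P(X = k) for Binom(n, p); lgamma(m + 1) equals log(m!)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)

def binom_prob_below(n, p, limit):
    """P(X < limit), summing the Binomial probabilities for 0 .. limit - 1."""
    return sum(exp(log_binom_pmf(n, p, k)) for k in range(limit))

def normal_cdf(z):
    """P(Z < z) for a standard Normal model."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p, need = 32_000, 0.06, 1850
mu = n * p                       # 1920
sigma = sqrt(n * p * (1 - p))    # about 42.48

exact = binom_prob_below(n, p, need)
approx = normal_cdf((need - mu) / sigma)
print("Binomial sum:        ", round(exact, 4))
print("Normal approximation:", round(approx, 4))
```

Both numbers should come out close to 0.05, which is the sense in which the Normal model "rescues" the calculation.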

The Normal Model to the Rescue
Can we always use a Normal model to make estimates of Binomial probabilities? No. Consider the LeBron James situation: pictures in 20% of the cereal boxes. If we buy five boxes, the actual Binomial probabilities that we get 0, 1, 2, 3, 4, or 5 pictures of LeBron are 33%, 41%, 20%, 5%, 1%, and 0.03% respectively. The histogram clearly shows this probability model is skewed, thus we should not try to estimate these probabilities by using a Normal model.
[Histogram: Binom(5, 0.2), number of LeBron pictures from 0 to 5]

The Normal Model to the Rescue
Now suppose we open 50 boxes of this cereal and count the number of LeBron pictures we find. The histogram shows this probability model. It is centered at np = 50(0.2) = 10 pictures, as expected. It appears to be fairly symmetric around that center.
[Histogram: Binom(50, 0.2), number of LeBron pictures from 0 to 50]

The Normal Model to the Rescue
This third histogram shows Binom(50, 0.2) magnified somewhat and centered at the expected value of 10 pictures of LeBron. It looks close to Normal, for sure. With this larger sample size, it appears that a Normal model might be a useful approximation.
[Histogram: Binom(50, 0.2), magnified, number of LeBron pictures from 0 to 21]

The Normal Model to the Rescue
A Normal model, then, is a close enough approximation only for a large enough number of trials. And what we mean by "large enough" depends on the probability of success. We'd need a larger sample if the probability of success were very low (or very high). It turns out that a Normal model works pretty well if we expect to see at least 10 successes and 10 failures. That is, we check the Success/Failure Condition.
The Success/Failure Condition: A Binomial model is approximately Normal if we expect at least 10 successes and 10 failures:
$np \ge 10$ and $nq \ge 10$
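The condition is easy to check mechanically. In this sketch the function name `success_failure_ok` and the `threshold` parameter are assumptions made for the example; the three cases are the ones discussed in these notes.

```python
def success_failure_ok(n, p, threshold=10):
    """True when both expected successes (np) and expected failures (nq) reach the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

# (n, p) pairs: 5 cereal boxes, 50 cereal boxes, 32,000 blood donors
for n, p in ((5, 0.2), (50, 0.2), (32_000, 0.06)):
    print(n, p, success_failure_ok(n, p))
```

Five boxes fail the check (only 1 expected success), while 50 boxes and 32,000 donors pass it.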