Ch 8.1 The Binomial Distributions

The Binomial Setting

A situation where these four conditions are satisfied is called a binomial setting.
1. Each observation falls into one of just two categories, which we call success or failure.
2. There is a fixed number n of observations.
3. The n observations are all independent.
4. The probability of success, call it p, is the same for each observation.

If data are produced in a binomial setting, then the random variable X = number of successes is called a binomial random variable. The probability distribution of X is called a binomial distribution.

The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p. Abbreviated, we say X is B(n, p). The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n.

Example: Binomial Setting
o Indicate whether a binomial distribution is a reasonable model for the random variable X. Give your reasons in each case.
o A manufacturer produces a large number of toasters. From past experience, the manufacturer knows that approximately 2% are defective. In a quality control procedure, we randomly select 20 toasters for testing. We want to determine the probability that no more than one of these toasters is defective.
o Draw a card from a standard deck of 52 playing cards, observe the card, and replace the card within the deck. Count the number of times you draw a card in this manner until you observe a jack.

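
The contrast between the two scenarios can be sketched with a small simulation. This is only an illustration, not part of the notes: the function names count_defective and draws_until_jack are hypothetical, and the toaster trials are treated as independent, which is an approximation that is reasonable when 20 toasters are sampled from a very large production run. The first function always runs a fixed number of trials and counts successes; the second has no fixed n, because it stops at the first success.

```python
import random

def count_defective(n=20, p=0.02, rng=random):
    """Fixed n trials, count the successes (defective toasters).

    Assumes independent trials with the same p, i.e. a binomial setting.
    """
    return sum(rng.random() < p for _ in range(n))

def draws_until_jack(rng=random):
    """Number of draws (with replacement) until the first jack appears.

    p stays at 4/52 and draws are independent, but the number of
    trials is not fixed in advance, so the count is not binomial.
    """
    p_jack = 4 / 52
    draws = 1
    while rng.random() >= p_jack:
        draws += 1
    return draws

rng = random.Random(0)  # seeded so the run is repeatable
toaster_counts = [count_defective(rng=rng) for _ in range(5)]
jack_draws = [draws_until_jack(rng=rng) for _ in range(5)]

print("defective out of 20:", toaster_counts)
print("draws until first jack:", jack_draws)
```

Every toaster count lies between 0 and 20, while the number of draws until a jack has no upper limit.
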
Finding Binomial Probabilities

Given a discrete random variable X, the probability distribution function (pdf) assigns a probability to each value of X. The command binompdf(n, p, X) calculates the binomial probability of value X.

Given a random variable X, the cumulative distribution function (cdf) of X calculates the sum of the probabilities for 0, 1, 2, ..., up to the value X. That is, it calculates the probability of obtaining at most X successes in n trials. The command binomcdf(n, p, X) calculates the binomial probability that the variable takes on the values of 0 up to (and including) X.

Example: Flipping a Coin
o A fair coin is flipped 6 times.
o Determine the probability that the coin comes up tails exactly 5 times.
o Find the probability that the coin comes up tails at least 1 time.
o Find the probability that the coin comes up tails at most 3 times.
o Construct a pdf table and a pdf histogram for the variable X.
o Construct a cdf table and a cdf histogram for the variable X.

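
For readers without the calculator at hand, the two commands can be mimicked in Python. The functions below are stand-ins written for this sketch: they borrow the names binompdf and binomcdf and the argument order (n, p, X) from the calculator commands, but they are not a library API. They assume the standard binomial formula, with "tails" counted as a success.

```python
from math import comb

def binompdf(n, p, x):
    """P(X = x) for X distributed B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomcdf(n, p, x):
    """P(X <= x): the sum of the pdf over 0, 1, ..., x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

n, p = 6, 0.5  # fair coin flipped 6 times

exactly_5 = binompdf(n, p, 5)       # tails exactly 5 times
at_least_1 = 1 - binomcdf(n, p, 0)  # complement of "no tails"
at_most_3 = binomcdf(n, p, 3)       # tails at most 3 times

pdf_table = [binompdf(n, p, k) for k in range(n + 1)]
cdf_table = [binomcdf(n, p, k) for k in range(n + 1)]

print(exactly_5, at_least_1, at_most_3)  # 0.09375 0.984375 0.65625
for k in range(n + 1):
    print(k, pdf_table[k], cdf_table[k])
```

The "at least 1" case is computed through the complement because the cdf only accumulates from 0 upward.
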
Binomial Formulas

For any positive whole number n, its factorial n! is

$n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1$

and $0! = 1$.

The number of ways of arranging k successes among n observations is given by the binomial coefficient

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

The binomial coefficient is read "n choose k."
Calculator: MATH > PRB > nCr

If X is B(n, p), the possible values of X are 0, 1, 2, ..., n. If k is any one of these values,

$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$

Example: Flipping a Coin
o A fair coin is flipped 6 times.
o Determine the probability that the coin comes up tails exactly 5 times.
o Find the probability that the coin comes up tails at least 1 time.

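
As a sketch of how the formula applies to the coin example, take "tails" as the success, so that n = 6 and p = 1/2 (this labeling of success is an assumption made for the illustration). The "at least 1" case is easiest through the complement rule, P(X ≥ 1) = 1 − P(X = 0).

```latex
\begin{align*}
P(X = 5) &= \binom{6}{5}\left(\tfrac{1}{2}\right)^{5}\left(\tfrac{1}{2}\right)^{1}
          = 6 \cdot \tfrac{1}{64} = \tfrac{3}{32} \approx 0.0938 \\
P(X \ge 1) &= 1 - P(X = 0)
            = 1 - \binom{6}{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{6}
            = 1 - \tfrac{1}{64} = \tfrac{63}{64} \approx 0.9844
\end{align*}
```
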
Binomial Mean and Standard Deviation

If X is B(n, p), then the mean is

$\mu = np$

and the standard deviation is

$\sigma = \sqrt{np(1-p)}$

As the number of trials n gets larger, the binomial distribution gets close to a normal distribution. When n is large, we can use normal probability calculations to approximate hard-to-calculate binomial probabilities.

As a rule of thumb, use the normal approximation when n, p, and q satisfy $np \ge 10$ and $nq \ge 10$. (Note: q is equal to 1 − p.)

The accuracy of the normal approximation improves as the sample size n increases. It is most accurate for any fixed n when p is close to ½ and least accurate when p is near 0 or 1.

Example: IRS
o The Internal Revenue Service estimates that 8% of all taxpayers filling out long forms make mistakes. Suppose that a random sample of 10,000 forms is selected. What is the approximate probability that more than 800 forms have mistakes?

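
The IRS example can be worked through as a short sketch. It assumes the count of forms with mistakes is B(10000, 0.08), approximates it by a normal distribution with the same mean and standard deviation, and applies no continuity correction. The helper normal_upper_tail is a name introduced here for illustration; it uses the complementary error function from Python's math module.

```python
from math import sqrt, erfc

def normal_upper_tail(x, mu, sigma):
    """P(Y > x) for a normal variable Y with mean mu and sd sigma."""
    z = (x - mu) / sigma
    return 0.5 * erfc(z / sqrt(2))

n, p = 10_000, 0.08
q = 1 - p

mu = n * p                # mean: np
sigma = sqrt(n * p * q)   # standard deviation: sqrt(np(1 - p))

# Rule of thumb for using the normal approximation.
rule_ok = n * p >= 10 and n * q >= 10

# Approximate P(X > 800), no continuity correction.
approx = normal_upper_tail(800, mu, sigma)

print("mean:", mu)
print("sd:", round(sigma, 3))
print("rule of thumb satisfied:", rule_ok)
print("P(X > 800) is approximately", approx)
```

Because 800 equals the mean np, the standardized value is z = 0 and the approximation comes out to about one half.
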
Ch 8.2 The Geometric Distributions

The Geometric Setting

A situation where these four conditions are satisfied is called a geometric setting.
1. Each observation falls into one of just two categories, which we call success or failure.
2. The probability of success, call it p, is the same for each observation.
3. The observations are all independent.
4. The variable of interest is the number of trials required to obtain the first success.

If X has a geometric distribution with probability p of success and (1 − p) of failure on each observation, the possible values of X are 1, 2, 3, .... If n is any one of these values, the probability that the first success occurs on the nth trial is

$P(X = n) = (1-p)^{n-1}\, p$

The probability that it takes more than n trials to see the first success is

$P(X > n) = (1-p)^{n}$

The mean, or expected value, of X is $\mu = 1/p$.

Example: Overweight Americans
o A survey conducted by the Harris polling organization discovered that 80% of all Americans are overweight. Suppose that a number of randomly selected Americans are weighed.
o How many Americans would you expect to weigh before you encounter the first overweight individual?
o Find the probability that the fourth person weighed is the first person to be overweight.
o Find the probability that it takes more than 4 people to observe the first overweight person.

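
The three geometric formulas can be checked numerically on the overweight example. This is a minimal sketch with p = 0.8 and "overweight" as the success. The function names geom_pdf, geom_more_than, and geom_mean are hypothetical helpers written for this illustration, not calculator commands or library functions.

```python
def geom_pdf(p, n):
    """P(X = n): first success on trial n, after n - 1 failures."""
    return (1 - p) ** (n - 1) * p

def geom_more_than(p, n):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n

def geom_mean(p):
    """Expected number of trials up to and including the first success."""
    return 1 / p

p = 0.8  # probability that a randomly selected person is overweight

expected_trials = geom_mean(p)       # 1 / 0.8
first_on_4th = geom_pdf(p, 4)        # (0.2)^3 * 0.8
more_than_4 = geom_more_than(p, 4)   # (0.2)^4

print("expected number weighed:", expected_trials)
print("P(X = 4):", round(first_on_4th, 4))
print("P(X > 4):", round(more_than_4, 4))
```

With p = 0.8 the expected count is 1.25 people, P(X = 4) is about 0.0064, and P(X > 4) is about 0.0016.
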