The Binomial Distribution

MATH 382 The Binomial Distribution Dr. Neal, WKU

Suppose there is a fixed probability $p$ of having an occurrence (or "success") on any single attempt, and a sequence of $n$ independent attempts is made. Then the binomial random variable, denoted by $X \sim b(n, p)$, counts the total number of occurrences in these $n$ attempts. There may be anywhere from 0 to $n$ occurrences in the $n$ attempts; thus, Range $X = \{0, 1, \ldots, n\}$.

For example, roll one fair six-sided die 20 times and count the number of fours that are rolled. The probability of rolling a four on any attempt is $p = 1/6$. So $X \sim b(20, 1/6)$. Somewhere from 0 to 20 fours will be rolled, and $X$ counts the exact number of fours rolled in a sequence of 20 rolls.

With $p$ being the probability of success on any attempt, we shall let $q = 1 - p$ be the probability of failure. For instance, if we are trying to draw a Heart from a complete shuffled deck of cards, then $p = 1/4$ and $q = 3/4$.

We note that there are various ways to obtain exactly $k$ successes in $n$ attempts. For example, to have exactly 3 successes in 8 attempts, we must choose which 3 of the 8 tries resulted in the successes:

S S S F F F F F
S S F S F F F F
S S F F S F F F
...
S S F F F F F S
...
F F F F F S S S

In all there are $\binom{8}{3} = 56$ ways to choose 3 success positions out of 8 spots. But each of these 56 individual sequences occurs with probability $p^3 q^5$. In general, there are $\binom{n}{k}$ ways to have exactly $k$ successes in $n$ attempts, and each such individual sequence occurs with probability $p^k q^{n-k}$. Thus the overall probability of obtaining exactly $k$ successes in $n$ attempts is $\binom{n}{k} p^k q^{n-k}$.

Probability Distribution Function

The probability distribution function (or pdf) of $X \sim b(n, p)$ is the probability of having exactly $k$ occurrences in $n$ attempts and is given by

$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad \text{for } k = 0, 1, \ldots, n.$$

By the Binomial Expansion Theorem and the fact that $p + q = 1$, the sum of the probabilities over the range of $X$ equals 1:

$$\sum_{k \in \text{Range } X} P(X = k) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = (p + q)^n = 1^n = 1.$$
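The pdf formula and the sum-to-one property can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `binom_pmf` is an illustrative choice, and the values $n = 20$, $p = 1/6$ are taken from the die-rolling example.

```python
from math import comb, isclose

def binom_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ b(n, p), computed directly from the formula."""
    if not 0 <= k <= n:
        return 0.0
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

# Number of ways to place 3 successes among 8 attempts.
ways = comb(8, 3)

# Probabilities over the whole range {0, 1, ..., n} should add up to 1
# (up to floating-point rounding).
n, p = 20, 1 / 6  # the die-rolling example: count fours in 20 rolls
total = sum(binom_pmf(n, p, k) for k in range(n + 1))

print(ways)
print(isclose(total, 1.0))
```

Floating-point arithmetic is used here for simplicity, so the sum is compared with `isclose` rather than tested for exact equality.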

There is no closed-form formula for the cumulative probability $P(X \le k)$ or for computing probabilities such as $P(j \le X \le k)$. In each case, the individual probabilities must be summed, or we must use a calculator/computer command:

$$P(X \le k) = \sum_{i=0}^{k} P(X = i) = \sum_{i=0}^{k} \binom{n}{i} p^i q^{n-i} \quad \text{and} \quad P(j \le X \le k) = \sum_{i=j}^{k} \binom{n}{i} p^i q^{n-i}.$$

Example 1. Draw a single card at random from a shuffled deck over and over, each time with replacement and re-shuffling. Do so for a total of 8 draws.
(a) What is the probability of exactly 3 Hearts being drawn?
(b) What is the probability of at most 3 Hearts being drawn?
(c) What is the probability of at least 2 Hearts being drawn?

Solution. The probability of drawing a Heart on any attempt is $p = 1/4$. Because we have replacement and re-shuffling of cards, the 8 successive draws are independent of each other. So we have a binomial distribution and we may let $X \sim b(8, 1/4)$ count the number of Hearts drawn in the 8 attempts.

(a) We now have

$$P(X = 3) = \binom{8}{3} \left(\frac{1}{4}\right)^3 \left(\frac{3}{4}\right)^5 \approx 0.2076416.$$

(b) Also,

$$\begin{aligned}
P(X \le 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= \binom{8}{0}\left(\frac{1}{4}\right)^0\left(\frac{3}{4}\right)^8 + \binom{8}{1}\left(\frac{1}{4}\right)^1\left(\frac{3}{4}\right)^7 + \binom{8}{2}\left(\frac{1}{4}\right)^2\left(\frac{3}{4}\right)^6 + \binom{8}{3}\left(\frac{1}{4}\right)^3\left(\frac{3}{4}\right)^5 \\
&= \left(\frac{3}{4}\right)^8 + 8\left(\frac{1}{4}\right)\left(\frac{3}{4}\right)^7 + 28\left(\frac{1}{4}\right)^2\left(\frac{3}{4}\right)^6 + 56\left(\frac{1}{4}\right)^3\left(\frac{3}{4}\right)^5 \approx 0.8861847.
\end{aligned}$$

(c) Finally, the probability of at least 2 Hearts being drawn is

$$P(X \ge 2) = 1 - P(X \le 1) = 1 - P(X = 0) - P(X = 1) = 1 - \binom{8}{0}\left(\frac{1}{4}\right)^0\left(\frac{3}{4}\right)^8 - \binom{8}{1}\left(\frac{1}{4}\right)^1\left(\frac{3}{4}\right)^7 \approx 0.63292.$$

Expected Value, Variance, and Standard Deviation

The expected value (or mean $\mu$) of the binomial distribution is the average number of occurrences in repeated sequences of $n$ attempts. This value is easily computed by $\mu = E[X] = np$. We shall derive it in two ways.
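The interpretation of the mean as an average over repeated sequences of $n$ attempts can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than anything from the notes: the helper `simulate_binomial`, the seed, and the number of trials are arbitrary choices, and the parameters $n = 8$, $p = 1/4$ are borrowed from Example 1.

```python
import random

def simulate_binomial(n: int, p: float, rng: random.Random) -> int:
    """One draw of X ~ b(n, p): count successes in n independent attempts."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(382)  # fixed seed (arbitrary) so the run is repeatable
n, p = 8, 1 / 4           # Example 1: Hearts in 8 draws with replacement
trials = 100_000          # number of repeated sequences (arbitrary)

draws = [simulate_binomial(n, p, rng) for _ in range(trials)]
sample_mean = sum(draws) / trials
freq_exactly_3 = draws.count(3) / trials

print(sample_mean)      # should be near n * p = 2
print(freq_exactly_3)   # should be near P(X = 3) from Example 1(a)
```

A simulation only approximates these values; the agreement improves as the number of trials grows, while the exact results come from the formulas.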

Theorem 1. Let $X \sim b(n, p)$. Then $E[X] = np$.

Proof 1. Using the direct definition of expected value of a discrete random variable, we have

$$\begin{aligned}
E[X] &= \sum_{k \in \text{Range } X} k \, P(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} \\
&= \sum_{k=1}^{n} k \, \frac{n!}{k!\,(n-k)!} \, p^k q^{n-k} \quad \text{(the } k = 0 \text{ term is zero)} \\
&= \sum_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!} \, p^k q^{n-k} \\
&= np \sum_{k=1}^{n} \frac{(n-1)!}{(k-1)!\,(n-k)!} \, p^{k-1} q^{n-k} \\
&= np \sum_{k=0}^{n-1} \frac{(n-1)!}{k!\,(n-k-1)!} \, p^{k} q^{n-k-1} \\
&= np \sum_{k=0}^{n-1} \binom{n-1}{k} p^{k} q^{n-1-k} \\
&= np \sum_{k=0}^{n-1} P\big(b(n-1, p) = k\big) = np(1) = np.
\end{aligned}$$

Proof 2. For $i = 1, \ldots, n$, let $X_i$ be a Bernoulli random variable that is 1 if there is a success on the $i$th attempt or 0 if there is a failure on the $i$th attempt. Then the $X_i$ are independent of each other (due to independent attempts) and the total number of successes in $n$ attempts is given by $X_1 + X_2 + \cdots + X_n = X \sim b(n, p)$. Moreover, $E[X_i] = 1 \cdot p + 0 \cdot q = p$ for $i = 1, \ldots, n$. By the linearity of expected value, we then have

$$E[X] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + E[X_2] + \cdots + E[X_n] = p + p + \cdots + p = np.$$

Using the set-up from this second proof, we can easily derive the variance of $X$ as well:

Theorem 2. Let $X \sim b(n, p)$. Then $\mathrm{Var}(X) = npq$.

Proof. Again, for $i = 1, \ldots, n$, let $X_i$ be independent Bernoulli random variables that are 1 if there is a success on the $i$th attempt or 0 if there is a failure on the $i$th attempt. Then

$$\mathrm{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = (1^2 \cdot p + 0^2 \cdot q) - p^2 = p - p^2 = p(1 - p) = pq.$$

Then by independence of the $X_i$, we have

$$\mathrm{Var}(X) = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = pq + pq + \cdots + pq = npq.$$

So for $X \sim b(n, p)$, the standard deviation is given by $\sigma_X = \sqrt{npq}$.
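Both theorems can be confirmed for a specific case by summing over the pdf directly. The Python sketch below uses exact rational arithmetic so the comparisons are true equalities; the function name `pmf` and the values $n = 10$, $p = 1/4$ are illustrative assumptions, not part of the proofs.

```python
from fractions import Fraction
from math import comb, sqrt

def pmf(n: int, p: Fraction, k: int) -> Fraction:
    """Exact P(X = k) for X ~ b(n, p) using rational arithmetic."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, Fraction(1, 4)   # illustrative values, not taken from the theorems
q = 1 - p

# E[X] and E[X^2] from the definition of expected value.
mean = sum(k * pmf(n, p, k) for k in range(n + 1))
second_moment = sum(k**2 * pmf(n, p, k) for k in range(n + 1))
variance = second_moment - mean**2

print(mean == n * p)           # Theorem 1: E[X] = n p
print(variance == n * p * q)   # Theorem 2: Var(X) = n p q
print(sqrt(float(variance)))   # standard deviation, sqrt(n p q)
```

A check like this only confirms one pair $(n, p)$; the proofs above are what establish the identities in general.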

Mode

The mode is the most likely number of occurrences in $n$ attempts. This value is always given by the largest integer $k$ such that $k \le (n + 1)p$, denoted by $k = \lfloor (n + 1)p \rfloor$. However, when $(n + 1)p$ is actually an integer, then the binomial distribution will be bimodal (i.e., it will have two modes). In this case, this value of $k = (n + 1)p$ and the previous integer $k - 1$ will be the two modes.

To derive this expression for the mode, we let $a_k = P(X = k) = \binom{n}{k} p^k q^{n-k}$. We then wish to find the largest integer $k$ such that $a_{k-1} \le a_k$. (We then have $a_k > a_{k+1} > a_{k+2} > \cdots > a_n$, making $k$ have the highest probability value $a_k$.) We now consider the ratio $a_k / a_{k-1}$. To have $a_{k-1} \le a_k$, we must have $1 \le a_k / a_{k-1}$, which holds if and only if

$$1 \le \frac{a_k}{a_{k-1}} = \frac{\dfrac{n!}{k!\,(n-k)!} \, p^k q^{n-k}}{\dfrac{n!}{(k-1)!\,(n-k+1)!} \, p^{k-1} q^{n-k+1}} = \frac{n - k + 1}{k} \cdot \frac{p}{q}.$$

This result holds if and only if $k(1 - p) \le (n - k + 1)p$, which is equivalent to $k \le (n + 1)p$. Note that we have $a_{k-1} = a_k$ if and only if $k = (n + 1)p$, which can happen only when $(n + 1)p$ is an integer.

Example 2. Draw a single card from a deck, with replacement and re-shuffling for a total of 10 draws, and count the number of Hearts drawn. For 10 draws, what are the average and standard deviation of the number of Hearts drawn? What is the most likely number of Hearts drawn?

Solution. Here $X \sim b(10, 1/4)$ counts the number of Hearts drawn in a sequence of 10 independent draws. On average, we would draw $E[X] = 10 \times 1/4 = 2.5$ Hearts, with a standard deviation of $\sigma_X = \sqrt{10 \times 1/4 \times 3/4} \approx 1.37$. The most likely number of Hearts in 10 draws is $k = \lfloor 11 \times 1/4 \rfloor = \lfloor 2.75 \rfloor = 2$. Moreover,

$$P(X = 2) = \binom{10}{2} \left(\frac{1}{4}\right)^2 \left(\frac{3}{4}\right)^8 \approx 0.2815676.$$

Example 3. Suppose we roll two dice 44 times and we count the number of times that we roll a sum of either 7 or 11.
(a) What is the most likely number of rolls that will be either a 7 or 11?
(b) What is the probability of there being (i) at most 10 rolls that are either a 7 or 11? (ii) at least 10 rolls that are either a 7 or 11?
(c) What is the probability that the number of times we roll a 7 or an 11 is within one standard deviation of average?

Solution. On each roll, the probability of getting a sum of either a 7 or 11 is $p = 8/36 = 2/9$. So $X \sim b(44, 2/9)$ counts the number of times we do so in 44 rolls.

(a) In 44 rolls, the most likely number of times a 7 or 11 will be rolled is $k = \lfloor 45 \times 2/9 \rfloor = \lfloor 10 \rfloor = 10$. Because $(n + 1)p = 10$ is an integer, $X$ is bimodal and it is most likely and equally likely that a 7 or 11 is rolled either 9 times or 10 times. Moreover,

$$P(X = 9) = \binom{44}{9} \left(\frac{2}{9}\right)^9 \left(\frac{7}{9}\right)^{35} = \binom{44}{10} \left(\frac{2}{9}\right)^{10} \left(\frac{7}{9}\right)^{34} = P(X = 10) \approx 0.141786.$$

(b) Next,

$$P(X \le 10) = \sum_{k=0}^{10} \binom{44}{k} \left(\frac{2}{9}\right)^k \left(\frac{7}{9}\right)^{44-k} \approx 0.6152,$$

and

$$P(X \ge 10) = 1 - P(X \le 9) = 1 - \sum_{k=0}^{9} \binom{44}{k} \left(\frac{2}{9}\right)^k \left(\frac{7}{9}\right)^{44-k} \approx 0.52658.$$

(c) The average number of rolls resulting in a 7 or 11 is $\mu = 44 \times 2/9 = 9.777\ldots$ with a standard deviation of $\sigma = \sqrt{44 \times 2/9 \times 7/9} \approx 2.7577$. So the range $\mu \pm \sigma$ is from about 7.02 to 12.535. This range does not include 7, so the possible values in this range are 8, 9, 10, 11, 12. Thus,

$$P(\mu - \sigma \le X \le \mu + \sigma) = P(8 \le X \le 12) = \sum_{k=8}^{12} \binom{44}{k} \left(\frac{2}{9}\right)^k \left(\frac{7}{9}\right)^{44-k} \approx 0.6312.$$

Calculator Commands

Binomial probability values can be computed with the built-in binompdf( and binomcdf( TI commands from the DISTR menu:

$P(X = k)$ = binompdf(n,p,k)
$P(X \le k)$ = binomcdf(n,p,k)
$P(j \le X \le k) = P(X \le k) - P(X \le j - 1)$ = binomcdf(n,p,k) - binomcdf(n,p,j-1)
$P(X \ge k) = 1 - P(X \le k - 1)$ = 1 - binomcdf(n,p,k-1)

For $X \sim b(44, 2/9)$,
(a) $P(X = 10)$ = binompdf(44,2/9,10) ≈ 0.1417864
(b) $P(X \le 10)$ = binomcdf(44,2/9,10) ≈ 0.6152
    $P(X \ge 10)$ = 1 - binomcdf(44,2/9,9) ≈ 0.52658
(c) $P(8 \le X \le 12)$ = binomcdf(44,2/9,12) - binomcdf(44,2/9,7) ≈ 0.6312
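For readers without a TI calculator, the same quantities can be computed in Python. This is a minimal sketch under stated assumptions: the functions `binompdf` and `binomcdf` are defined here and named only to mirror the TI commands (they do not come from any library), and `Fraction` is used so that $(n + 1)p$ is evaluated exactly when testing for two modes.

```python
from fractions import Fraction
from math import comb, floor, sqrt

def binompdf(n, p, k):
    """P(X = k) for X ~ b(n, p); named after the TI command for comparison."""
    if not 0 <= k <= n:
        return 0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k), obtained by summing the individual probabilities."""
    return sum(binompdf(n, p, i) for i in range(0, min(k, n) + 1))

n, p = 44, Fraction(2, 9)    # Example 3; Fraction keeps (n + 1) p exact

mode = floor((n + 1) * p)        # largest integer k with k <= (n + 1) p
bimodal = (n + 1) * p == mode    # True when (n + 1) p is an integer

mu = float(n * p)
sigma = sqrt(float(n * p * (1 - p)))

p_eq_10 = float(binompdf(n, p, 10))
p_le_10 = float(binomcdf(n, p, 10))
p_ge_10 = float(1 - binomcdf(n, p, 9))
p_8_to_12 = float(binomcdf(n, p, 12) - binomcdf(n, p, 7))

print(mode, bimodal)
print(mu, sigma)
print(p_eq_10, p_le_10, p_ge_10, p_8_to_12)
```

With floating-point `p` instead of `Fraction`, the probabilities would be essentially the same, but the integer test for bimodality could fail because of rounding in $(n + 1)p$.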

Mathematica Commands

We can also use Mathematica commands to compute binomial probability values:

X=BinomialDistribution[44,2/9];
PDF[X,10]
CDF[X,10]
Mean[X]
StandardDeviation[X]

Notes:
(i) The binomial distribution models sampling with replacement. For example, when drawing a card from a deck, if we always replace the card drawn before redrawing, then we create independent draws having the same probability of success. That is, we create a binomial distribution. If we do not replace cards between draws, then we have dependent draws with differing probabilities of success, and we do not have a binomial distribution.
(ii) A Bernoulli distribution $X$ is a special case of a binomial distribution having exactly 1 attempt. That is, $X \sim b(1, p)$.

Exercises

1. Draw a single card from a deck, with replacement and re-shuffling for a total of 12 draws, and count the number of Face cards (J, Q, K) drawn.
(a) What distribution is created in counting the number of Face cards drawn? What are the average and standard deviation of the number of Face cards drawn?
(b) What is the most likely number of Face cards drawn, and what is the probability of drawing this number of Face cards?
(c) What is the probability of drawing at least 3 Face cards?
(d) Given that you draw at least 3 Face cards, what is the probability that you draw at least 4 Face cards?

2. Roll two dice. A win is a sum of 2, 3, or 12. Let $X$ count the number of wins in 24 rolls.
(a) What distribution is created? What is the most likely number of wins?
(b) What is the probability of at most 3 wins?
(c) What is the probability that the number of wins is within one standard deviation of average?