Lecture 4: The binomial distribution

4th of November 2015

Combination and permutation (recap)
Consider 7 students applying to a college for 3 places: Abi, Ben, Claire, Dave, Emma, Frank, Gail.
How many ways are there of choosing 3 students from 7 when
1. order is important (permutation), i.e. (Abi, Dave) $\neq$ (Dave, Abi)
2. order is not important (combination), i.e. (Abi, Dave) = (Dave, Abi)

Permutations
There are 3 places to fill at the college and 7 students applied. There will be $7 \times 6 \times 5$ permutations:
$$7 \times 6 \times 5 = \frac{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{4 \times 3 \times 2 \times 1} = \frac{7!}{4!} = {}^{7}P_{3}.$$
In general, the number of permutations of $r$ objects from $n$ is
$${}^{n}P_{r} = \frac{n!}{(n-r)!}.$$

Combinations
For every combination of 3 students, there will be $3! = 6$ permutations. For example, the combination Dave, Claire and Abi gives rise to the permutations
ACD ADC CAD CDA DAC DCA
If we are not interested in the order, the number of combinations of 3 students among 7 is then
$${}^{7}C_{3} = \frac{{}^{7}P_{3}}{3!}.$$
In general, the number of combinations of $r$ objects from $n$ is
$${}^{n}C_{r} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{(n-r)!\,r!}.$$
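The two counting formulas can be checked numerically. The sketch below is only an illustration, not part of the original slides: it uses Python's standard `math.perm` and `math.comb` (available from Python 3.8) and cross-checks them by listing the selections with `itertools`. The variable names are made up for this example.

```python
import math
from itertools import combinations, permutations

students = ["Abi", "Ben", "Claire", "Dave", "Emma", "Frank", "Gail"]

# Ordered selections of 3 students from 7: n! / (n - r)!
n_perm = math.perm(7, 3)
# Unordered selections of 3 students from 7: n! / ((n - r)! r!)
n_comb = math.comb(7, 3)

# Cross-check the formulas by explicitly listing every selection.
listed_perm = len(list(permutations(students, 3)))
listed_comb = len(list(combinations(students, 3)))

print(n_perm, n_comb)  # 210 35
```

Each combination corresponds to 3! = 6 permutations, which is why 210 / 6 = 35.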

An example of the Binomial distribution
An unfair coin: $P(\text{Head}) = 2/3$ and $P(\text{Tail}) = 1/3$.
Let $X$ = number of heads observed in 5 coin tosses.
$X$ can take on any of the values 0, 1, 2, 3, 4, 5.
$X$ is a discrete random variable.
Some values of $X$ will be more likely to occur than others.
Each value of $X$ will have a probability of occurring. What are these probabilities?

What is $P(X = 1)$?
One possible way of observing Head once is if we observe the pattern HTTTT. The probability of obtaining this pattern is
$$P(\text{HTTTT}) = \frac{2}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3}.$$

There are 32 possible patterns of Heads and Tails we might observe.
HHHHH
THHHH HTHHH HHTHH HHHTH HHHHT
TTHHH THTHH THHTH THHHT HTTHH HTHTH HTHHT HHTTH HHTHT HHHTT
TTTHH TTHTH TTHHT THTTH THTHT THHTT HTTTH HTTHT HTHTT HHTTT
HTTTT THTTT TTHTT TTTHT TTTTH
TTTTT
Five of the patterns contain just one Head. All 5 of these patterns have the same probability as HTTTT, so the probability of obtaining one head in 5 coin tosses is
$$P(X = 1) = 5 \times \frac{2}{3} \times \left(\frac{1}{3}\right)^{4} \approx 0.0412.$$
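The counting argument can be reproduced by brute force. The following sketch is an illustration under the slide's assumptions (independent tosses, P(Head) = 2/3): it lists all 32 patterns, keeps those with exactly one Head, and adds up their probabilities using exact fractions. The helper name `pattern_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

p_head = Fraction(2, 3)
p_tail = 1 - p_head

# All sequences of 5 tosses, e.g. "HTTTT".
patterns = ["".join(tosses) for tosses in product("HT", repeat=5)]
one_head = [s for s in patterns if s.count("H") == 1]

def pattern_probability(pattern):
    """Probability of one specific pattern, multiplying toss by toss."""
    prob = Fraction(1)
    for toss in pattern:
        prob *= p_head if toss == "H" else p_tail
    return prob

# P(X = 1) is the sum over the patterns that contain exactly one Head.
p_x1 = sum(pattern_probability(s) for s in one_head)

print(len(patterns), len(one_head), p_x1, float(p_x1))
```

The exact result is 10/243, which rounds to 0.0412.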

What about $P(X = 2)$?
This probability can be written as
$$P(X = 2) = (\text{No. of patterns}) \times (\text{Probability of pattern}) = {}^{5}C_{2} \left(\frac{2}{3}\right)^{2} \left(\frac{1}{3}\right)^{3} = 10 \times \frac{4}{243} \approx 0.165.$$
In general, the probability of observing $x$ Heads (and $5 - x$ Tails) is
$$P(X = x) = {}^{5}C_{x} \left(\frac{2}{3}\right)^{x} \left(\frac{1}{3}\right)^{5-x}.$$

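We can use this formula to tabulate the probabilities of each possible value of $X$.
$P(X = 0) = {}^{5}C_{0} \left(\frac{2}{3}\right)^{0} \left(\frac{1}{3}\right)^{5} \approx 0.0041$
$P(X = 1) = {}^{5}C_{1} \left(\frac{2}{3}\right)^{1} \left(\frac{1}{3}\right)^{4} \approx 0.0412$
$P(X = 2) = {}^{5}C_{2} \left(\frac{2}{3}\right)^{2} \left(\frac{1}{3}\right)^{3} \approx 0.1646$
$P(X = 3) = {}^{5}C_{3} \left(\frac{2}{3}\right)^{3} \left(\frac{1}{3}\right)^{2} \approx 0.3292$
$P(X = 4) = {}^{5}C_{4} \left(\frac{2}{3}\right)^{4} \left(\frac{1}{3}\right)^{1} \approx 0.3292$
$P(X = 5) = {}^{5}C_{5} \left(\frac{2}{3}\right)^{5} \left(\frac{1}{3}\right)^{0} \approx 0.1317$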
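The table can be regenerated with a few lines of code. This is a minimal sketch rather than anything from the slides; the function name `binomial_pmf` is chosen for this example only.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 2 / 3
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in table.items():
    print(f"P(X = {x}) = {prob:.4f}")

# The six probabilities cover every possible outcome, so they sum to 1.
total = sum(table.values())
```

Checking that the probabilities sum to 1 is a quick way to catch an arithmetic slip in a hand-computed table.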

Distribution of probabilities across the possible values of $X$.
[Figure: bar chart of $P(X)$ against $X = 0, 1, \ldots, 5$]
This situation is a specific example of a Binomial distribution.

Key components of the binomial distribution
In general a Binomial distribution arises when we have the following 4 conditions:
- $n$ identical trials, e.g. 5 coin tosses
- 2 possible outcomes for each trial, "success" and "failure", e.g. Heads or Tails
- Trials are independent, e.g. each coin toss doesn't affect the others
- $P(\text{success}) = p$ is the same for each trial, e.g. $P(\text{Head}) = 2/3$ is the same for each trial

The binomial distribution
If we have the above 4 conditions and we let $X$ = number of successes, then the probability of observing $x$ successes out of $n$ trials is given by
$$P(X = x) = {}^{n}C_{x}\, p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n.$$
If the probabilities of $X$ are distributed in this way, we write
$$X \sim \text{Bin}(n, p).$$
$n$ and $p$ are called the parameters of the distribution. We say $X$ follows a binomial distribution with parameters $n$ and $p$.

Example 1
Suppose $X \sim \text{Bin}(10, 0.4)$, what is $P(X = 7)$?
Here we have $n = 10$, $p = 0.4$, $x = 7$:
$$P(X = 7) = {}^{10}C_{7} (0.4)^{7} (1 - 0.4)^{10-7} = (120)(0.4)^{7}(0.6)^{3} \approx 0.0425.$$

Example 2
Suppose $Y \sim \text{Bin}(8, 0.15)$, what is $P(Y < 3)$?
Here we have $n = 8$, $p = 0.15$. Note that $1 - p = 0.85$.
$$\begin{aligned}
P(Y < 3) &= P(Y = 0) + P(Y = 1) + P(Y = 2) \\
&= {}^{8}C_{0} (0.15)^{0} (0.85)^{8} + {}^{8}C_{1} (0.15)^{1} (0.85)^{7} + {}^{8}C_{2} (0.15)^{2} (0.85)^{6} \\
&\approx 0.2725 + 0.3847 + 0.2376 \\
&\approx 0.8948
\end{aligned}$$

Example 3
Suppose $W \sim \text{Bin}(50, 0.12)$, what is $P(W > 2)$?
Here we have $n = 50$, $p = 0.12$. Note that $1 - p = 0.88$.
$$\begin{aligned}
P(W > 2) &= P(W = 3) + P(W = 4) + \ldots + P(W = 50) \\
&= 1 - P(W \leq 2) \\
&= 1 - \big(P(W = 0) + P(W = 1) + P(W = 2)\big) \\
&= 1 - \big({}^{50}C_{0} (0.12)^{0} (0.88)^{50} + {}^{50}C_{1} (0.12)^{1} (0.88)^{49} + {}^{50}C_{2} (0.12)^{2} (0.88)^{48}\big) \\
&\approx 1 - (0.00168 + 0.01142 + 0.03817) \\
&\approx 0.94874
\end{aligned}$$
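The three worked examples follow the same recipe: sum the binomial probabilities over the values of interest, using the complement when the upper tail is long. A possible sketch of that recipe is given below; `binomial_pmf` and `binomial_cdf` are names invented for this illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x), obtained by adding P(X = 0), ..., P(X = x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# Example 1: P(X = 7) for X ~ Bin(10, 0.4)
ex1 = binomial_pmf(7, 10, 0.4)
# Example 2: P(Y < 3) = P(Y <= 2) for Y ~ Bin(8, 0.15)
ex2 = binomial_cdf(2, 8, 0.15)
# Example 3: P(W > 2) = 1 - P(W <= 2) for W ~ Bin(50, 0.12)
ex3 = 1 - binomial_cdf(2, 50, 0.12)

print(f"{ex1:.4f} {ex2:.4f} {ex3:.4f}")
```

In Example 3 the complement needs 3 terms instead of 48, which is the reason for rewriting $P(W > 2)$ as $1 - P(W \leq 2)$.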

Different values of $n$ and $p$ lead to different distributions with different shapes:
[Figure: three bar charts of $P(X)$ against $X = 0, \ldots, 10$, for $n = 10$ with $p = 0.5$, $p = 0.1$ and $p = 0.7$]
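To get a feel for how the shape changes, the probabilities for the three parameter settings can be printed as crude text bar charts. This is only a rough stand-in for the plots, written for this note; `binomial_pmf`, `text_bars` and the bar width of 50 characters are arbitrary choices, not part of the lecture.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def text_bars(n, p, width=50):
    """One line per value of x: the value, its probability, and a bar."""
    lines = []
    for x in range(n + 1):
        prob = binomial_pmf(x, n, p)
        lines.append(f"{x:2d} {prob:.3f} " + "#" * round(prob * width))
    return lines

for p in (0.5, 0.1, 0.7):
    print(f"n=10 p={p}")
    print("\n".join(text_bars(10, p)))
```

With p = 0.5 the bars are symmetric around 5, with p = 0.1 they pile up near 0, and with p = 0.7 the peak moves to 7.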

Expected mean and expected standard deviation
We have seen in the first lecture that the sample mean and standard deviation can be used to summarize the shape of a dataset.
In the case of a probability distribution we have no data as such, so we must use the probabilities to calculate the expected mean and standard deviation.

Example: $X \sim \text{Bin}(5, 2/3)$
Consider the example of the Binomial distribution we saw above:
x          0      1      2      3      4      5
P(X = x)   0.004  0.041  0.165  0.329  0.329  0.132
The expected mean value of the distribution, denoted $\mu$, can be calculated as
$$\mu = 0(0.004) + 1(0.041) + 2(0.165) + 3(0.329) + 4(0.329) + 5(0.132) \approx 3.333.$$

Expected mean and expected standard deviation
In general, there is a formula for the mean of a Binomial distribution. There is also a formula for the standard deviation, $\sigma$.
If $X \sim \text{Bin}(n, p)$ then
$$\mu = np, \qquad \sigma = \sqrt{npq}, \qquad \text{where } q = 1 - p.$$
In the example above, $X \sim \text{Bin}(5, 2/3)$ and so the mean and standard deviation are given by
$$\mu = np = 5 \times \frac{2}{3} \approx 3.333 \quad \text{and} \quad \sigma = \sqrt{npq} = \sqrt{5 \times \frac{2}{3} \times \frac{1}{3}} = \sqrt{1.111} \approx 1.054.$$
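The formulas $\mu = np$ and $\sigma = \sqrt{npq}$ can be compared against a direct calculation from the probabilities. The sketch below assumes the usual definition of the variance as the probability-weighted average of squared deviations from the mean, which the slides do not spell out; all variable names are illustrative.

```python
from math import comb, sqrt

n, p = 5, 2 / 3
q = 1 - p

# Probabilities P(X = 0), ..., P(X = 5) for X ~ Bin(5, 2/3).
pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

# Expected mean: each value weighted by its probability.
mean = sum(x * px for x, px in enumerate(pmf))
# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
sd = sqrt(variance)

print(f"mean = {mean:.3f}, n*p = {n * p:.3f}")
print(f"sd = {sd:.3f}, sqrt(n*p*q) = {sqrt(n * p * q):.3f}")
```

Note that $npq \approx 1.111$ is the variance; the standard deviation is its square root, about 1.054.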

Testing a hypothesis using the Binomial distribution
An example
Consider the following simple situation: You have a six-sided die, and you have the impression that it's somehow been weighted so that the number 1 comes up more frequently than it should.
How would you decide whether this impression is correct?

You could do a careful experiment, where you roll the die 60 times, and count how often the 1 comes up.
Suppose you do the experiment, and the 1 comes up 30 times (and other numbers come up 30 times all together).
If the die is unbiased, you expect the 1 to come up one time in six, i.e. 10 times. Therefore 30 times seems high. But is it too high?
There are two possible hypotheses:
1. The die is biased.
2. Just by chance we got more 1s than expected.
How do we decide between these possibilities?

Perform a hypothesis test.
Hypothesis: The die is fair. All 6 outcomes have the same probability.
Experiment: We roll the die 60 times.
Sample: We obtain 60 outcomes and the 1 comes up 30 times.
Assuming our hypothesis is true, the experiment we carried out satisfies the conditions of the Binomial distribution:
- $n$ identical trials, i.e. 60 die rolls.
- 2 possible outcomes for each trial: "1" and "not 1".
- Trials are independent.
- $P(\text{success}) = 1/6$ is the same for each trial.

We define $X$ = number of 1s that come up. We observed $X = 30$.
We can calculate the probability of observing $X = 30$ if our hypothesis is true, i.e. if $X \sim \text{Bin}(60, 1/6)$:
$$P(X = 30) = {}^{60}C_{30} \left(\frac{1}{6}\right)^{30} \left(\frac{5}{6}\right)^{60-30} \approx 2.25 \times 10^{-9}.$$
Conclusion: Under the hypothesis that the die is fair, the probability that the 1 comes up exactly 30 times in this experiment is very low. Therefore we may conclude that the die is biased.
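A probability this small is easy to get wrong by hand, so it may be worth recomputing. The sketch below evaluates $P(X = 30)$ for $X \sim \text{Bin}(60, 1/6)$ with exact fractions and converts to a decimal only at the end; the variable names are chosen for this illustration.

```python
from fractions import Fraction
from math import comb

n, observed = 60, 30
p = Fraction(1, 6)  # probability of rolling a 1 with a fair die

# Exact value of 60C30 * (1/6)^30 * (5/6)^30.
p_exact = comb(n, observed) * p**observed * (1 - p) ** (n - observed)

# Number of 1s expected in 60 rolls of a fair die.
expected_ones = n * p

print(f"P(X = 30) = {float(p_exact):.3e}")
print(f"expected number of 1s = {expected_ones}")
```

The printed probability is of the order of $10^{-9}$, in line with the value quoted above.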

Hypothesis testing
Now we summarise the general approach:
- posit a hypothesis
- design and carry out an experiment to collect a sample of data
- test to see if the sample is consistent with the hypothesis
Testing the hypothesis: Assuming our hypothesis is true, what is the probability that we would have observed such a sample or a sample more extreme? That is, is our sample quite unlikely to have occurred under the assumptions of our hypothesis?

Example: Drug efficiency
Until recently an average of 60 out of 100 patients have survived a particular severe infection. When a new drug was administered to 15 patients with the infection, 12 of them survived. Does this provide evidence that the new drug is effective?

Hypothesis: The drug is not effective, i.e. the probability of surviving is still $p = 0.6$.
Experiment: We test the drug on 15 patients with the infection.
Sample: 12 patients survived.
Let $X$ denote the number of patients who survived. Under our hypothesis, $X \sim \text{Bin}(15, 0.6)$.
We compute the probability that we would have observed such a sample assuming our hypothesis is true:
$$P(X = 12) = {}^{15}C_{12} (0.6)^{12} (0.4)^{15-12} \approx 0.063.$$
There is more than a 6% chance of observing such a number of surviving patients if the drug is not effective. Therefore it may be just by chance that we observe such a number of patients who survived.
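The general approach asks for the probability of "such a sample or a sample more extreme". For the drug example, one reading of "more extreme" is 12 or more survivors out of 15; that interpretation is an assumption of this sketch, as are the function and variable names.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p, observed = 15, 0.6, 12

# Probability of exactly 12 survivors if the drug has no effect.
p_exact = binomial_pmf(observed, n, p)
# Probability of 12 or more survivors: P(X = 12) + ... + P(X = 15).
p_at_least = sum(binomial_pmf(k, n, p) for k in range(observed, n + 1))

print(f"P(X = 12)  = {p_exact:.4f}")
print(f"P(X >= 12) = {p_at_least:.4f}")
```

Including the more extreme outcomes raises the probability to about 0.09, so the conclusion is unchanged: a result like this could plausibly arise by chance even if the drug is not effective.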