Binomial Coefficient

This short text is a set of notes about the binomial coefficients, which link together algebra, combinatorics, sets, binary numbers and probability.

The Product Rule

Suppose you are at a restaurant, and the menu offers 5 starters and 10 main courses. How many different meals could be ordered? Each starter could be combined with each main, so there are 5 × 10 = 50 combinations.

In general, if something consists of one of m choices combined with one of n choices, there are m × n possibilities.

Making Choices

Suppose we choose 2 things from 4. For example, if the four things are A, B, C and D, we could choose A and B. Or B and C. Or B and D. What are all the choices?

The first thing is either A, B, C or D. Having chosen it, there will be 3 possibilities for the second. So we can have

A followed by B, C or D
B followed by A, C or D
C followed by A, B or D
D followed by A, B or C

So that's AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB and DC. There are 4 choices for the first letter and 3 for the second. So that is a total of 4 × 3 = 12 choices.

But these 12 include things like AB and BA - the same thing in a different order. These come in pairs. So the number of choices ignoring the order is 12/2 = 6. These are AB, AC, AD, BC, BD and CD.

So there are 6 choices of 2 things from 4, ignoring the order.

Combinations and permutations

A combination is a set with no order, so ABC is the same combination as BAC or ACB.

A permutation is a set in a certain order. The same elements in a different order are a different permutation - so ABC and BAC are different permutations.

How many permutations of ABC are there? The first item can be A or B or C. For each, we have 2 choices of the second. That only leaves 1 letter, so we have no choice of that. In other words:

A followed by BC or CB
B followed by AC or CA
C followed by AB or BA

So there are 6 permutations of 3 items.

In general, how many permutations of k items are there? We can choose the first item in k ways. That leaves k-1, so we can choose the second in k-1 ways. The third can be chosen in k-2 ways. The i-th can be chosen in k-i+1 ways. The last item (the k-th) can be chosen in k-k+1 = 1 way. So the total number of choices is

$$k \cdot (k-1) \cdot (k-2) \cdots 1$$

This is usually written as $k!$ (said as 'k factorial').

 k   k!
 1   1
 2   2
 3   6
 4   24
 5   120
 6   720
 7   5040
 8   40320
 9   362880
10   3628800
11   39916800
12   479001600

So there are $k!$ permutations of k items. The factorial function increases very rapidly, as shown.

Choosing k things from n

We can choose the first thing in n ways, the second in n-1, the third in n-2, and the k-th in n-k+1. So we have

$$n(n-1)(n-2)\cdots(n-k+1) = \frac{n(n-1)(n-2)\cdots 1}{(n-k)(n-k-1)(n-k-2)\cdots 1} = \frac{n!}{(n-k)!}$$

choices.

These are permutations - the count includes pairs where, for example, the first is A and the second B, together with B for the first and A for the second. The number of combinations is smaller. There are $k!$ permutations of k things, so each combination has been counted $k!$ times, and the number of combinations is

$$\frac{n!}{k!\,(n-k)!}$$

This is called a binomial coefficient (the reason is explained below). It is sometimes written as $\binom{n}{k}$ or as ${}^nC_k$.

We usually have $0 \le n$ and $0 \le k \le n$ (and we take $0!$ as 1).

n \ k   0   1   2   3   4   5
0       1
1       1   1
2       1   2   1
3       1   3   3   1
4       1   4   6   4   1
5       1   5  10  10   5   1

Binomials

A binomial is an expression of the form $(x+y)^n$ where n is a non-negative integer. x and y might be variables or constants. Bi-nomial means two names - these are the x and y.

For example, $(x+y)^2 = x^2 + 2xy + y^2$ is a binomial, expanded.

We can expand the first few binomials:

n=0: $1$ (anything to the power 0 is 1)
n=1: $x + y$
n=2: $x^2 + 2xy + y^2$
n=3: $x^3 + 3x^2y + 3xy^2 + y^3$
n=4: $x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$
n=5: $x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + y^5$

What patterns are there? We get a sum of terms, each a product of powers of x and y, and in each term the powers add up to n. So for n=5, we have $x^5$, $x^4y$, $x^3y^2$ and so on.

How do we predict the numbers? When we multiply out $(x+y)^2$, why do we get $2xy$?

We multiply out $(x+y)(x+y)$ and get $x^2 + xy + xy + y^2$. We got a term in $xy$ in two ways: x from the first bracket with y from the second, and y from the first bracket with x from the second.

What about $(x+y)^3$? We can get the term in $x^3$ in just one way, by taking x from every bracket of $(x+y)(x+y)(x+y)$. But we can get a term in $x^2y$ in three ways:

y from the first bracket and x from the other two, or
y from the second bracket and x from the other two, or
y from the third bracket and x from the other two.

So 3 cross-terms give $x^2y$, and the coefficient of $x^2y$ in $(x+y)^3$ is 3. This in fact is the number of ways of choosing 2 items out of 3: we choose which 2 of the 3 brackets supply an x.

So in general when we multiply out $(x+y)^n$, we get a set of terms of the form $x^k y^{n-k}$. For example for $(x+y)^3$, we have terms $x^3$, $x^2y$, $xy^2$ and $y^3$.

And the coefficient of $x^k y^{n-k}$ is the number of ways of choosing k items from n. That is what we called $\binom{n}{k}$ - which is why it is called the binomial coefficient. Symbolically:

$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$$

Programming Binomial coefficients

We need to calculate

$$\frac{n!}{k!\,(n-k)!}$$

The obvious method is to calculate $n!$, $k!$ and $(n-k)!$, and divide them. However the factorial function increases very quickly ($12!$ is nearly 500 million) and we might easily get an arithmetic overflow.

We can cancel $(n-k)!$ and write this in an equivalent form, with k factors on the top and k on the bottom:

$$\frac{(n-k+1)(n-k+2)(n-k+3)\cdots(n-1)\,n}{1 \cdot 2 \cdot 3 \cdots (k-1)\,k}$$

and in a loop, multiply by one factor of the numerator and divide by one factor of the denominator. This keeps the intermediate values close to the size of the final answer instead of the size of $n!$.

In JavaScript suitable for running in a web page:

<script>
// Binomial coefficient n! / (k! (n-k)!), computed without large factorials
function binco(n, k) {
  var result = 1;
  for (var i = 1; i <= k; i++) {
    result *= (n - k + i); // next factor of the numerator
    result /= i;           // next factor of the denominator
  }
  return result;
}
console.log(binco(3, 1)); // 3
</script>

Pascal's Triangle

This is the binomial coefficients written at an angle. The rows are labelled by n; within a row, k runs from 0 at the left edge.

n=0                       1
n=1  x+y                1   1
n=2  (x+y)^2          1   2   1
n=3  (x+y)^3        1   3   3   1
n=4               1   4   6   4   1
n=5             1   5  10  10   5   1

Pascal's Rule

          1
        1   1
      1   2   1
    1   3   3   1
  1   4   6   4   1
1   5  10  10   5   1

Each number is the sum of the two above it.

Practice

1. Write down the next row, for n=6
2. Write down the expansion of $(x+y)^6$

For the coefficient at n,k, the numbers above it are at n-1. Their k values are k and k-1. So Pascal's Rule is

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$$
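As a rough illustration of Pascal's Rule, the triangle can be built row by row from the row above, using only addition. This is a minimal sketch: the function name `pascalRows` is a hypothetical helper introduced here, not something from the notes.

```javascript
// Build rows 0..maxN of Pascal's Triangle using Pascal's Rule.
// pascalRows is a hypothetical helper name chosen for this sketch.
function pascalRows(maxN) {
  var rows = [[1]];                     // row n=0
  for (var n = 1; n <= maxN; n++) {
    var prev = rows[n - 1];
    var row = [1];                      // k=0 is always 1
    for (var k = 1; k < n; k++) {
      row.push(prev[k - 1] + prev[k]);  // Pascal's Rule: sum of the two above
    }
    row.push(1);                        // k=n is always 1
    rows.push(row);
  }
  return rows;
}

console.log(pascalRows(5)[5].join(" ")); // 1 5 10 10 5 1
```

Because this only adds, it avoids both factorials and division, at the cost of computing every row up to the one wanted.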
Why does it work? These are binomial coefficients. As an example, consider the expansion of $(x+y)^6$:

$$(x+y)^6 = (x+y)(x+y)^5 = (x+y)(x^5 + 5x^4y + 10x^3y^2 + \dots)$$

How do we get the term in $x^4$ (that is, $x^4y^2$), for example? We can get it by multiplying x by the term in $x^3$, or by multiplying y by the term in $x^4$ - and adding them. That is,

$$x \cdot 10x^3y^2 \quad\text{and}\quad y \cdot 5x^4y$$

so the coefficient of $x^4y^2$ will be 5 + 10 = 15, which is what Pascal's Rule says.

Proof of Pascal's Rule

Or we can prove it from the definition using factorials. We want to show

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$$

The RHS is

$$\frac{(n-1)!}{k!\,(n-1-k)!} + \frac{(n-1)!}{(k-1)!\,(n-1-(k-1))!}$$

but

$$(n-k-1)! = \frac{(n-k)!}{n-k}$$

so

$$\text{RHS} = \frac{(n-1)!\,(n-k)}{k!\,(n-k)!} + \frac{(n-1)!}{(k-1)!\,(n-k)!}$$

But

$$(k-1)! = \frac{k!}{k}$$

so

$$\text{RHS} = \frac{(n-1)!\,(n-k)}{k!\,(n-k)!} + \frac{(n-1)!\,k}{k!\,(n-k)!} = \frac{(n-1)!\,(n-k+k)}{k!\,(n-k)!} = \frac{n!}{k!\,(n-k)!} = \text{LHS}$$

Practice

Prove Pascal's Rule by induction. You must prove that

1. It is true for n=k=0
2. If it is true for n, it is true for n+1
3. If it is true for k, it is true for k+1

so it is true for all n and k.

Sum of rows

What do we get if we add up a row?

n \ k   0   1   2   3   4   5   Total
0       1                         1
1       1   1                     2
2       1   2   1                 4
3       1   3   3   1             8
4       1   4   6   4   1        16
5       1   5  10  10   5   1    32

The total for row n is $2^n$.

Practice

Why? Try to answer this before reading the next section.

We have said

$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$$

The x and y terms are constants or variables. Suppose we choose x=y=1. Then

$$2^n = \sum_{k=0}^{n} \binom{n}{k}$$

So that's why.

Power set

Consider the n=5 row:

n=5:   1   5  10  10   5   1

The left-most 1 means there is just 1 way to choose 5 things from 5. The 5 next to it means there are 5 ways to choose 4 things from 5 (if the things are ABCDE, the ways are BCDE, ACDE, ABDE, ABCE and ABCD).

We can think of these as being subsets of the set { A, B, C, D, E }.

So there is only 1 subset of this containing 5 elements: { A, B, C, D, E }

There are 5 subsets containing 4 elements: {B,C,D,E}, {A,C,D,E}, {A,B,D,E}, {A,B,C,E} and {A,B,C,D}

There are 10 subsets containing 3 elements.

Practice

What are they?

10 subsets contain 2 elements, 5 subsets contain 1, and 1 subset contains 0 (the null or empty set, ∅).

So the row counts all possible subsets of a set with 5 elements, and this totals $2^5 = 32$.

In general, a set with n elements has $2^n$ possible subsets, including itself and the null set.

The set of all subsets is called the power set.

Binary numbers

A subset corresponds to a binary number, with a 1 for an element in the subset, and 0 for one excluded. For example for the set {A,B,C,D,E} the subset {A,B,C,D} corresponds to the base 2 number 11110.

(Can you see where this is going?)

So the row for n=5:

n=5:   1   5  10  10   5   1

says:

There is just 1 5-bit number with five 1s: 11111
There are 5 5-bit numbers with four 1s: 10111, 11011, 11101, 11110 and 01111 (usually written 1111).
There are 10 5-bit numbers with three 1s, and so on - 10 with two 1s, 5 with one 1 (10000, 01000, 00100, 00010 and 00001) and 1 with no 1s: 00000

The total count is $2^5$. So in 5 bits, we have $2^5 = 32$ different bit patterns, from 00000 to 11111, which is decimal 0 to 31.

Trinomials

A trinomial has the form $(x+y+z)^n$, so while a binomial has 2 terms, a trinomial has three.
For example $(x+y+z)^4$.

Correspondingly the coefficients of the terms are trinomial coefficients. Try to explore these.

The binomial coefficients can be set out in a 2-dimensional structure, Pascal's Triangle. The equivalent for trinomials is a 3D structure, Pascal's Pyramid.

Multinomials

But why stop at 3? A multinomial is

$$(a_1 + a_2 + a_3 + \dots + a_m)^n$$

These have multinomial coefficients, which can be arranged in m dimensions. Try investigating this.

Bernoulli Processes

Suppose we are playing a game using a fair six-sided die, and we want to get a six. This is an example of a Bernoulli trial. We do something (roll the die). We get a 'success' (a six) or a 'failure' (not a six). In this case the probability of success is 1/6, and the probability of failure is 5/6.

Suppose we roll the die 4 times. We might get 0, 1, 2, 3 or 4 sixes. What is the probability of getting exactly 1 six?

This is a Bernoulli process - a repeated Bernoulli trial, where each trial is independent. In each trial, we have a probability of success p, and probability of failure q, and since there are only two possible outcomes, q = 1-p.

Suppose we draw a probability tree of rolling a die. It's not very interesting:

Start
  Success   p = 1/6
  Failure   q = 5/6

Suppose we roll the die twice:
Start
  S   p
    S   p^2
    F   pq
  F   q
    S   pq
    F   q^2

So we can get 2 sixes, with probability $p^2$. Or no sixes = 2 failures, probability $q^2$.

Or we might get just one six, but we can do this in 2 ways - a six followed by not a six, or not a six followed by a six. So there are two outcomes giving us 1 six, each with probability $pq$, and they are exclusive, so the probability of 1 six is $2pq$.

We have $p^2$, and $2pq$, and $q^2$ - is that familiar?

Roll the die three times:

Start
  S   p
    S   p^2
      S   p^3
      F   p^2 q
    F   pq
      S   p^2 q
      F   pq^2   *
  F   q
    S   pq
      S   p^2 q
      F   pq^2   *
    F   q^2
      S   pq^2   *
      F   q^3

The sequences which give exactly one six are marked with *. Each such outcome has probability $pq^2$ - since we need one success and 2 fails. But we can get that in 3 ways - SFF, FSF or FFS. So the probability of exactly one six is $3pq^2$.

Similarly we can get 3 sixes, with probability $p^3$, no sixes with probability $q^3$, and 2 sixes with probability $3p^2q$.

So the situation is described as an expansion of a binomial:
$$(p+q)^3 = \underbrace{p^3}_{\text{3 sixes}} + \underbrace{3p^2q}_{\text{2 sixes}} + \underbrace{3pq^2}_{\text{1 six}} + \underbrace{q^3}_{\text{no sixes}}$$

So we can generalise this. Suppose we have n Bernoulli trials, each with probability of success p (and failure q = 1-p). Then the probability of exactly m successes out of the n is

$$\binom{n}{m} p^m (1-p)^{n-m}$$

For example, if we toss a fair coin 20 times, what is the probability of getting exactly 10 heads?

This is n=20, m=10, p=0.5, so the answer is

$$\binom{20}{10}\, 0.5^{10}\, (1-0.5)^{10} = 184756 \times 9.5367 \times 10^{-7} = 0.176$$

We can work this out for other numbers:

Heads   Probability
 0      9.54E-07
 1      1.91E-05
 2      1.81E-04
 3      1.09E-03
 4      0.0046
 5      0.0148
 6      0.037
 7      0.074
 8      0.12
 9      0.16
10      0.176
11      0.16
12      0.12
13      0.074
14      0.037
15      0.0148
16      0.0046
17      1.09E-03
18      1.81E-04
19      1.91E-05
20      9.54E-07

So as expected, the highest probability is for 10 heads.

These are exclusive events, so the probability of 9, 10 or 11 heads = 0.16 + 0.176 + 0.16 = 0.496
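The probability formula above can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: `binomialProbability` is a hypothetical name introduced here, and `binco` is repeated from the programming section so that the block runs on its own.

```javascript
// Binomial coefficient, repeated from the earlier section so this block is self-contained.
function binco(n, k) {
  var result = 1;
  for (var i = 1; i <= k; i++) {
    result *= (n - k + i);
    result /= i;
  }
  return result;
}

// Probability of exactly m successes in n independent Bernoulli trials,
// each with success probability p. (Hypothetical helper name for this sketch.)
function binomialProbability(n, m, p) {
  return binco(n, m) * Math.pow(p, m) * Math.pow(1 - p, n - m);
}

console.log(binomialProbability(20, 10, 0.5).toFixed(3)); // 0.176
```

The same function applies to the die question asked earlier: `binomialProbability(4, 1, 1/6)` is 4 × (1/6) × (5/6)^3 = 500/1296, about 0.386, the chance of exactly one six in 4 rolls.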