Binomial Random Variables
Binomial Distribution
STAT
Tom Ilvento

In many cases the responses to an experiment are dichotomous
- Yes/No
- Alive/Dead
- Support/Don't Support

Binomial Random Variables
- When our focus is on conducting an experiment n times and observing the number x of times that one of the two outcomes occurs, x is a Binomial Random Variable
- We can exploit this by using known formulas for its probability distribution

Examples of Binomial Random Variables
- People are polled in a telephone survey and asked if they would vote for George W. Bush
- The responses are Yes (1) or No (0)
- The proportion saying Yes is designated as p
- (1 - p) is the proportion saying No

Binomial Random Variable
- Yes, it is a binomial random variable
- Conduct an experiment n times and observe the number x of times that Yes occurs

Characteristics of a Binomial Random Variable (p. 79)
- The experiment consists of n identical trials
- There are only two outcomes on each trial
- Outcomes can be denoted as S for Success and F for Failure
Characteristics of a Binomial Random Variable (p. 79), continued
- The probability of S (success) remains the same from trial to trial; it is denoted as p, the proportion
- The probability of F (failure) is denoted as q, where q = (1 - p)
- The trials are independent of each other
- The binomial random variable x is the number of Successes in n trials

Example: Marketing example (p. 79)
- Marketing survey of randomly chosen consumers
- Record their preferences for a new and an old diet soda: ask them to choose their preference
- Let x be the number who choose the new brand
- Is this a binomial random variable?
- Conduct an experiment n times and observe the number x of times that Yes occurs

Fitness Example
- The Heart Association says only 10% of adults over a certain age can pass the fitness test
- Suppose 4 such adults are selected at random
- Let x be the number who pass the minimum requirements
- Find the probability distribution for x
- Conduct an experiment 4 times and observe the number x of times that Pass occurs

How to solve the fitness problem
- List the events
- List the sample points that refer to each event
- Calculate the probabilities, with p = .1 and q = (1 - .1) = .9
- For the event x = 0 the only sample point is FFFF, so $P(0) = (.9)(.9)(.9)(.9) = .6561$
- I multiply through on the probabilities because each trial is independent of the others

Solve for Each Event

Event            x   Sample points                    Probability
0 Pass, 4 Fail   0   FFFF                             $(.9)(.9)(.9)(.9) = .6561$
1 Pass, 3 Fail   1   SFFF FSFF FFSF FFFS              $4[(.1)(.9)^3] = .2916$
2 Pass, 2 Fail   2   SSFF SFSF SFFS FSSF FSFS FFSS    $6[(.1)^2(.9)^2] = .0486$
3 Pass, 1 Fail   3   SSSF SFSS SSFS FSSS              $4[(.1)^3(.9)] = .0036$
4 Pass, 0 Fail   4   SSSS                             $(.1)(.1)(.1)(.1) = .0001$

Fitness Example: Distribution of X
- When x = 0 (none pass), P = .6561
- When x = 1 (one pass), P = .2916
- When x = 2 (two pass), P = .0486
- When x = 3 (three pass), P = .0036
- When x = 4 (four pass), P = .0001
[Figure: bar chart "Distribution of X" for the fitness example]
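The list-the-sample-points approach can be sketched in a few lines of Python. This is an illustration only, not part of the original slides: the helper name `enumerate_distribution` is hypothetical, and n = 4 and p = .1 are the fitness-example values.

```python
from itertools import product

def enumerate_distribution(n, p):
    """Build P(x) by listing every S/F sample point of n independent trials.

    Hypothetical helper for illustration; it is not from the slides.
    """
    q = 1 - p
    dist = {x: 0.0 for x in range(n + 1)}
    for point in product("SF", repeat=n):
        x = point.count("S")  # number of successes in this sample point
        # Independence lets us multiply the per-trial probabilities.
        dist[x] += (p ** x) * (q ** (n - x))
    return dist

# Fitness example: 4 adults, each passing with probability .1
fitness = enumerate_distribution(4, 0.1)
for x, prob in fitness.items():
    print(x, round(prob, 4))
```

With n = 4 there are only 16 sample points, so the loop is cheap. The count doubles with every extra trial, which is why a closed-form formula becomes attractive for larger n.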
Fitness Example (continued)
- Find the probability that none of the adults pass the test: P(x = 0) = .6561
- Find the probability that 3 of the 4 adults pass the test: P(x = 3) = .0036

When we have many trials the formulas get complicated
- We can also use the binomial probability distribution formula
- Using factorial notation: $n! = n(n-1)(n-2)\cdots(n-(n-1))$
- $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$
- The formula for any x in n trials is: $P(x) = \frac{n!}{x!(n-x)!}\, p^x q^{n-x}$

Binomial Distribution Formula
- $P(x) = \frac{n!}{x!(n-x)!}\, p^x q^{n-x}$, aka $P(x) = \binom{n}{x} p^x q^{n-x}$
- Most calculators will do all or part of this; become familiar with yours before trying it out

What defines a binomial probability distribution?
- p = probability of a success on a single trial
- q = (1 - p) = probability of failure
- n = number of trials
- x = number of successes in n trials

For x = 3 in the fitness example
$P(3) = \frac{4!}{3!(4-3)!}(.1)^3(.9)^1 = \frac{24}{(6)(1)}(.001)(.9) = 4(.0009) = .0036$
- This matches the number we generated the other way

Fitness Example Table

Event            x   Sample points                    P(x)
0 Pass, 4 Fail   0   FFFF                             .6561
1 Pass, 3 Fail   1   SFFF FSFF FFSF FFFS              .2916
2 Pass, 2 Fail   2   SSFF SFSF SFFS FSSF FSFS FFSS    .0486
3 Pass, 1 Fail   3   SSSF SFSS SSFS FSSS              .0036
4 Pass, 0 Fail   4   SSSS                             .0001
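As a sketch of the binomial formula in code, the function below evaluates $\binom{n}{x} p^x q^{n-x}$ directly. The name `binomial_pmf` is an illustrative choice, not something from the slides; `math.comb` (Python 3.8+) supplies the $n!/(x!(n-x)!)$ term.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n trials) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# Fitness example (n = 4, p = .1): the x = 3 case worked on the slide
p_three = binomial_pmf(3, 4, 0.1)
print(round(p_three, 4))

# The full distribution should agree with the sample-point table
fitness_pmf = [binomial_pmf(x, 4, 0.1) for x in range(5)]
print([round(v, 4) for v in fitness_pmf])
```

The counting term `comb(4, 3)` equals 4, the number of sample points with exactly three passes, which is what ties the formula back to the enumeration.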
For x = 2 in the fitness example
$P(2) = \frac{4!}{2!(4-2)!}(.1)^2(.9)^2 = \frac{24}{(2)(2)}(.01)(.81) = 6(.0081) = .0486$
- This matches the number we generated the other way

Mean of a Binomial Random Variable
- Since a binomial is only a dichotomy, the formulas for the mean and the standard deviation will simplify
- From $\mu = \sum x\,p(x)$
- To $\mu = np$

Standard Deviation of a Binomial Random Variable
- From $\sigma^2 = \sum (x-\mu)^2\,p(x)$
- To $\sigma^2 = npq$
- The standard deviation is then $\sigma = \sqrt{npq}$

Mean and Standard Deviation for Fitness Example
- The Heart Association says only 10% of adults over a certain age can pass the fitness test
- Thus the proportion passing was estimated at .1, and n for the problem was 4 people
- $\mu = np = 4(.1) = .4$
- $\sigma^2 = npq = 4(.1)(.9) = .36$
- $\sigma = \sqrt{.36} = .6$

Fitness Example Table for mean and variance

x   P(x)
0   .6561
1   .2916
2   .0486
3   .0036
4   .0001
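The shortcut formulas $\mu = np$ and $\sigma^2 = npq$ can be checked numerically against the general definitions. The sketch below does this for the fitness example; the function names are illustrative and not taken from the slides.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def moments_from_definition(n, p):
    """Mean and variance via sum(x * p(x)) and sum((x - mu)^2 * p(x))."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(probs))
    variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))
    return mean, variance

def moments_from_shortcut(n, p):
    """Mean and variance via the binomial shortcuts np and npq."""
    return n * p, n * p * (1 - p)

n, p = 4, 0.1  # fitness example
mean_def, var_def = moments_from_definition(n, p)
mean_short, var_short = moments_from_shortcut(n, p)
print(round(mean_def, 4), round(mean_short, 4))
print(round(var_def, 4), round(var_short, 4))
print(round(sqrt(var_short), 4))  # standard deviation
```

Both routes should give the same mean and variance up to floating-point rounding. The shortcut is simply much less work once n grows.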
I could have solved for the mean using the old formula
- To solve for the mean I would have:
  $E(x) = (0)(.6561) + (1)(.2916) + (2)(.0486) + (3)(.0036) + (4)(.0001) = .4$
- Binomial approach: $E(x) = np = 4(.1) = .4$
- General formula: $E(x) = \sum_{i=1}^{n} x_i\,p(x_i) = \mu$

I could have solved for the variance using the old formula
- To solve for the variance I would have:
  $E[(x-\mu)^2] = (0-.4)^2(.6561) + (1-.4)^2(.2916) + (2-.4)^2(.0486) + (3-.4)^2(.0036) + (4-.4)^2(.0001) = .36$
- Binomial approach: $\sigma^2 = npq = 4(.1)(.9) = .36$
- General formula: $E[(x-\mu)^2] = \sum_{i=1}^{n} (x_i-\mu)^2\,p(x_i) = \sigma^2$

Nitrous Oxide Example
- Suppose we were recording the number of dentists that use nitrous oxide (laughing gas) in their practice
- We know that 60% of dentists use the gas, so p = .6 and q = .4
- Let X = the number of dentists in a random sample of five dentists who use laughing gas, so n = 5

Nitrous Oxide Example
- We said the probability that a dentist uses nitrous oxide is .6
- How would you assign probabilities to the values x could take when we randomly select five dentists?
- X can take the values 0, 1, 2, 3, 4, 5

Solve for Each Event: Nitrous Oxide Example

x   Sample points                                                  Probability
0   FFFFF                                                          $(.4)(.4)(.4)(.4)(.4) = .0102$
1   SFFFF FSFFF FFSFF FFFSF FFFFS                                  $[(.6)(.4)(.4)(.4)(.4)] \times 5 = .0768$
2   SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS    $[(.6)(.6)(.4)(.4)(.4)] \times 10 = .2304$
- There is more!!

So what other way can we get to the probabilities?
- $P(x) = \frac{n!}{x!(n-x)!}\, p^x q^{n-x}$

X   P(X)
0   .0102
1   .0768
2   .2304
3   .3456
4   .2592
5   .0778
Nitrous Oxide Example
- Solve for P(x) with the binomial formula: $P(x) = \frac{n!}{x!(n-x)!}\, p^x q^{n-x}$

Distribution of the Discrete Variable X
[Figure: bar chart "Distribution of X"; horizontal axis: Number of Dentists (0 to 5); vertical axis: P(X)]
- Calculated from the rounded probability table: $E(X) \approx 3$, $\sigma^2 \approx 1.1998$, $\sigma \approx 1.095$
- Binomial formulas: $E(X) = np = 5(.6) = 3$, $\sigma^2 = npq = 5(.6)(.4) = 1.2$, $\sigma = \sqrt{1.2} = 1.095$

Example: Seedling Survival
- An agronomist knows from past experience that 80% of the seedlings of a citrus variety will survive being transplanted
- If we take a random sample of 6 seedlings from current stock, what is the probability that exactly 4 seedlings will survive?

Example: Seedling Survival
- For the problem we can calculate:
- p = .8 and q = .2
- $\mu = np = 6(.8) = 4.8$
- $\sigma^2 = npq = 6(.8)(.2) = .96$
- $\sigma = \sqrt{.96} = .98$

Example: Seedling Survival
- The probability that exactly 4 survive is
  $P(4) = \frac{6!}{4!(6-4)!}(.8)^4(.2)^2$
Example: Seedling Survival
- The probability that exactly 4 survive is
  $P(4) = \frac{6!}{4!(6-4)!}(.8)^4(.2)^2 = \frac{720}{(24)(2)}(.4096)(.04) = 15(.016384) = .246$

Solve for Each Event: Look at the Cumulative Probabilities

Event        x   P(x)   Cumulative P(x)
All fail     0   .000   .000
One pass     1   .002   .002
Two pass     2   .015   .017
Three pass   3   .082   .099
Four pass    4   .246   .345
Five pass    5   .393   .738
Six pass     6   .262   1.000

Citrus Example
- Mean = 6(.8) = 4.8
- Std Dev = .98
- It makes sense! Our expectation is that most seedlings will survive (i.e., 4.8 of 6)
- Look at the cumulative probability
[Figure: bar chart "Distribution for Citrus Example"; horizontal axis: x = 0 to 6]

Move to the Binomial Table
- We can also use a table to help
- Appendix A, Table II contains cumulative probabilities for n = 5, 6, 7, 8, 9, 10, 15, 20, and 25
- Each table lists values of p across the top: p = .01, .05, .10, .20, .30, ..., .95, .99
- k = number of successes
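The cumulative column for the citrus example is just a running sum of the exact probabilities, which a short Python sketch makes concrete. The variable names are illustrative; n = 6 and p = .8 come from the seedling example.

```python
from itertools import accumulate
from math import comb

n, p = 6, 0.8  # citrus seedling example

# Exact probabilities P(x) for x = 0..6
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# Cumulative probabilities P(X <= x): running sum of the exact values
cdf = list(accumulate(pmf))

for x, (px, cx) in enumerate(zip(pmf, cdf)):
    print(f"x = {x}: P(x) = {px:.3f}, cumulative = {cx:.3f}")

# An exact probability is the difference of two adjacent cumulative values
exactly_4 = cdf[4] - cdf[3]
print(round(exactly_4, 3))
```

Reading the cumulative column in reverse, by subtracting neighbouring entries, is the same operation needed when working from a printed cumulative table.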
Binomial Table, n = 6
[Table: cumulative binomial probabilities for n = 6, with k = 0 to 5 down the side and p across the top. The p = .8 column reads .000, .002, .017, .099, .345, .738.]
- NOTE: The table gives cumulative binomial probabilities, cumulative up to and including the value for k
- This means that to find exact probabilities you might have to subtract two table values

Binomial Probabilities Using the Table for Citrus Example
- We said the probability that 4 survive is .246
- From the table, cumulative up to 4 is .345
- Subtract the cumulative probability for up to 3 (.099)
- .345 - .099 = .246
- You have to be careful using the Table!

Binomial Formula using Excel
- In Excel, the formula for the Binomial Distribution function is: BINOMDIST(X,N,P,cumulative)
- X is the number of successes
- N is the number of independent trials
- P is the probability of success on each trial
- Cumulative is a TRUE/FALSE argument
  - Entering TRUE gives a cumulative probability up to and including X successes
  - Entering FALSE gives the exact probability of X successes in N trials

Binomial Formula using Excel
- Look at the Citrus Seedling Table
- For our example of citrus plants:
- BINOMDIST(2,6,.8,TRUE) gives the cumulative probability up to and including 2 successes = .01696
- BINOMDIST(2,6,.8,FALSE) gives the exact probability of 2 successes in 6 trials = .01536
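For readers without Excel, the behaviour described for BINOMDIST can be mimicked in Python. The function below is a hypothetical stand-in written for this note; it borrows the Excel argument order for readability but is not Excel's implementation.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """Sketch of BINOMDIST(X, N, P, cumulative) as described on the slide.

    cumulative=True  -> P(X <= x), up to and including x successes
    cumulative=False -> P(X = x), exactly x successes
    """
    def pmf(k):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

# Citrus example with X = 2, N = 6, P = .8
print(round(binomdist(2, 6, 0.8, True), 5))
print(round(binomdist(2, 6, 0.8, False), 5))
```

The cumulative branch sums the exact probabilities from 0 through x, so the two modes differ only in whether the smaller values of X are included.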
Excel Binomial Distribution File
- p = .8, q = .2, n = 6
- Mean = 4.8, Variance = .96, Std Dev = .98
- X successes = 2: formula with Cum = FALSE gives .01536; formula with Cum = TRUE gives .01696

X    P(X)     Cum P(X)
0    .00006   .00006
1    .00154   .00160
2    .01536   .01696
3    .08192   .09888
4    .24576   .34464
5    .39322   .73786
6    .26214   1.00000
7    #NUM!    #NUM!
8    #NUM!    #NUM!
9    #NUM!    #NUM!
10   #NUM!    #NUM!

The Rare Event Approach
- What if we had 6 seedlings selected randomly and all of them died?
- Given p = .8, this would be a very rare event: $P(x = 0) = (.2)^6 \approx .0001$
- Was this just by chance?

Problem […].5 in the book
- A study in the American Journal of Public Health found that 8% of female Japanese students from heavy-smoking families showed signs of nasal allergies
- Consider a random sample of 5 female Japanese students exposed daily to heavy smoking
- What is the probability that fewer than […] of the students will have nasal allergies?
- What is the probability that more than 5 of the students will have nasal allergies?

Answer to Problem
- P(x < […]) =
- P(x > 5) =

Poisson Distribution
- Applies to situations where we describe the number of events occurring in a specific time period or in a specific area
- $P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
- Where $\lambda = \mu$ (the mean number of events) and e = the base of the natural logarithm = 2.718
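The Poisson formula translates into code almost symbol for symbol. In the sketch below the function name `poisson_pmf` and the rate λ = 2 are assumptions chosen purely for illustration; the slides do not give a worked Poisson example.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam^x * e^(-lam) / x! for x = 0, 1, 2, ..."""
    return lam ** x * exp(-lam) / factorial(x)

lam = 2.0  # assumed mean number of events per period (illustrative only)

# Probabilities for the first few counts
for x in range(6):
    print(x, round(poisson_pmf(x, lam), 4))

# Unlike the binomial, x has no upper limit; the probabilities still sum to 1
total = sum(poisson_pmf(x, lam) for x in range(50))
print(round(total, 6))
```

Note that only one parameter, λ, is needed: there is no fixed number of trials n as there is in the binomial setting.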