ECON 214 Elements of Statistics for Economists 2016/2017 Topic Probability Distributions: Binomial and Poisson Distributions Lecturer: Dr. Bernardin Senadza, Dept. of Economics bsenadza@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017
Overview We extend our discussion of probability by considering probability distributions. The outcome of a probability experiment is a random variable. Each outcome of an experiment has an associated probability. The probability distribution of a random variable lists all the possible values of the random variable and their corresponding probabilities. Just as with a frequency distribution, we are able to calculate the mean (expected value), variance and standard deviation of the random variable from a probability distribution. We shall distinguish between discrete and continuous probability distributions and then discuss two standard discrete probability distributions - the Binomial and Poisson distributions. Slide 2
Overview cont'd At the end of this lecture, the student will be able to: distinguish between discrete and continuous probability distributions; calculate the mean, variance, and standard deviation of a discrete probability distribution; compute probabilities using the binomial distribution; compute probabilities using the Poisson distribution; and use the Poisson distribution to approximate the binomial distribution. Slide 3
Random variables Most statistics (e.g. the sample mean) are random variables. A random variable is a numeric event whose value is determined by a chance process or an experiment. The event should not be under the control of the observer; i.e., the value of the random variable is unknown before the experiment is carried out. Slide 4
Random variables Example 1: Suppose we roll two dice and take the sum of the numbers showing up. This sum is clearly a random variable because its value is determined by chance. Such an experiment produces 11 possible values. What are these values? Slide 5
Random variables They are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12. Slide 6
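The list of possible sums can be checked by brute force. The following is a minimal Python sketch, not part of the original lecture; the variable names are illustrative. It enumerates every ordered pair of faces on two six-sided dice and collects the distinct sums.

```python
from itertools import product

# All 36 ordered outcomes (first die, second die) of rolling two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# The random variable is the sum of the two faces; keep each distinct value once.
possible_sums = sorted({first + second for first, second in rolls})

print(possible_sums)       # the distinct values the sum can take
print(len(possible_sums))  # how many distinct values there are
```

The sketch only lists the values; it says nothing yet about how likely each sum is.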
Random variables Example 2: Consider a random experiment in which a coin is tossed three times. Let X be the number of heads. Let H represent the outcome of a head and T the outcome of a tail. The sample space for such an experiment will be: TTT, TTH, THT, THH, HTT, HTH, HHT, HHH. Slide 7
Random variables Thus the possible values of X (number of heads) are x = 0,1,2,3. The outcome of zero heads occurred once. The outcome of one head occurred three times. The outcome of two heads occurred three times. The outcome of three heads occurred once. From the definition of a random variable, X as defined in this experiment, is a random variable. Many random variables have well-known probability distributions associated with them. To understand random variables, we need to know about probability distributions. Slide 8
Discrete and continuous random variables A discrete random variable is a variable that can assume only certain clearly separated values resulting from a count of some item of interest (countably finite or infinite). Example: Let X be the number of heads when a coin is tossed 3 times. Here the values for X are: X = 0,1,2,3. Slide 9
Discrete and continuous random variables A continuous random variable is a variable that can assume any value within a designated range of values, so it has infinitely many possible values. Example: the height of a student in this class. Slide 10
Discrete and continuous random variables Some standard probability distributions are Binomial distribution (discrete) Poisson distribution (discrete) Normal distribution (continuous) Slide 11
Discrete and continuous random variables Binomial - when the underlying probability experiment has only two possible outcomes (e.g. tossing a coin). Poisson - for rare events, when the probability of occurrence is low. Normal - when many small independent factors influence a variable (e.g. IQ, influenced by genes, diet, etc.) Slide 12
Discrete probability distributions Each value of a random variable has an associated probability. The probability distribution of a random variable lists all the possible values of the random variable and their corresponding probabilities. If $P(X)$ is the probability that $X$ is the value of the random variable, then $\sum P(X) = 1$. Slide 13
Discrete probability distributions Suppose we toss a coin three times and want to observe the number of heads. The possible values are 0, 1, 2, 3. What is the probability distribution of the number of heads? Slide 14
Discrete probability distributions
No. of heads (X)   Frequency   Probability P(X)
0                  1           1/8 = .125
1                  3           3/8 = .375
2                  3           3/8 = .375
3                  1           1/8 = .125
Total                          8/8 = 1.00
Slide 15
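The table above can be reproduced by enumerating the sample space. This is a minimal Python sketch under the assumption of a fair coin (each of the 8 outcomes equally likely); the names `outcomes`, `counts` and `distribution` are illustrative, not from the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 8 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Frequency of each value of X = number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value = frequency / total number of outcomes.
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, prob in distribution.items():
    print(x, counts[x], prob, float(prob))
```

Exact fractions are used so that the probabilities sum to exactly 1.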
Expected value of discrete random variable The mean of a random variable is called its expected value. For discrete random variables, it is the weighted mean of all possible values of the random variable, with the probabilities as weights: $\mu_X = E(X) = \sum X P(X)$ Slide 16
Expected value of discrete random variable For the three tosses of a coin, we have $E(X) = \sum X P(X) = 0(.125) + 1(.375) + 2(.375) + 3(.125) = 1.5$. The expected value is not the value we expect on any single run of the experiment. Rather, it is the long-run average value (when the experiment is done a large number of times). Slide 17
Expected value of discrete random variable The variance is given by $\sigma_X^2 = \sum [X - E(X)]^2 P(X)$ or, for computational ease, we use $\sigma_X^2 = \sum X^2 P(X) - [\sum X P(X)]^2$. To obtain the standard deviation, we take the square root of the variance. Slide 18
Expected value of discrete random variable
X      P(X)   XP(X)   X-E(X)   [X-E(X)]^2   [X-E(X)]^2 P(X)   X^2   X^2 P(X)
0      .125   0       -1.5     2.25         .2813             0     0
1      .375   .375    -.5      .25          .0938             1     .375
2      .375   .75     .5       .25          .0938             4     1.5
3      .125   .375    1.5      2.25         .2813             9     1.125
Total         1.5                           .7502                   3.0
Slide 19
Expected value of discrete random variable $\sigma_X^2 = \sum X^2 P(X) - [\sum X P(X)]^2 = 3.0 - (1.5)^2 = .75$ Standard deviation $= \sqrt{.75} = .87$ Slide 20
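The calculations for the three-toss example can be checked numerically. Below is a minimal Python sketch, not part of the lecture, that computes the mean, the variance by both the definitional and the computational formula, and the standard deviation. The dictionary `distribution` simply restates the probabilities from the example; all names are illustrative.

```python
from math import sqrt

# Probability distribution of X = number of heads in three tosses of a fair coin.
distribution = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# Expected value: probability-weighted mean of the possible values.
mean = sum(x * p for x, p in distribution.items())

# Variance from the definition: sum of squared deviations times probability.
variance_definition = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Variance from the computational shortcut: E(X^2) minus [E(X)]^2.
variance_shortcut = sum(x ** 2 * p for x, p in distribution.items()) - mean ** 2

std_dev = sqrt(variance_definition)

print(mean, variance_definition, variance_shortcut, round(std_dev, 2))
```

Both variance formulas give .75 here. The column total of .7502 in the worked table differs only because each entry was rounded to four decimal places before summing.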
Exercise 1 The Managing Director of Perfect Painters, a painting firm in Accra, has studied his records for the past 20 weeks and reports the following number of houses painted per week. Compute the mean and variance of the number of houses painted per week.
No. of houses painted   Weeks
10                      5
11                      6
12                      7
13                      2
Slide 21
Solution to Exercise 1 From the table on the previous slide, 10 houses were painted a week in 5 out of the 20 weeks, so the probability that 10 houses are painted in a week is 5 weeks divided by 20 weeks, or 5/20 = .25. Similarly, 11 houses were painted in 6 of the 20 weeks, so the probability that 11 houses are painted in a week is 6/20 = .30, and so on. The probability distribution so obtained is in the table on the next slide. Slide 22
Solution to Exercise 1 The probability distribution
Number of houses painted, X   Probability, P(X)
10                            .25
11                            .30
12                            .35
13                            .10
Total                         1
Slide 23
Solution to Exercise 1 Mean number of houses painted per week: $E(X) = \sum [X P(X)] = (10)(.25) + (11)(.30) + (12)(.35) + (13)(.10) = 11.3$ Variance of the number of houses painted per week: $\sigma^2 = \sum [(X - \mu)^2 P(X)] = .4225 + .027 + .1715 + .289 = .91$, where $\mu = E(X) = 11.3$. Slide 24
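The steps of this solution (turn the weekly counts into probabilities, then take the weighted mean and variance) can be sketched in Python as follows. This is an illustrative check rather than part of the lecture; the name `weeks_by_houses` and the other variables are assumptions introduced here.

```python
# Number of weeks (out of 20) in which each number of houses was painted.
weeks_by_houses = {10: 5, 11: 6, 12: 7, 13: 2}

total_weeks = sum(weeks_by_houses.values())

# Relative frequencies serve as the probabilities P(X).
probabilities = {x: weeks / total_weeks for x, weeks in weeks_by_houses.items()}

# Mean: sum of X * P(X).
mean = sum(x * p for x, p in probabilities.items())

# Variance: sum of (X - mean)^2 * P(X).
variance = sum((x - mean) ** 2 * p for x, p in probabilities.items())

print(probabilities)
print(round(mean, 2), round(variance, 2))
```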
The Binomial distribution The binomial distribution (BD) is applied when: Only two mutually exclusive outcomes are possible in each trial (success or failure). The data collected are the results of counts. The outcomes in the series of trials are independent. The probability of success, denoted p, in each trial remains constant from trial to trial. Slide 25
The Binomial distribution The objective of using the BD is to determine the probabilities of the various possible numbers of successes (X), given the number of trials (n) and the known (and constant) probability of success (p). Slide 26
The Binomial distribution Consider five tosses of a coin. Recall from the last lecture that we can write the probability of 1 head in 2 tosses as the probability of a head and a tail (in that order) times the number of possible orderings (the number of ways that event can occur). $P(1 \text{ Head}) = \tfrac{1}{2} \times \tfrac{1}{2} \times {}^{2}C_{1} = \tfrac{1}{4} \times 2 = \tfrac{1}{2}$ We can apply the same technique to calculate the probability for the number of heads in five tosses of a coin. Slide 27
The Binomial distribution P(X Heads in five tosses of a coin)
$P(X = 0) = (\tfrac{1}{2})^0 (\tfrac{1}{2})^5 \times {}^{5}C_{0} = \tfrac{1}{32} \times 1 = \tfrac{1}{32}$
$P(X = 1) = (\tfrac{1}{2})^1 (\tfrac{1}{2})^4 \times {}^{5}C_{1} = \tfrac{1}{32} \times 5 = \tfrac{5}{32}$
$P(X = 2) = (\tfrac{1}{2})^2 (\tfrac{1}{2})^3 \times {}^{5}C_{2} = \tfrac{1}{32} \times 10 = \tfrac{10}{32}$
$P(X = 3) = (\tfrac{1}{2})^3 (\tfrac{1}{2})^2 \times {}^{5}C_{3} = \tfrac{1}{32} \times 10 = \tfrac{10}{32}$
$P(X = 4) = (\tfrac{1}{2})^4 (\tfrac{1}{2})^1 \times {}^{5}C_{4} = \tfrac{1}{32} \times 5 = \tfrac{5}{32}$
$P(X = 5) = (\tfrac{1}{2})^5 (\tfrac{1}{2})^0 \times {}^{5}C_{5} = \tfrac{1}{32} \times 1 = \tfrac{1}{32}$
Slide 28
The Binomial distribution To construct a binomial distribution, let n be the number of trials, x be the number of observed successes, and p be the probability of success on each trial. The formula for the binomial probability distribution is: $P(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$ Slide 29
The Binomial distribution When the BD is used, it is typically because we wish to determine the probability of X or more successes [$P(X \ge x_i)$] or X or fewer successes [$P(X \le x_i)$]. If the number of individual probabilities to be summed is large, it is easier to use the complement rule. Example: $P(X \ge x_i) = 1 - P(X < x_i)$ Slide 30
The Binomial distribution Suppose the probability is .05 that a randomly selected student of the University of Ghana owns a car. What is the probability of observing two or more student car owners in a random sample of 20 students? $P(X \ge 2) = P(X=2) + P(X=3) + \dots + P(X=20)$ By the complement rule, $P(X \ge 2) = 1 - P(X<2) = 1 - P(X = 0 \text{ or } 1) = 1 - [P(X=0) + P(X=1)]$ Now $P(X=0) = {}^{20}C_{0} (.05)^0 (.95)^{20} = .3585$ and $P(X=1) = {}^{20}C_{1} (.05)^1 (.95)^{19} = .3774$ So $P(X \ge 2) = 1 - (.3585 + .3774) = .2641$ Slide 31
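The binomial formula translates directly into code. The sketch below is a minimal Python illustration, not part of the lecture: the function name `binomial_pmf` is hypothetical, and `math.comb` (Python 3.8+) supplies the number of orderings n!/(x!(n-x)!). It reproduces the five-toss probabilities for a fair coin.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Number of heads in five tosses of a fair coin: n = 5, p = 1/2.
five_tosses = [binomial_pmf(x, 5, 0.5) for x in range(6)]

for x, prob in enumerate(five_tosses):
    print(x, prob)
```

The six probabilities correspond to 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32, and they sum to 1.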
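The complement-rule calculation for the car-owner example can be sketched in Python as below. This is an illustrative check only; `binomial_pmf` and `prob_at_least` are hypothetical helper names introduced here, not functions from the lecture or any library.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def prob_at_least(k, n, p):
    """P(X >= k) via the complement rule: 1 - [P(X=0) + ... + P(X=k-1)]."""
    return 1 - sum(binomial_pmf(x, n, p) for x in range(k))

n, p = 20, 0.05                    # sample of 20 students, P(owns a car) = .05
p_zero = binomial_pmf(0, n, p)
p_one = binomial_pmf(1, n, p)
p_two_or_more = prob_at_least(2, n, p)

print(round(p_zero, 4), round(p_one, 4), round(p_two_or_more, 4))
```

Carried at full precision, the result rounds to .2642. The .2641 in the worked example comes from rounding P(X=0) and P(X=1) to four decimals before subtracting.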
Exercise 2 The Ministry of Employment reports that 20% of the labour force in Ghana is unemployed. From a sample of 14 members of the labour force, calculate the probability that three or more are unemployed. Slide 32
Solution to Exercise 2 $P(X \ge 3) = P(X=3) + P(X=4) + \dots + P(X=14)$ By the complement rule, $P(X \ge 3) = 1 - P(X<3) = 1 - P(X = 0, 1 \text{ or } 2) = 1 - [P(X=0) + P(X=1) + P(X=2)] = 1 - (.044 + .154 + .250) = .552$ Slide 33
The Binomial distribution The mean of the binomial random variable is $E(X) = np$ and the variance is $\sigma^2 = np(1-p)$. The standard deviation is the square root of the variance. Slide 34
The Poisson distribution The Poisson distribution describes a sampling process in which events occur over time or space. It is used to describe a number of processes or events, such as: the distribution of telephone calls going through a switchboard; the demand of patients for service at a health facility; the arrival of vehicles at a tollbooth; the number of accidents occurring at a road intersection; etc. Slide 35
The Poisson distribution Characteristics defining a Poisson random variable are: The experiment consists of counting the number of times a particular event occurs during a given time interval. The probability that the event occurs in one time interval is independent of the probability of the event occurring in another time interval. The mean number of events in a time interval is proportional to the length of the interval. Slide 36
The Poisson distribution The probability that the Poisson random variable will assume the value $x$ is given by $P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x = 0, 1, 2, 3, \dots$, where λ is the mean of the distribution (mean number of events occurring in a given unit of time) and e is approximately 2.7183, the base of the natural logarithms. Slide 37
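The formulas np and np(1-p) can be checked against the definitions of mean and variance by summing over the whole binomial distribution. The Python sketch below is illustrative only; the values n = 14 and p = .20 are borrowed from Exercise 2 as an assumption, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 14, 0.20  # illustrative values taken from Exercise 2

# Mean and variance computed from the definition, summing over x = 0..n.
mean_from_pmf = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance_from_pmf = sum(
    (x - mean_from_pmf) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1)
)

# The same quantities from the closed-form formulas.
mean_from_formula = n * p
variance_from_formula = n * p * (1 - p)

print(mean_from_pmf, mean_from_formula)
print(variance_from_pmf, variance_from_formula)
```

Up to floating-point error, both routes give a mean of 2.8 and a variance of 2.24 for these values.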
The Poisson distribution Suppose an average of 2 calls per minute is received at a switchboard during a designated time interval. Find the probability that exactly 3 calls are received in a randomly sampled minute. Here, X = 3 and λ = 2, so $P(X=3) = \frac{2^3 e^{-2}}{3!} = \frac{8(.1353)}{6} = .1804$ Slide 38
The Poisson distribution As with the BD, the PD typically involves determining the probability of X or more events or of X or fewer events. To calculate, we sum the appropriate probability values. The use of the complement rule may also come in handy. Slide 39
The Poisson distribution For example, we may want to calculate the probability of receiving 3 or more calls in a three-minute interval. That is, $P(X \ge 3) = P(X=3) + P(X=4) + \dots$ Using the complement rule, we have $P(X \ge 3) = 1 - P(X<3) = 1 - [P(X=0) + P(X=1) + P(X=2)]$ From the third characteristic of a Poisson random variable, we know the mean number of occurrences is proportional to the length of the time interval. Slide 40
The Poisson distribution So if we expect a mean of 2 calls per minute, in three minutes we must expect 6 calls. For an interval of 30 seconds, the mean number of calls is one. The probability of 5 calls in a three-minute interval implies λ = 6, so $P(X=5 \mid \lambda=6) = .1606$ The probability of no calls in an interval of 30 seconds implies that λ = 1, so $P(X=0 \mid \lambda=1) = .3679$ Slide 41
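The Poisson formula is equally short in code. The following minimal Python sketch, not part of the lecture, evaluates the switchboard example; `poisson_pmf` is a hypothetical helper name and `lam` stands for λ.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Mean of 2 calls per minute; probability of exactly 3 calls in one minute.
p_three_calls = poisson_pmf(3, 2)

print(round(p_three_calls, 4))
```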
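The rescaling of λ with the length of the interval can be sketched in Python as follows. This is an illustrative check, not part of the lecture; `poisson_pmf` and the variable names are assumptions introduced here. It also evaluates the "3 or more calls in three minutes" probability with the complement rule.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return lam ** x * exp(-lam) / factorial(x)

rate_per_minute = 2

# The mean scales with the interval length.
lam_three_minutes = rate_per_minute * 3     # 6 calls expected in 3 minutes
lam_thirty_seconds = rate_per_minute * 0.5  # 1 call expected in 30 seconds

p_five_calls = poisson_pmf(5, lam_three_minutes)
p_no_calls = poisson_pmf(0, lam_thirty_seconds)

# Complement rule: P(X >= 3) = 1 - [P(0) + P(1) + P(2)] with lambda = 6.
p_three_or_more = 1 - sum(poisson_pmf(x, lam_three_minutes) for x in range(3))

print(round(p_five_calls, 4), round(p_no_calls, 4), round(p_three_or_more, 4))
```

Only the value of λ changes between the three calculations; the formula itself stays the same.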
The Poisson distribution The expected value and variance for a Poisson random variable are both equal to the mean number of events for the time interval of interest. E (X) = λ Var (X) = λ The standard deviation is the square root of the variance. Slide 42
Exercise 3 A private clinic specializes in caring for minor injuries, colds, and flu. For the evening hours of 6-10 PM the mean number of patient arrivals is 4 per hour. What is the probability of 4 arrivals in an hour? Slide 43
Solution to Exercise 3 $P(X=4) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{4^4 e^{-4}}{4!} = .195$ Slide 44
Poisson approximation of Binomial The Poisson distribution can be used to approximate the binomial when the probability of occurrence (success) is very small (p < .05) and the number of trials is large (n > 20). We then set λ = np and apply the Poisson formula instead of the binomial formula. Some texts say to use the Poisson in place of the binomial when np < 5. Slide 45
Poisson approximation of Binomial A manufacturer claims a failure rate of 0.2% for its hard disk drives. In a consignment of 500 drives, what is the probability that none are faulty, that one is faulty, and that two are faulty? On average, 1 drive (0.2% of 500) should be faulty, so λ = np = 1. Slide 46
Poisson approximation of Binomial The probability of no faulty drives is $P(X=0) = \frac{1^0 e^{-1}}{0!} = 0.368$ The probability of one faulty drive is $P(X=1) = \frac{1^1 e^{-1}}{1!} = 0.368$ The probability of two faulty drives is $P(X=2) = \frac{1^2 e^{-1}}{2!} = 0.184$ Slide 47
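To see how close the approximation is for the hard-drive example, the Poisson values can be set beside the exact binomial probabilities. The Python sketch below is an illustrative comparison, not part of the lecture; `binomial_pmf` and `poisson_pmf` are hypothetical helper names.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean number of events is lam."""
    return lam ** x * exp(-lam) / factorial(x)

n, p = 500, 0.002  # 500 drives, 0.2% failure rate
lam = n * p        # mean number of faulty drives

exact = [binomial_pmf(x, n, p) for x in range(3)]
approx = [poisson_pmf(x, lam) for x in range(3)]

for x in range(3):
    print(x, round(exact[x], 4), round(approx[x], 4))
```

For these values the exact and approximate probabilities differ by less than .001 for x = 0, 1 and 2, which is why the shorter Poisson calculation is acceptable here.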
Exercise 4 An experienced invoice clerk makes an error once in every 100 invoices, on average. What is the probability of finding a batch of 100 invoices with at least three errors? Calculate your answer using the Poisson approximation of the Binomial. Slide 48