Favorite Distributions: Binomial, Poisson and Normal

Here we consider 3 favorite distributions in statistics:
Binomial, discovered by James Bernoulli around 1700
Poisson, a limiting form of the Binomial, found by Poisson in 1837
Normal, discovered by DeMoivre in 1733; often called the Gaussian because Gauss showed (around 1809) that errors in astronomical observations follow it with much accuracy

Binomial Distribution: Review
Toss a coin n times with probability p of heads (or do any experiment involving n successive trials with probability p of success). This discrete distribution has pdf, for k = 0, 1, 2, ..., n,
  p_X(k) = n!/(k!(n−k)!) p^k q^(n−k),  where q = 1 − p
mean = E(X) = np
variance = σ² = V(X) = npq
skewness = (q − p)/√(npq)
excess kurtosis = (1 − 6pq)/(npq)
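As a quick numerical sketch of these formulas (in Python rather than the Mathematica used in these notes), we can build the binomial pmf from the standard library and check the mean and variance formulas; the choice n = 6, p = .2 matches the histogram example below.

```python
from math import comb

def binom_pmf(k, n, p):
    """p_X(k) = n!/(k!(n-k)!) p^k q^(n-k) with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 6, 0.2
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                          # should be 1
mean = sum(k * pmf[k] for k in range(n + 1))              # should be np
var = sum((k - mean)**2 * pmf[k] for k in range(n + 1))   # should be npq

print(round(total, 10), round(mean, 10), round(var, 10))
```

The printed mean and variance agree with np = 1.2 and npq = 0.96 up to floating-point rounding.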
The histogram for the binomial distribution with n = 6, p = .2. We plotted this using Mathematica:
<<Statistics`DiscreteDistributions`
<<"BarCharts`"; <<"Histograms`"; <<"PieCharts`";
bdist = BinomialDistribution[6, .2]
v = Table[PDF[bdist, x], {x, 0, 6}]
BarChart[v]

The histogram for the binomial distribution with n = 50, p = .3. We plotted this using Mathematica:
ListPlot[Table[{k, PDF[BinomialDistribution[50, 0.3], k]}, {k, 0, 50}], Filling -> Axis]
Bernoulli Trials lead to the Binomial Distribution
1) The result of each trial may be either success or failure.
2) The probability p of success is the same in each trial.
3) The trials are independent; i.e., the outcome of one trial does not affect later outcomes.
It is rather difficult to find frequency distributions which follow the binomial with great accuracy, since most large bodies of data would not have constant probabilities.

Poisson Distribution
Limiting case of the Binomial as n → ∞ with np = λ held constant. Using the fact that
  lim_{n→∞} (1 − λ/n)^n = e^{−λ},
we see that
  lim_{n→∞, np=λ} n!/(k!(n−k)!) p^k (1 − p)^{n−k} = e^{−λ} λ^k / k!
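This limit is easy to watch happen numerically. The following Python sketch (Python is used here only for illustration; the notes use Mathematica, and λ = 2 is an arbitrary choice) compares the Bin(n, λ/n) pmf with the Poisson(λ) pmf as n grows:

```python
from math import comb, exp, factorial

lam = 2.0  # np is held fixed at lambda; the value 2.0 is just for illustration

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

# As n grows with p = lam/n, the binomial pmf approaches the Poisson pmf.
errs = []
for n in (10, 100, 1000):
    err = max(abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam)) for k in range(10))
    errs.append(err)
    print(n, err)
```

The maximum pointwise difference shrinks roughly like 1/n.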
[Figure: the binomial pdf with n = 30 and p = .13. Mathematica command:
ListPlot[Table[{k, PDF[BinomialDistribution[30, .13], k]}, {k, 0, 30}], PlotStyle -> {Thick, PointSize[Large]}]
Note 30*.13 = 3.9.]

[Figure: the Poisson pdf with λ = 3.9. Mathematica command:
ListPlot[Table[{k, PDF[PoissonDistribution[3.9], k]}, {k, 0, 30}], PlotStyle -> {Thick, PointSize[Large]}]]

The distributions look alike even though n is not very large.

Properties of the Poisson Distribution
It is a discrete distribution; i.e., Σ_{k=0}^∞ p_X(k) = 1, where p_X(k) = e^{−λ} λ^k / k!
Mean = E(X) = λ and Variance = Var(X) = λ.

Poisson Model
Count the number of occurrences of something in time 0 ≤ t ≤ T.
Divide up the interval [0, T] into n subintervals of length T/n. Poisson means:
- the probability of 2 or more events in one subinterval is ≈ 0
- events are independent
- the probability p that an event occurs in a subinterval is constant
X = total number of occurrences in time T, λ = rate of occurrence.
E(X) = np = λT and so p = λT/n. The pdf is
  p_X(k) = e^{−λT} (λT)^k / k!
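The visual similarity of the two plots above can be checked numerically. Here is a Python sketch (the notes use Mathematica; this is just an equivalent computation) comparing the Bin(30, .13) and Poisson(3.9) pmfs pointwise:

```python
from math import comb, exp, factorial

n, p = 30, 0.13
lam = n * p  # 3.9, matching the mean of the binomial

binom = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
poisson = [exp(-lam) * lam**k / factorial(k) for k in range(n + 1)]

# maximum pointwise difference between the two pmfs
max_diff = max(abs(b - q) for b, q in zip(binom, poisson))
print(round(max_diff, 4))
```

Even with n only 30, the two pmfs never differ by more than a few hundredths at any k.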
Example 4.2.13. New Zealand, the first country to let women vote. Over 113 years, we get the table below. p(k) = Poisson with λ = .3628 (the mean of the observed frequencies). The agreement is pretty good once the last 3 rows are combined.

k = yearly number of countries letting women vote | frequency | frequency proportion | p(k) | proportion with k=2,3,4 added | p(k) with k=2,3,4 added
0 | 82 | 0.726 | 0.696 |       |
1 | 25 | 0.221 | 0.252 |       |
2 |  4 | 0.035 | 0.046 | 0.053 | 0.052
3 |  0 | 0.000 | 0.006 |       |
4 |  2 | 0.018 | 0.001 |       |
total of frequency = 113; mean of frequency = 0.362831858

Intervals Between Poisson Distributed Events are Exponentially Distributed
Why? We follow the discussion in Bulmer, Principles of Statistics, p. 99. Consider radioactive disintegration, for example. If, on average, there are λ disintegrations per second, then the number of disintegrations in t seconds has mean λt. Thus, if disintegrations are Poisson, the probability of no disintegrations in t seconds is e^{−λt}. This is the same as the probability that we must wait more than t seconds before the 1st disintegration occurs. Write Y for the waiting time before the occurrence of the 1st disintegration; then Prob(Y > t) = e^{−λt}. Thus Prob(Y ≤ t) = 1 − e^{−λt}. So, taking the derivative of this last function with respect to t gives us the pdf of Y, which is λe^{−λt} = the pdf of the exponential distribution! QED
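The waiting-time fact can be checked by simulation. The Python sketch below (not part of the notes; the threshold t = 2 and the rate from the example above are chosen just for illustration) draws exponential waiting times and compares the observed fraction exceeding t with e^{−λt}:

```python
import random
from math import exp

random.seed(0)
lam = 0.3628  # rate lambda from the New Zealand example
t = 2.0       # threshold, in the same time units (arbitrary illustration value)

# Waiting times between Poisson events are exponential(lam);
# random.expovariate samples from that distribution directly.
trials = 100_000
waits = [random.expovariate(lam) for _ in range(trials)]
frac_longer = sum(w > t for w in waits) / trials

print(round(frac_longer, 3), round(exp(-lam * t), 3))
```

The two printed numbers agree to within sampling error, illustrating Prob(Y > t) = e^{−λt}.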
Example from our text, page 91. The eruptions of Mauna Loa, the 14,000 ft volcano in Hawaii with a famous observatory, have been recorded since 1832. During the time from 1832 to 1950, the rate of occurrence of eruptions was λ ≈ .027 per month. So our exponential pdf is p(y) = .027 e^{−.027y}. The agreement with the table is pretty good.

distribution of time intervals between eruptions of Mauna Loa (1832–1950)
y = interval in months | frequency | density | p(y)
[0,20)    | 13 | 0.018 | 0.027
[20,40)   |  9 | 0.013 | 0.016
[40,60)   |  5 | 0.007 | 0.009
[60,80)   |  6 | 0.008 | 0.005
[80,100)  |  0 | 0.000 | 0.003
[100,120) |  1 | 0.001 | 0.002

Normal Distribution
  f(z) = (1/√(2π)) e^{−z²/2}
DeMoivre discovered that when p = 1/2, the binomial distribution is closely approximated by the normal distribution for large n. In fact, this works for any value of p and in even greater generality (the central limit theorem). It is somewhat surprising, as the binomial is skewed when p is not 1/2. The skewness of the binomial pdf disappears when n gets large.
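The p(y) column of the table is just the exponential density evaluated at the left endpoint of each 20-month interval. A Python sketch (an assumption about how the column was computed, since the notes use Mathematica):

```python
from math import exp

lam = 0.027  # eruptions per month

def p(y):
    """Exponential pdf p(y) = lam * exp(-lam * y)."""
    return lam * exp(-lam * y)

# evaluate at the left endpoint of each 20-month interval, as in the table
for y in range(0, 120, 20):
    print(y, round(p(y), 3))
```

The printed values 0.027, 0.016, 0.009, 0.005, 0.003, 0.002 reproduce the table's p(y) column.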
Plots of Normal Densities with Mean μ = 0 and Standard Deviations σ = 1, 1/2, and 2

Mathematica command:
Plot[{PDF[NormalDistribution[0,1],x], PDF[NormalDistribution[0,.5],x], PDF[NormalDistribution[0,2],x]}, {x, -5, 5}, Filling -> Axis, PlotRange -> Full]
The curve with the small standard deviation is tall and narrow; the one with the large standard deviation is low and wide.

Properties of the Normal Distribution
The normal density
  f(x) = (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)}
is a pdf, meaning that its integral over the real line is 1.
The mean (also median and mode) = μ and the standard deviation = σ.
If X is normal with mean μ and standard deviation σ, then Y = a + bX is normal with mean a + bμ and standard deviation |b|σ.
Proofs:
1) Square the integral and change to polar coordinates.
2) For the mean use substitution; for the variance, use integration by parts to reduce to 1).
3) Use theorems from chapter 3, pages 187 and 197.
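These properties can be checked with a crude numerical integration. The Python sketch below (not from the notes; σ = 2 and the Riemann-sum step size are arbitrary choices) integrates the density over a wide interval:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 0.0, 2.0
# crude left Riemann sum over [-10 sigma, 10 sigma]; the tails beyond are negligible
dx = 0.001
xs = [-10 * sigma + i * dx for i in range(int(20 * sigma / dx))]
total = sum(normal_pdf(x, mu, sigma) * dx for x in xs)
mean = sum(x * normal_pdf(x, mu, sigma) * dx for x in xs)
var = sum((x - mu)**2 * normal_pdf(x, mu, sigma) * dx for x in xs)

print(round(total, 6), round(mean, 6), round(var, 6))
```

The sums come out close to 1, μ = 0, and σ² = 4, as the properties require.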
Example. The IQ measured by the Stanford-Binet test is approximately normally distributed with mean 100 and standard deviation 16. Find the probability of having an IQ ≤ 120.
Let X be the r.v. for IQ. To use the tables in our book, we must change to the standard normal distribution using the Z transform Z = (X − μ)/σ. Then the cumulative probability distribution is
  F(x) = Prob(X ≤ x) = Prob((X − μ)/σ ≤ (x − μ)/σ) = Φ((x − μ)/σ)
Here Φ(x) = cumulative distribution function for the standard normal distribution:
  Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt
There are tables for this function in the back of our text. Of course programs like Mathematica will compute it for you. One finds, using the table on p. 85:
  Prob(X ≤ 120) = Prob((X − 100)/16 ≤ (120 − 100)/16) = Φ(1.25) ≈ .8944
In Mathematica you don't need to normalize. Just say CDF[NormalDistribution[μ,σ]]. Mathematica computes it in terms of the error function erf(x). The cdf for a normal r.v. X with mean μ and standard deviation σ is:
  F(x) = (1/2)(erf((x − μ)/(σ√2)) + 1),  where erf(z) = (2/√π) ∫_0^z e^{−t²} dt
So to do our IQ test problem in Mathematica, we input:
  CDF[NormalDistribution[100,16],120]
and Mathematica responds: 1/2 (1 + Erf[5/(4 √2)])
We want a number, so we write:
  CDF[NormalDistribution[100,16],120]//N
and Mathematica gives us: 0.89435
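The same erf formula is available in Python's standard library, so the IQ computation can be reproduced without Mathematica (a sketch, using the same F(x) formula as above):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """Normal cdf via the same formula Mathematica uses: (erf((x-mu)/(sigma sqrt 2)) + 1)/2."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# IQ example: X ~ Normal(100, 16), Prob(X <= 120)
print(round(Phi(1.25), 4))                 # table value, about .8944
print(round(normal_cdf(120, 100, 16), 5))  # about 0.89435
```

Both routes give the same number, since (120 − 100)/16 = 1.25.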
The Central Limit Theorem. Suppose X₁, X₂, ..., Xₙ, ... is an infinite sequence of independent random variables, each having the same pdf. Suppose that the mean of the pdf is μ and the standard deviation is σ (both finite). Then for any real numbers a, b, we have
  lim_{n→∞} Prob( a ≤ (X₁ + ··· + Xₙ − nμ)/(σ√n) ≤ b ) = (1/√(2π)) ∫_a^b e^{−z²/2} dz
This theorem is due to Lindeberg. Others had proved it under more restrictive conditions. There is a proof using moment generating functions; see our text p. 341 or see Bulmer, Principles of Statistics, p. 116. For a proof using Fourier analysis, see my book, Harmonic Analysis on Symmetric Spaces and Applications, I. See also Feller, An Introduction to Probability Theory & its Applications, I, II.

The Continuity Correction
Let's take another look at the binomial distribution and its approximation by the normal distribution. Let the binomial random variable X have n = 25 and p = 1/2; i.e., 25 flips of a fair coin. We can compute the following probability using Mathematica:
  CDF[BinomialDistribution[25,1/2],14]//N
which gives Pr(X ≤ 14) = .7878.
To draw the picture:
<<Statistics`DiscreteDistributions`
<<"BarCharts`"; <<"Histograms`"; <<"PieCharts`"
bdist = BinomialDistribution[25, 0.5]
v = Table[PDF[bdist, x], {x, 0, 25}];
BarChart[v]
The bars of the histogram are centered at the integers, so we'll get a better estimate if we go up to 14.5 rather than 14 on the normal curve.
Compare the binomial with a normal random variable Y with the same mean μ = np = 25/2 = 12.5 and standard deviation σ = (np(1−p))^{1/2} = (25/4)^{1/2} = 5/2 = 2.5:
  Plot[PDF[NormalDistribution[12.5,2.5],x], {x,0,25}, Filling -> Axis, PlotRange -> Full]
Now to compute the probabilities. If you want to use the table in the back of our text, p. 85, use Z = (Y − 12.5)/2.5, noting that (14 − 12.5)/2.5 = .6 and the continuity-correction version is (14.5 − 12.5)/2.5 = .8:
  Pr(Z ≤ .6) ≈ .7257
  Pr(Z ≤ .8) ≈ .7881
Or use Mathematica without going to the standard normal:
  CDF[NormalDistribution[12.5,2.5],14]//N gives 0.725747
  CDF[NormalDistribution[12.5,2.5],14.5]//N gives 0.788145
The exact answer for the binomial distribution was Pr(X ≤ 14) = .7878. So the continuity correction was much closer.
[Figure: binomial n = 25, p = .5 vs. the normal with the same mean and standard deviation]
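All three numbers in this comparison can be reproduced with stdlib Python (a sketch of the same computation the notes do in Mathematica):

```python
from math import comb, erf, sqrt

def binom_cdf(x, n, p):
    """Pr(X <= x) for X ~ Binomial(n, p), summing the pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 25, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))  # 12.5 and 2.5

exact = binom_cdf(14, n, p)               # about .7878
plain = normal_cdf(14, mu, sigma)         # about .7257
corrected = normal_cdf(14.5, mu, sigma)   # about .7881

print(round(exact, 4), round(plain, 4), round(corrected, 4))
```

The continuity-corrected value is much closer to the exact binomial probability than the uncorrected one, as claimed.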
Claim: The normal is only a good approximation to the binomial distribution when np ≥ 5 and n(1−p) ≥ 5. In our case above, 25(.5) = 12.5.
More examples:
Example 1. n = 3, p = .1
Example 2. n = 10, p = .1
Example 3. n = 50, p = .1

What's so great about the normal distribution?
It is easy to compute: by table with the Z transform, or by computer with Mathematica or whatever.
Any sum of a large number of independent identically distributed random variables converges to it.
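The rule of thumb in the claim can be quantified for the three examples. The Python sketch below (not from the notes; "worst error" over the whole cdf is one reasonable measure among several) shows the continuity-corrected normal approximation improving as np grows from 0.3 to 1 to 5:

```python
from math import comb, erf, sqrt

def binom_cdf(x, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def worst_error(n, p):
    """Largest error of the continuity-corrected normal approximation to the binomial cdf."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return max(abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mu, sigma))
               for k in range(n + 1))

errs = [worst_error(n, 0.1) for n in (3, 10, 50)]
for n, e in zip((3, 10, 50), errs):
    print(n, n * 0.1, round(e, 4))
```

Only the n = 50 case reaches np = 5, and its worst error is several times smaller than in the other two examples.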