Bin(20,.5) and N(10,5) distributions


STAT 600 Design of Experiments for Research Workers
Lab 5 - Due Thursday, November 18

Example: Weight Loss

In a dietary study, 14 of 20 subjects lost weight. If weight is assumed to fluctuate up or down by chance, then the probability of losing weight would be $p = 1/2$.

1. Test whether the diet was effective in the sense that it resulted in more people losing weight than would have occurred by chance alone.

Answer: We are interested in testing $H_0: p = p_0$ versus $H_A: p > p_0$, where $p_0 = .5$. The normal approximation to the binomial can be used to test this hypothesis using a z test. However, before considering the z test, let's think about how this test can be done exactly, based on the binomial distribution.

Let $X$ = the number of people who lose weight, out of $n = 20$. We observed $X = 14$, for a sample proportion of $\hat{p} = x/n = 14/20 = .7$. As usual, we decide whether the population parameter $p$ is equal to the null value $p_0 = .5$ by looking at how far our estimate $\hat{p} = .7$ is from the null value of .5. Equivalently, we can look at how far

$$n\hat{p} = 20(.7) = 14 = X$$

is from

$$np_0 = 20(.5) = 10 = E(X \mid H_0: p = p_0 \text{ is true}).$$

So, to decide whether to reject $H_0$, we look at the strength of the evidence against $H_0$ (the p-value) provided by the fact that $\hat{p} = .7$ is greater than $p_0 = .5$, or, equivalently, provided by the fact that we observed $X = 14$ successes (people losing weight) when we only expected 10 under $H_0$. That is, the p-value for our test is the probability of getting $X \ge 14$ successes under the null hypothesis that $X \sim \text{Bin}(n, p_0) = \text{Bin}(20, .5)$. We know how to calculate such a probability:

$$p = P(X \ge 14) = 1 - P(X < 14) = 1 - \left[ \binom{20}{0} (.5)^0 (1 - .5)^{20-0} + \binom{20}{1} (.5)^1 (1 - .5)^{20-1} + \cdots + \binom{20}{13} (.5)^{13} (1 - .5)^{20-13} \right] = 1 - .9423 = .0577$$
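The exact binomial tail probability above can also be checked outside Minitab. The following is a minimal sketch in Python using only the standard library; the helper names `binom_pmf` and `binom_upper_tail` are illustrative and not part of the lab.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_upper_tail(x, n, p):
    """P(X >= x) for X ~ Bin(n, p), summed directly from the pmf."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

n, x, p0 = 20, 14, 0.5

# Cumulative probability of 13 or fewer successes, P(X <= 13).
cdf_13 = sum(binom_pmf(k, n, p0) for k in range(0, x))

# Exact one-sided p-value, P(X >= 14), computed two equivalent ways.
p_value = binom_upper_tail(x, n, p0)
p_value_via_cdf = 1 - cdf_13

print(f"P(X <= 13)    = {cdf_13:.4f}")
print(f"exact p-value = {p_value:.4f}")
```

Summing the upper tail directly and subtracting the cumulative probability from 1 should agree up to floating-point rounding.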

Since $p = .0577 > \alpha = .05$, we would not reject $H_0$. There is marginal evidence that the diet is effective, but the result does not quite reach significance.

The value .9423 in the last calculation, the cumulative probability of getting 13 or fewer successes from a Bin(20, .5) distribution, can be obtained from Minitab. Select Calc > Probability Distributions > Binomial... and then click "Cumulative probability", set "Number of trials:" to 20, "Probability of success:" to .5, and "Input constant:" to 13. Then click OK.

Now consider how we would use the normal approximation to estimate this p-value. Recall that we use the normal distribution with the same mean and variance as the binomial we want to approximate. That is, for large enough sample size, the $\text{Bin}(n, p)$ distribution is well approximated by the $N(np, np(1 - p))$ distribution. So $X \sim \text{Bin}(20, .5)$ has about the same distribution as $Y$, where $Y \sim N(20(.5), 20(.5)(1 - .5)) = N(10, 5)$.

Therefore, our p-value is still $p = P(X \ge 14)$, where $X \sim \text{Bin}(n, p_0) = \text{Bin}(20, .5)$, but we approximate it as

$$p = P(X \ge 14) \approx P(Y \ge 14), \quad \text{where } Y \sim N(np_0, np_0(1 - p_0)) = N(10, 5)$$

$$= P\left( \frac{Y - np_0}{\sqrt{np_0(1 - p_0)}} \ge \frac{14 - np_0}{\sqrt{np_0(1 - p_0)}} \right) = P\left( Z \ge \frac{14 - 10}{\sqrt{5}} \right), \quad \text{where } Z \sim N(0, 1)$$

$$= P(Z \ge 1.7889) = P(Z \le -1.7889) = .0368$$

This last probability can be obtained in Minitab by selecting Calc > Probability Distributions > Normal... and then click "Cumulative probability", set "Mean:" to 0, "Standard deviation:" to 1, and "Input constant:" to -1.7889. Then click OK.

Note that this normal-approximation-based p-value is slightly in error (the exactly correct value was $p = .0577$, computed above). In fact, we go from a non-significant result to a significant result as a result of the error in approximation.
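As a cross-check on the z test above, here is a short sketch using Python's standard-library `statistics.NormalDist` in place of Minitab. The variable names are illustrative assumptions, not anything defined in the lab.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 20, 14, 0.5

# Mean and standard deviation of the approximating normal, N(np0, np0(1 - p0)).
mean = n * p0
sd = sqrt(n * p0 * (1 - p0))

# Standardize the observed count (no continuity correction).
z = (x - mean) / sd

# One-sided p-value P(Z >= z), which equals P(Z <= -z) by symmetry.
std_normal = NormalDist()  # N(0, 1)
p_value = 1 - std_normal.cdf(z)
p_value_lower = std_normal.cdf(-z)

print(f"z = {z:.4f}")
print(f"normal-approximation p-value = {p_value:.4f}")
```

The two forms of the tail probability mirror the step $P(Z \ge 1.7889) = P(Z \le -1.7889)$ in the derivation.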

Also note that the normal-approximation-based p-value can be obtained more directly in Minitab. Here are the steps: select Stat > Basic Statistics > 1 Proportion... and then click "Summarized data", set "Number of trials:" to 20, and "Number of events:" to 14. Then click "Options...", set "Confidence level:" to 95.0, "Test proportion" to .5 (this is the value of $p_0$), "Alternative:" to "greater than", and place a check next to "Use test and interval based on normal distribution". Then click OK twice, and you'll get the p-value we just obtained: $p = .037$.

The normal approximation to the binomial can be improved by using what is known as a continuity correction. This correction adjusts for the fact that we are approximating the binomial, a discrete distribution, with the normal, a continuous distribution.

To understand the continuity correction, recall that the normal p.d.f. doesn't give the probability of observing any single value (that probability is 0 for a continuous distribution like the normal). Instead it gives the probability associated with a range of values. So, for instance, to estimate the probability of getting exactly $X = 14$ successes from a $\text{Bin}(n = 20, p_0 = .5)$ distribution, we would not use $P(X = 14) \approx P(Y = 14)$, where $Y \sim N(np_0, np_0(1 - p_0)) = N(10, 5)$, because $P(Y = 14) = 0$.

Below is the $\text{Bin}(n = 20, p_0 = .5)$ probability function with the $N(np_0 = 10, np_0(1 - p_0) = 5)$ p.d.f. superimposed on top.

[Figure: Bin(20, .5) and N(10, 5) distributions. Horizontal axis: x, from 0 to 20. Vertical axis: probability / probability density, from 0.0 to 0.15.]

In this plot, the vertical lines are at $0, 1, \ldots, 20$, the only possible values that $X$ can take on. Each line has height equal to the probability of that value according to the Bin(20, .5) distribution. The smooth bell curve is the N(10, 5) distribution. Clearly, this distribution follows the binomial probabilities closely.

The best normal approximation to $P(X = 14)$, say, is not $P(Y = 14)$ but instead

$$P\left(14 - \tfrac{1}{2} \le Y \le 14 + \tfrac{1}{2}\right) = P(13.5 \le Y \le 14.5).$$

Similarly, if we want to approximate our p-value, which was given by $P(X \ge 14)$, with the normal distribution, it is best to use

$$P(X \ge 14) = P(X = 14) + P(X = 15) + \cdots \approx P(13.5 \le Y \le 14.5) + P(14.5 \le Y \le 15.5) + \cdots = P(Y \ge 13.5)$$

$$= P\left( Z \ge \frac{13.5 - np_0}{\sqrt{np_0(1 - p_0)}} \right) = P\left( Z \ge \frac{13.5 - 10}{\sqrt{5}} \right) = P(Z \ge 1.565) = P(Z \le -1.565) = .0588$$

The last calculation above was done in Minitab. Notice that with this continuity correction, our approximate normal-based p-value of .0588 is much closer to the true value of $p = .0577$ we calculated directly from the binomial distribution.

Here the continuity correction for $P(X \ge x)$ involved subtracting $\tfrac{1}{2}$ from $x$. That is, we used $P(X \ge x) \approx P(Y \ge x - \tfrac{1}{2})$. Note that if we had wanted $P(X \le x)$ we would have added $\tfrac{1}{2}$. That is, we would have used $P(X \le x) \approx P(Y \le x + \tfrac{1}{2})$.

Continuity corrections generally improve the normal approximation to the binomial. However, their effect becomes negligible as the sample size $n$ grows. Minitab implements the normal approximation without the continuity correction.

Finally, note that for two-sided alternatives, the p-value is twice the one-sided p-value (unless this value turns out to be greater than 1, in which case the p-value is rounded down to 1), using either the normal approximation or the exact binomial approach.
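The effect of the continuity correction can be seen by computing the exact, uncorrected, and corrected probabilities side by side. This is a rough sketch in standard-library Python; the variable names are illustrative assumptions, and `Y` stands for the approximating N(10, 5) variable.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p0 = 20, 0.5

# Approximating normal: mean np0 = 10, variance np0(1 - p0) = 5.
Y = NormalDist(mu=n * p0, sigma=sqrt(n * p0 * (1 - p0)))

def binom_pmf(k):
    """P(X = k) for X ~ Bin(n, p0)."""
    return comb(n, k) * p0**k * (1 - p0)**(n - k)

# Single value: P(X = 14) versus P(13.5 <= Y <= 14.5).
exact_14 = binom_pmf(14)
approx_14 = Y.cdf(14.5) - Y.cdf(13.5)

# Upper tail: P(X >= 14) versus P(Y >= 14) and P(Y >= 13.5).
exact_tail = sum(binom_pmf(k) for k in range(14, n + 1))
plain_tail = 1 - Y.cdf(14)
corrected_tail = 1 - Y.cdf(13.5)

# Lower tail: P(X <= 13) versus P(Y <= 13 + 1/2).
exact_lower = 1 - exact_tail
corrected_lower = Y.cdf(13.5)

print(f"P(X = 14):  exact {exact_14:.4f}, corrected normal {approx_14:.4f}")
print(f"P(X >= 14): exact {exact_tail:.4f}, "
      f"uncorrected {plain_tail:.4f}, corrected {corrected_tail:.4f}")
print(f"P(X <= 13): exact {exact_lower:.4f}, corrected normal {corrected_lower:.4f}")
```

Note that $P(X \ge 14)$ and $P(X \le 13)$ use the same cut point 13.5, so the corrected upper and lower tails sum to 1, just as the exact ones do.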

2. Now form a 95% confidence interval for $p$, the probability of losing weight on the diet.

Answer: In the case of a confidence interval or one-sided confidence limits, it is also possible to get an exact answer using the binomial distribution. However, exactly how this is done is somewhat complicated, so we will show how to get the exact answer with Minitab and not discuss the computational details at all. We will discuss the normal approximation approach.

An approximate (normal-based) $100(1 - \alpha)\%$ CI for $p$ is given by

$$\hat{p} \pm z_{1-\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$$

In our case, we want a 95% interval, so $\alpha = .05$ and $z_{1-\alpha/2} = z_{.975} = 1.96$. In addition, $\hat{p} = .7$, so our 95% CI is

$$.7 \pm 1.96 \sqrt{.7(1 - .7)/20} = (.499, .901)$$

We can obtain this result using Minitab through the following steps: select Stat > Basic Statistics > 1 Proportion... and then click "Summarized data", set "Number of trials:" to 20, and "Number of events:" to 14. Then click "Options...", set "Confidence level:" to 95.0, "Test proportion" to .5 (this is not necessary for a confidence interval), "Alternative:" to "not equal" (for a two-sided confidence interval, rather than a one-sided confidence bound), and place a check next to "Use test and interval based on normal distribution". Then click OK twice, and you'll get the CI we just obtained: (.499, .901).

To get the exact answer, just repeat the previous steps, but do not place a check next to "Use test and interval based on normal distribution". The resulting exact 95% CI is (.457, .881).

Note that it is possible to improve on the approximate normal-based CI we just computed by using a continuity correction. With a continuity correction, a (slightly better) $100(1 - \alpha)\%$ CI for $p$ is given by

$$\hat{p} \pm \left[ z_{1-\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n} + \frac{1}{2n} \right]$$

In our problem the continuity-corrected normal-based 95% interval is

$$.7 \pm \left[ 1.96 \sqrt{.7(.3)/20} + \frac{1}{2(20)} \right] = (.474, .926)$$

This turns out to not be much better than the non-continuity-corrected interval in this problem. Again, Minitab does not implement the continuity correction.
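The three intervals above can be reproduced with a short standard-library Python sketch. The two normal-based intervals follow the formulas given in this lab. For the exact interval, the lab does not say how Minitab computes it, so the code below assumes the Clopper-Pearson construction (inverting the two binomial tail probabilities), solved here by simple bisection; all function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_ci(x, n, level=0.95, continuity=False):
    """Normal-based CI for p, optionally widened by 1/(2n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{1 - alpha/2}
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    if continuity:
        half_width += 1 / (2 * n)
    return p_hat - half_width, p_hat + half_width

def upper_tail(x, n, p):
    """P(X >= x) for X ~ Bin(n, p); increasing in p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def bisect_increasing(f, lo=0.0, hi=1.0, tol=1e-10):
    """Root of an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson_ci(x, n, level=0.95):
    """Exact CI, ASSUMING the Clopper-Pearson construction."""
    alpha = 1 - level
    # Lower limit solves P(X >= x | p) = alpha/2.
    lower = 0.0 if x == 0 else bisect_increasing(
        lambda p: upper_tail(x, n, p) - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2,
    # i.e. P(X >= x + 1 | p) = 1 - alpha/2.
    upper = 1.0 if x == n else bisect_increasing(
        lambda p: upper_tail(x + 1, n, p) - (1 - alpha / 2))
    return lower, upper

x, n = 14, 20
plain = normal_ci(x, n)
corrected = normal_ci(x, n, continuity=True)
exact = clopper_pearson_ci(x, n)

for label, (lo, hi) in [("normal", plain),
                        ("normal + continuity", corrected),
                        ("exact (Clopper-Pearson)", exact)]:
    print(f"{label:24s} ({lo:.3f}, {hi:.3f})")
```

The code uses the unrounded normal quantile rather than 1.96, so results may differ from hand calculations in the fourth decimal place.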

Exercise:

3. Ounsted (1953) presents data about cases with convulsive disorders. Among the cases there were 8 females and 118 males.

a. Compute the p-value for the test of the hypothesis that a case is equally likely to be of either sex using exact methods.
b. Compute the p-value for the test of the hypothesis that a case is equally likely to be of either sex using the normal approximation to the binomial without a continuity correction.
c. Compute the p-value for the test of the hypothesis that a case is equally likely to be of either sex using the normal approximation to the binomial with a continuity correction.
d. Compare your answers in parts a, b, c.
e. Obtain a 99% CI for p, the population proportion of convulsive cases that are male, using exact methods.
f. Obtain a 99% CI for p, the population proportion of convulsive cases that are male, using the normal approximation without a continuity correction.
g. Obtain a 99% CI for p, the population proportion of convulsive cases that are male, using the normal approximation with a continuity correction.
h. Compare your answers in parts e, f, g.

For exams in this course, I will not expect you to know how to implement the continuity correction, but at a minimum, I want you to have seen it, and to know that it exists.