STA258H5. Al Nosedal and Alison Weir. Winter 2017.

NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

Discrete Uniform Distribution

A random variable $X$ has a discrete uniform distribution if each of the $n$ values in its range, say $x_1, x_2, \ldots, x_n$, has equal probability. Then $f(x_i) = 1/n$.

Probability mass function (pmf)

[Figure: pmf y plotted against x = 1, 2, ..., 6.]

Random Variable

A random variable is a variable whose value is a numerical outcome of a random phenomenon. The probability distribution of a random variable $X$ tells us what values $X$ can take and how to assign probabilities to those values.

The Binomial setting

1. There is a fixed number $n$ of observations.
2. The $n$ observations are all independent. That is, knowing the result of one observation tells you nothing about the other observations.
3. Each observation falls into one of just two categories, which for convenience we call success and failure.
4. The probability of a success, call it $p$, is the same for each observation.

Example

Think of rolling a die $n$ times as an example of the binomial setting. Each roll gives either a six or a number different from six. Knowing the outcome of one roll doesn't tell us anything about other rolls, so the $n$ rolls are independent. If we call six a success, then $p$ is the probability of a six and remains the same as long as we roll the same die. The number of sixes we count is a random variable $X$. The distribution of $X$ is called a binomial distribution.

Binomial Distribution

A random variable $Y$ is said to have a binomial distribution based on $n$ trials with success probability $p$ if and only if

$$p(y) = \frac{n!}{y!\,(n-y)!}\, p^{y} (1-p)^{n-y}, \qquad y = 0, 1, 2, \ldots, n \quad \text{and} \quad 0 \le p \le 1.$$

$E(Y) = np$ and $V(Y) = np(1-p)$.
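As a quick numerical check of the definition above, the following R sketch evaluates the pmf directly from the formula and compares it with the built-in `dbinom`. The values n = 10 and p = 1/6 are assumed here only to match the die example; any valid n and p would do.

```r
# Illustrative values (assumed): n = 10 rolls, success probability p = 1/6.
n <- 10
p <- 1/6
y <- 0:n

# pmf from the definition: n! / (y! (n - y)!) * p^y * (1 - p)^(n - y)
pmf_manual <- choose(n, y) * p^y * (1 - p)^(n - y)

# Built-in binomial pmf, for comparison
pmf_builtin <- dbinom(y, size = n, prob = p)

total  <- sum(pmf_manual)                    # probabilities should add to 1
mean_y <- sum(y * pmf_manual)                # should equal n * p
var_y  <- sum((y - mean_y)^2 * pmf_manual)   # should equal n * p * (1 - p)

print(all.equal(pmf_manual, pmf_builtin))
print(c(total = total, mean = mean_y, variance = var_y))
print(c(np = n * p, npq = n * p * (1 - p)))
```

The last two printed vectors should agree in their mean and variance entries, which is the numerical counterpart of E(Y) = np and V(Y) = np(1 - p).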

Probability mass function when n=10 and p=1/6

    ## Pmf of Binomial with n=10 and p=1/6.
    x <- seq(0, 10, by = 1);
    y <- dbinom(x, 10, 1/6);
    plot(x, y, type = "p", col = "blue", pch = 19);

[Figures: binomial pmf with p = 1/6 for n = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200 and 300, each plotted as y against x.]

Sampling Distribution of a sample proportion

Draw a Simple Random Sample (SRS) of size $n$ from a large population that contains proportion $p$ of successes. Let $\hat{p}$ be the sample proportion of successes. Then:

$$\hat{p} = \frac{\text{number of successes in the sample}}{n}$$

The mean of the sampling distribution of $\hat{p}$ is $p$. The standard deviation of the sampling distribution is $\sqrt{p(1-p)/n}$.

Sampling Distribution of a sample proportion (cont.)

As the sample size increases, the sampling distribution of $\hat{p}$ becomes approximately Normal. That is, for large $n$, $\hat{p}$ has approximately the $N\left(p, \sqrt{p(1-p)/n}\right)$ distribution, where the second argument is the standard deviation.
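The mean and standard deviation of the sampling distribution can be checked by simulation. The sketch below draws many samples and compares the simulated mean and standard deviation of $\hat{p}$ with $p$ and $\sqrt{p(1-p)/n}$. The values n = 300 and p = 0.52, the number of replications and the seed are all assumptions chosen for illustration.

```r
set.seed(258)   # arbitrary seed, used only so the run is reproducible

# Assumed illustration values
n    <- 300     # sample size
p    <- 0.52    # population proportion of successes
reps <- 10000   # number of simulated samples

# Each simulated sample yields a count of successes; divide by n to get p-hat.
p_hat <- rbinom(reps, size = n, prob = p) / n

sim_mean  <- mean(p_hat)               # should be close to p
sim_sd    <- sd(p_hat)                 # should be close to theory_sd
theory_sd <- sqrt(p * (1 - p) / n)     # standard deviation from the formula

print(c(sim_mean = sim_mean, p = p))
print(c(sim_sd = sim_sd, theory_sd = theory_sd))

# A histogram of p_hat should look roughly bell-shaped around p.
hist(p_hat, breaks = 30, main = "Simulated sampling distribution of p-hat")
```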

Binomial with Normal Approximation

[Figure: binomial pmf with the Normal approximation overlaid.]
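A plot of this kind could be drawn along the following lines. This is only a sketch: the slide does not state which n and p were used, so n = 300 and p = 1/6 (the last case in the preceding pmf plots) are assumed here.

```r
# Assumed parameters (not stated on the slide): n = 300 trials, p = 1/6.
n <- 300
p <- 1/6
x <- 0:n

mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

pmf         <- dbinom(x, size = n, prob = p)     # exact binomial probabilities
normal_dens <- dnorm(x, mean = mu, sd = sigma)   # Normal density at the same points

# Points: binomial pmf. Curve: Normal density with matching mean and sd.
plot(x, pmf, type = "p", col = "blue", pch = 19, xlim = c(0, 250), ylab = "y")
lines(x, normal_dens, col = "red")

max_gap <- max(abs(pmf - normal_dens))   # largest pointwise discrepancy
print(max_gap)
```

The printed discrepancy gives a rough sense of how closely the curve tracks the points for this choice of n and p.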

Bernoulli Distribution (Binomial with n = 1)

$$x_i = \begin{cases} 1 & \text{if the } i\text{-th roll is a six} \\ 0 & \text{otherwise} \end{cases}$$

$\mu = E(x_i) = p$ and $\sigma^2 = V(x_i) = p(1-p)$.

Let $\hat{p}$ be our estimate of $p$. Note that $\hat{p} = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{x}$. If $n$ is large, by the Central Limit Theorem, we know that $\bar{x}$ is roughly $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$, that is, $\hat{p}$ is roughly $N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$.
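The mean and standard deviation used in the Normal approximation follow from the Bernoulli moments. A short derivation, assuming (as in the binomial setting) that the $x_i$ are independent with common success probability $p$:

```latex
\begin{align*}
E(\hat{p}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(x_i)
            = \frac{np}{n} = p, \\
V(\hat{p}) &= V\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
            = \frac{1}{n^{2}}\sum_{i=1}^{n} V(x_i)
            = \frac{np(1-p)}{n^{2}}
            = \frac{p(1-p)}{n}
            \quad \text{(by independence)}, \\
\operatorname{sd}(\hat{p}) &= \sqrt{\frac{p(1-p)}{n}}.
\end{align*}
```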

Example

In the last election, a state representative received 52% of the votes cast. One year after the election, the representative organized a survey that asked a random sample of 300 people whether they would vote for him in the next election. If we assume that his popularity has not changed, what is the probability that more than half of the sample would vote for him?

Solution (Normal approximation)

We want to determine the probability that the sample proportion is greater than 50%. In other words, we want to find $P(\hat{p} > 0.50)$. We know that the sample proportion $\hat{p}$ is roughly Normally distributed with mean $p = 0.52$ and standard deviation $\sqrt{p(1-p)/n} = \sqrt{(0.52)(0.48)/300} = 0.0288$. Thus, we calculate

$$P(\hat{p} > 0.50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{0.50 - 0.52}{0.0288}\right) = P(Z > -0.69)$$

$$= 1 - P(Z < -0.69) = 1 - P(Z > 0.69) \quad (Z \text{ is symmetric})$$

$$= 1 - 0.2451 = 0.7549.$$

If we assume that the level of support remains at 52%, the probability that more than half the sample of 300 people would vote for the representative is 0.7549.

R code (Normal approximation)

Just type in the following:

    1 - pnorm(0.50, mean = 0.52, sd = 0.0288);
    ## [1] 0.7562982

Recall that pnorm will give you the area to the left of 0.50, for a Normal distribution with mean 0.52 and standard deviation 0.0288. The small difference from 0.7549 arises because the table lookup rounds z to 0.69.

Solution (using Binomial)

We want to determine the probability that the sample proportion is greater than 50%. In other words, we want to find $P(\hat{p} > 0.50)$. We know that $n = 300$ and $p = 0.52$. Thus, we calculate

$$P(\hat{p} > 0.50) = P\left(\frac{\sum_{i=1}^{n} x_i}{n} > 0.50\right) = P\left(\sum_{i=1}^{300} x_i > 150\right) = 1 - P\left(\sum_{i=1}^{300} x_i \le 150\right) = 1 - F_Y(150)$$

(it can be shown that $Y = \sum_{i=1}^{300} x_i$ has a Binomial distribution with $n = 300$ and $p = 0.52$).

R code (using Binomial distribution)

Just type in the following:

    1 - pbinom(150, size = 300, prob = 0.52);
    ## [1] 0.7375949

Recall that pbinom will give you the CDF at 150, for a Binomial distribution with n = 300 and p = 0.52.

Solution (using continuity correction)

We have that $n = 300$ and $p = 0.52$. Thus, we calculate

$$P(\hat{p} > 0.50) = P\left(\frac{\sum_{i=1}^{n} x_i}{n} > 0.50\right) = P\left(\sum_{i=1}^{300} x_i > 150\right) = 1 - P\left(\sum_{i=1}^{300} x_i \le 150\right)$$

(it can be shown that $Y = \sum_{i=1}^{300} x_i$ has a Binomial distribution with $n = 300$ and $p = 0.52$)

$$\approx 1 - P\left(\sum_{i=1}^{300} x_i \le 150.5\right) \quad \text{(continuity correction)}$$

$$= 1 - P\left(\frac{\sum_{i=1}^{300} x_i}{n} \le \frac{150.5}{300}\right) = 1 - P(\hat{p} \le 0.5017) = 1 - P(Z \le -0.6354) \quad \text{(Why?)}$$
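One way to answer the "Why?" is to standardize $\hat{p}$ with the mean and standard deviation of its sampling distribution, using the rounded value 0.0288 from the Normal-approximation solution:

```latex
\begin{align*}
P(\hat{p} \le 0.5017)
  &= P\!\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le \frac{0.5017 - 0.52}{0.0288}\right) \\
  &= P\!\left(Z \le \frac{-0.0183}{0.0288}\right) \\
  &= P(Z \le -0.6354).
\end{align*}
```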

R code (Normal approximation with continuity correction)

Just type in the following:

    1 - pnorm(0.5017, mean = 0.52, sd = 0.0288);
    ## [1] 0.7374216

Recall that pnorm will give you the area to the left of 0.5017, for a Normal distribution with mean 0.52 and standard deviation 0.0288.

Continuity Correction

Suppose that $Y$ has a Binomial distribution with $n = 20$ and $p = 0.4$. We will find the exact probabilities that $Y \le y$ and compare these to the corresponding values found by using two Normal approximations. The first uses $X$, which is Normally distributed with $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$. The second uses $W$, a shifted version of $X$.

Continuity Correction (cont.)

For example, $P(Y \le 8) = 0.5955987$. As previously stated, we can think of $Y$ as having approximately the same distribution as $X$.

$$P(Y \le 8) \approx P(X \le 8) = P\left[\frac{X - np}{\sqrt{np(1-p)}} \le \frac{8 - 8}{\sqrt{20(0.4)(0.6)}}\right] = P(Z \le 0) = 0.5$$

Continuity Correction (cont.)

$$P(Y \le 8) \approx P(W \le 8.5) = P\left[\frac{W - np}{\sqrt{np(1-p)}} \le \frac{8.5 - 8}{\sqrt{20(0.4)(0.6)}}\right] = P(Z \le 0.2282) = 0.5902615$$
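The comparison can be repeated for every y from 0 to 20. The sketch below tabulates the exact binomial CDF next to the two Normal approximations; the column names are chosen here for illustration. The row for y = 8 should reproduce the three values worked out above.

```r
# Y ~ Binomial(n = 20, p = 0.4), as in the example above.
n <- 20
p <- 0.4

mu    <- n * p                    # mean of X
sigma <- sqrt(n * p * (1 - p))    # standard deviation of X

y <- 0:n

comparison <- data.frame(
  y         = y,
  exact     = pbinom(y, size = n, prob = p),           # P(Y <= y), exact
  normal    = pnorm(y, mean = mu, sd = sigma),         # P(X <= y), no correction
  corrected = pnorm(y + 0.5, mean = mu, sd = sigma)    # P(W <= y + 0.5), with correction
)

# Row for y = 8, the case worked out by hand
row8 <- comparison[comparison$y == 8, ]
print(round(row8, 7))

# Absolute error of each approximation at y = 8
print(c(no_correction   = abs(row8$normal - row8$exact),
        with_correction = abs(row8$corrected - row8$exact)))
```

Printing the whole `comparison` data frame shows how the two approximations behave across the full range of y.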

[Figure: CDF of Y and CDF of X (no correction); F(y) plotted for y from 0 to 20.]

[Figure: CDF of Y and CDF of W (with correction); F(y) plotted for y from 0 to 20.]

Example

Fifty-one percent of adults in the U.S. whose New Year's resolution was to exercise more achieved their resolution. You randomly select 65 adults in the U.S. whose resolution was to exercise more and ask each if he or she achieved that resolution. What is the probability that exactly forty of them respond yes?
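The slides leave this example unsolved. One possible approach, sketched below, treats the number of "yes" responses as Binomial with n = 65 and p = 0.51 (an assumption that the 65 responses are independent) and approximates P(Y = 40) by the Normal area between 39.5 and 40.5.

```r
# Assumed model: Y ~ Binomial(n = 65, p = 0.51), responses independent.
n <- 65
p <- 0.51

mu    <- n * p                    # 33.15
sigma <- sqrt(n * p * (1 - p))

# Exact probability of exactly 40 successes
exact_40 <- dbinom(40, size = n, prob = p)

# Normal approximation with continuity correction:
# P(Y = 40) is approximated by P(39.5 < X < 40.5)
approx_40 <- pnorm(40.5, mean = mu, sd = sigma) - pnorm(39.5, mean = mu, sd = sigma)

print(c(exact = exact_40, approx = approx_40))
```

Without the continuity correction the Normal approximation would give P(X = 40) = 0, which is why the interval from 39.5 to 40.5 is used.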

Example

Fifty-one percent of adults in the U.S. whose New Year's resolution was to exercise more achieved their resolution. You randomly select 65 adults in the U.S. whose resolution was to exercise more and ask each if he or she achieved that resolution. What is the probability that fewer than forty of them respond yes?
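For this version of the question, "fewer than forty" means at most 39, so the continuity-corrected cutoff is 39.5. A sketch under the same assumed model, Binomial with n = 65 and p = 0.51:

```r
# Assumed model: Y ~ Binomial(n = 65, p = 0.51), responses independent.
n <- 65
p <- 0.51

mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact: P(Y < 40) = P(Y <= 39)
exact_lt40 <- pbinom(39, size = n, prob = p)

# Normal approximation with continuity correction: P(X <= 39.5)
approx_lt40 <- pnorm(39.5, mean = mu, sd = sigma)

print(c(exact = exact_lt40, approx = approx_lt40))
```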

Normal Approximation to Binomial

Let $X = \sum_{i=1}^{n} Y_i$ where $Y_1, Y_2, \ldots, Y_n$ are iid Bernoulli random variables. Note that $X = n\hat{p}$.

1. $n\hat{p}$ is approximately Normally distributed provided that $np$ and $n(1-p)$ are greater than 5.
2. The expected value: $E(n\hat{p}) = np$.
3. The variance: $V(n\hat{p}) = np(1-p) = npq$, where $q = 1 - p$.
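The rule of thumb in item 1 is easy to check before using the approximation. The helper below is hypothetical (its name and the `threshold` argument are not from the slides) and simply tests both conditions.

```r
# Hypothetical helper: TRUE when both np and n(1 - p) exceed the threshold (5 by default).
normal_approx_ok <- function(n, p, threshold = 5) {
  (n * p > threshold) && (n * (1 - p) > threshold)
}

ok_survey <- normal_approx_ok(300, 0.52)   # np = 156,  n(1 - p) = 144
ok_small  <- normal_approx_ok(20, 0.4)     # np = 8,    n(1 - p) = 12
ok_die    <- normal_approx_ok(10, 1/6)     # np = 1.67, fails the rule

print(c(survey = ok_survey, small = ok_small, die = ok_die))
```

With these inputs the first two cases satisfy the rule and the ten-roll die example does not, which is consistent with the visibly skewed pmf for n = 10 and p = 1/6.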

Why bother with approximating?

Calculations may be less tedious. Calculations will be made easier and quicker.