Confidence Intervals for Large Sample Proportions

Dr Tom Ilvento
Department of Food and Resource Economics

Overview

Confidence Intervals (C.I.)
We will start with large sample C.I. for proportions, using the normal approximation of the binomial.
The binomial looks like a normal if n is large and p or q is not too extreme.
New terms:
  Bound of Error
  Confidence Coefficient
  Alpha (α)
The goal is to gain some sense of what a C.I. is saying.

The Pepsi Challenge

The Pepsi Challenge asked soda drinkers to compare Diet Coke and Diet Pepsi in a blind taste test.
Pepsi claimed that more than half of Diet Coke drinkers said they preferred Diet Pepsi.
Suppose we take a random sample of 100 Diet Coke drinkers and find that 56 preferred Diet Pepsi.
  p = 56/100 = .56
  q = (1 - .56) = .44
n is large (100) and neither p nor q is small, so we can use the normal approximation.
Remember, just because I see something doesn't mean it is so!

Proportions, p

p = sample proportion, and P is the population value (some books use π).
If x represents the number of successes in our sample, then our estimator of P (the population parameter) from a sample is p = x/n.
The variance of a proportion is given by $s^2 = pq$, where q = 1 - p, so $s = \sqrt{pq}$.
Note: we will think there is a population proportion, P, with variance equal to $\sigma^2$.

Standard Error for a Proportion, p

The Standard Error of the sampling distribution of a proportion is $SE_p = \sqrt{PQ/n}$.
Note: $\sigma^2 = PQ$, and $\sigma_p = \sqrt{\sigma^2/n}$.
If we don't know P and Q, we use the sample estimates, p and q.

Steps in Calculating a Confidence Interval

1. Note the sample size: n = 100
2. Calculate p and q: p = 56/100 = .56, q = 1 - .56 = .44
3. Calculate the variance and standard deviation: $s^2 = pq = (.56)(.44) = .2464$, s = .4964
4. Calculate the Standard Error: $SE_p = .4964/\sqrt{100} = .0496$, or alternatively $SE_p = \sqrt{.2464/100} = .0496$

Confidence Interval

The sample provides an estimate.
Point Estimate: a single value computed from a sample and used to estimate a value for the target population.
The sample proportion and standard deviation are point estimates of the population proportion P and the population standard deviation σ, respectively.
I would like to place a Bound of Error around the estimate: a Confidence Interval, or Interval Estimate.

Confidence Interval

I need to think of my sample as one of many possible samples.
I know from our work on the normal curve that z-values of ±1.96 bound 95 percent of the values in a normal distribution.
A z-value of 1.96 is associated with a probability of .475 on one side of the normal curve.
2 times that value yields 95% of the area under the normal curve, centered around the middle of the distribution (the mean).
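The four calculation steps above can be sketched in Python. This is a minimal illustration rather than part of the original slides; the function name `standard_error_steps` is hypothetical, and the inputs are the Pepsi Challenge counts (56 of 100).

```python
import math


def standard_error_steps(x, n):
    """Follow the four steps for x successes out of n observations."""
    p = x / n                 # step 2: sample proportion
    q = 1 - p                 # step 2: complement
    variance = p * q          # step 3: s^2 = pq
    sd = math.sqrt(variance)  # step 3: s = sqrt(pq)
    se = sd / math.sqrt(n)    # step 4: SE_p = s / sqrt(n)
    return p, q, variance, sd, se


p, q, variance, sd, se = standard_error_steps(56, 100)
print(round(variance, 4), round(sd, 4), round(se, 4))  # 0.2464 0.4964 0.0496
```

Dividing s by the square root of n and taking the square root of pq/n are the same calculation, which is why the slide's two versions of step 4 agree.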

Confidence Interval for the Pepsi Challenge

If I think of my sample as part of the sampling distribution, I can place 1.96 standard errors around my estimate.
Like this for a 95% C.I.:
  .56 ± 1.96(.0496)
  .56 ± .097
  .463 to .657
Notice that values less than .5 are in this interval: the population value P could be less than .5.

Why did I use the Standard Error in the formula?

I am asking a question about the proportion of Diet Coke drinkers who prefer Pepsi.
I want some sense of how well my sample estimates the population.
If my sample is drawn randomly, it will represent the population, plus some sampling error.
A 95% Confidence Interval means that if I had taken all possible samples and calculated a confidence interval for the proportion for each one, 95% of them would have contained the true population parameter.

What is a Confidence Interval?

It is an interval estimate of a population parameter.
The plus or minus part is also known as a Bound of Error (BOE) or Margin of Error (MOE).
It is placed in a probability framework.
Like this for a 95% C.I.:
  .56 ± 1.96(.0496)
  .56 ± .097
  .463 to .657

What is a Confidence Interval?

We calculate the probability that the estimation process will result in an interval that contains the true value of the population proportion or mean.
If we had repeated samples, most of the C.I.s would contain the population parameter.
But not all of them will!
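The "repeated samples" interpretation can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not something from the slides: it assumes a true population proportion of .5, draws 2,000 samples of size 100, and counts how often the 95% interval contains the true value. The names `wald_interval` and `coverage` are hypothetical.

```python
import math
import random


def wald_interval(x, n, z=1.96):
    """Normal-approximation interval: p +/- z * sqrt(pq/n)."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


def coverage(true_p, n, reps, seed=0):
    """Share of simulated intervals that contain the assumed true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # One sample: count successes in n independent draws.
        x = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(x, n)
        if low <= true_p <= high:
            hits += 1
    return hits / reps


low, high = wald_interval(56, 100)  # the Pepsi Challenge interval
rate = coverage(true_p=0.5, n=100, reps=2000)
print(f"{low:.3f} to {high:.3f}; share of intervals containing P: {rate:.3f}")
```

The share should land near .95 but not exactly on it, partly because of simulation noise and partly because the normal approximation is only approximate for a binomial count.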

Think of this like the Jart game (only backwards)

Jarts was a backyard game in the 1970s and 80s (aka Lawn Darts).
You placed a ring on the ground and tried to throw a giant dart into the ring, somewhat like horseshoes.
The darts were sharp and some people got hurt!
But let's rethink this game: throw rings around a fixed Jart.
The Jart is the population parameter and the rings are confidence intervals.
Some rings will miss, but most will capture it.

To construct a Confidence Interval, we need

A point estimator
  The estimator of P is p = x/n.
A sample and a sample estimate using the estimator
  p from a sample of n observations.
Knowledge of the sampling distribution of the point estimator: the Standard Error of the estimator and the form of the sampling distribution
  The sampling distribution is known, with mean = P and $SE_p = \sqrt{PQ/n}$, using the normal approximation of the binomial.
A probability level we are comfortable with (how much certainty), also called the Confidence Coefficient
  Most times we will use a .90, .95, or .99 Confidence Coefficient.
A level of error α, which is the chance of being wrong

Confidence Interval for a Proportion

Formula for a C.I. for a proportion p:

$p \pm Z_{\alpha/2} \sqrt{p(1-p)/n}$

We are using the normal approximation to the binomial distribution and the sample estimates of p and q.
Assumption: a sufficiently large random sample of size n is selected from the population.

Confidence Interval

$p \pm Z_{\alpha/2} \sqrt{p(1-p)/n}$

$Z_{\alpha/2}$ refers to the z-score associated with the error level α divided by 2.
α refers to the area in the tails of the distribution.
We divide by 2 because we divide α equally on both sides of the mean, which means α represents the combined area, or probability, in the tails on both sides of the normal curve.
The 95% part is divided evenly around the center of the distribution, and the 5% part, α, is distributed evenly in the tails.
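The relationship between the Confidence Coefficient, α, and the z-score can be sketched with Python's standard library. This is an illustration, not the slides' method (the slides read z from a normal table); `z_alpha_over_2` and `proportion_ci` are hypothetical names, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def z_alpha_over_2(confidence):
    """z-score that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_ci(p, n, confidence=0.95):
    """p +/- Z_{alpha/2} * sqrt(p(1-p)/n), the large-sample formula."""
    z = z_alpha_over_2(confidence)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


z95 = z_alpha_over_2(0.95)
low, high = proportion_ci(0.56, 100)
print(f"z = {z95:.3f}, interval = {low:.3f} to {high:.3f}")
```

For a .95 Confidence Coefficient, α is .05, so the function looks up the z-score with .975 of the area to its left, which is the familiar 1.96 to two decimals.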

Confidence Interval

The larger the Confidence Coefficient or probability level for a C.I.:
  the smaller the value of α, and α/2
  the larger the z value

$p \pm Z_{\alpha/2} \sqrt{p(1-p)/n}$

Confidence Coefficient (1-α)*100    α       α/2      $Z_{\alpha/2}$
90%                                 0.10    0.05     1.645
95%                                 0.05    0.025    1.96
99%                                 0.01    0.005    2.575

For any given sample size, the width of the Confidence Interval depends upon α.
For the Pepsi Challenge example:
  90% C.I.  .56 ± 1.645(.0496) = .56 ± .0816
  95% C.I.  .56 ± 1.96(.0496) = .56 ± .0972
  99% C.I.  .56 ± 2.575(.0496) = .56 ± .1277
For any given sample size, if you want to be more certain (smaller α) you have to accept a wider interval.

A problem for you to try

A survey questionnaire asked citizens who they would vote for in a state election.
1,052 adults selected randomly were surveyed by a major newspaper.
The percentage who indicated Candidate B was 35%.
Construct a 95% C.I. for this proportion.

The Solution

The facts: p = .35, q = (1 - .35) = .65, n = 1,052
Standard Error: $s_p = \sqrt{(.35)(.65)/1052} = .0147$
C.I.:
  .35 ± 1.96(.0147)
  .35 ± .0288
  .3212 to .3788
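The trade-off between certainty and width, and the survey exercise, can be reproduced with a few lines of Python. This is a sketch, not part of the original slides; `margin_of_error` is a hypothetical helper and `Z_TABLE` holds the z values from the table above. Because the code keeps the unrounded standard error, its margins can differ from the slide figures in the fourth decimal place.

```python
import math

# z values taken from the slide's table (Confidence Coefficient -> Z_{alpha/2}).
Z_TABLE = {0.90: 1.645, 0.95: 1.96, 0.99: 2.575}


def margin_of_error(p, n, z):
    """Bound of Error: z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)


# Pepsi Challenge: same sample, three Confidence Coefficients.
pepsi = {conf: margin_of_error(0.56, 100, z) for conf, z in Z_TABLE.items()}
for conf, moe in pepsi.items():
    print(f"{conf:.0%} C.I.: .56 +/- {moe:.4f}")

# Survey exercise: p = .35, n = 1,052, 95% C.I.
survey_moe = margin_of_error(0.35, 1052, Z_TABLE[0.95])
print(f"survey: {0.35 - survey_moe:.4f} to {0.35 + survey_moe:.4f}")
```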

Newspaper MOE

The newspaper said there is a 3.0% Margin Of Error. Where did this figure come from?
It doesn't match our previous figure of 2.88%.
And what does MOE mean?
They calculated a general C.I. for a proportion at .5:
  Standard Error = $\sqrt{(.5)(.5)/1052} = .0154$
  C.I.  .5 ± 1.96(.0154)
  .5 ± .0302, or 3%

Variance is largest at p = .5

For a proportion, the variance is largest at .5, or an equal split.
  At .5, $s^2 = (.5)(.5) = .25$
  At .7, $s^2 = (.7)(.3) = .21$
  At .3, $s^2 = (.3)(.7) = .21$
Which brings up another unique thing about proportions: once you specify a value of p for the population, the variance ($\sigma^2$) is known.

Summary

Confidence Intervals are a way to place a Bound of Error around our estimate, in a probability framework.
We need:
  an estimator for P, p = x/n
  a sample estimate (p)
  knowledge of the sampling distribution (the normal distribution) and a standard error
  the level of alpha, the area in the tails
For Confidence Intervals for proportions, we use the normal approximation of the binomial as long as the sample size is sufficiently large and p (or q) is not too small.
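The newspaper's 3% figure and the claim that the variance peaks at .5 can both be checked numerically. The sketch below is illustrative only; `margin_of_error` is a hypothetical helper, and the grid of proportions from .1 to .9 is an arbitrary choice for the comparison.

```python
import math


def margin_of_error(p, n, z=1.96):
    """95% Bound of Error by default: z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)


n = 1052
conservative = margin_of_error(0.5, n)  # newspaper's general MOE at p = .5
observed = margin_of_error(0.35, n)     # MOE at the observed p = .35

# Variance pq over a grid of proportions; the largest value sits at .5.
grid = [i / 10 for i in range(1, 10)]
variances = {p: p * (1 - p) for p in grid}
widest = max(variances, key=variances.get)

print(f"MOE at .5: {conservative:.4f}, MOE at .35: {observed:.4f}")
print(f"variance is largest at p = {widest}")
```

Reporting the margin at p = .5 gives one figure that is at least as wide as the margin for any proportion in the same survey, which is a plausible reason for a newspaper to quote it.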