Stat 226 Introduction to Business Statistics I, Spring 2009
Professor: Dr. Petrutza Caragea
Section A, Tuesdays and Thursdays 9:30-10:50 a.m.

Chapter 6, Section 6.1: Confidence Intervals

Confidence Intervals

Sample means vary in value and form a sampling distribution in which not all samples result in $\bar{x}$-values equal to the population mean µ. We should not expect a sample mean $\bar{x}$ (based on one specific sample) to be exactly equal to the population mean µ. However, we can expect the point estimate to be fairly close in value to the population mean for a sufficiently large sample size (the sampling distribution becomes approximately normal for large sample sizes).

Recall the 68-95-99.7 rule: approximately 95% of all observations from a normal distribution fall within ±2 standard deviations of the mean.

If the sample size n is large enough, the sampling distribution of the sample mean is approximately normal. Our point estimate $\bar{x}$ will hardly ever be exactly equal to the population mean µ, but most likely (about 95% of the time) it will fall within 2 standard deviations of µ.

Using this concept, we can construct so-called confidence intervals. We know that $\bar{x}$ follows a normal distribution with mean µ and standard deviation $\sigma/\sqrt{n}$, i.e.,

$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$

Therefore, we can anticipate that approximately 95% of all random samples of size n from some population with unknown µ and known σ produce sample means $\bar{x}$ that fall between $\mu - 2\frac{\sigma}{\sqrt{n}}$ and $\mu + 2\frac{\sigma}{\sqrt{n}}$.

This interval

$\left(\mu - 2\frac{\sigma}{\sqrt{n}}\ ;\ \mu + 2\frac{\sigma}{\sqrt{n}}\right)$

is based on the 68-95-99.7 rule. We know from Chapter 1 that the actual z-score corresponding to the middle 95% is z = 1.96, so more precisely we have

$\left(\mu - 1.96\frac{\sigma}{\sqrt{n}}\ ;\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right)$

We are going to use z = 1.96 in the future when constructing a 95% confidence interval.
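The difference between the rule-of-thumb value 2 and the more precise 1.96 can be checked numerically. The following is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of Table A; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of a normal distribution lying within 2 standard deviations of the mean.
coverage_2sd = std_normal.cdf(2.0) - std_normal.cdf(-2.0)

# z-score that leaves 2.5% in each tail, i.e. encloses the middle 95%.
z_95 = std_normal.inv_cdf(0.975)

print(f"within +/- 2 sd: {coverage_2sd:.4f}")   # 0.9545
print(f"z for middle 95%: {z_95:.4f}")          # 1.9600
```

So ±2 standard deviations actually covers slightly more than 95%, which is why 1.96 is used for an exact 95% interval.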
Example: ACT scores follow N(µ, 5.9); let's take samples of size n = 76.

Approximately 95% of all samples of size 76 will produce sample means between $\mu - 1.96\frac{5.9}{\sqrt{76}}$ and $\mu + 1.96\frac{5.9}{\sqrt{76}}$.

It can be shown that this concept can be reversed in the following sense: for approximately 95% of all samples of size 76, the interval $\bar{x} \pm 1.96\frac{5.9}{\sqrt{76}}$ contains the unknown µ.

Definition of a Confidence Interval

Confidence Intervals (short: CI)

A confidence interval for the unknown population mean µ is an interval (or range) of plausible values for µ. It is constructed such that, with a chosen degree (or level) of confidence C, the value of the unknown population mean will be captured inside the interval.

For each confidence interval we have a confidence level C:
- C provides information on how much confidence we can have in the method used to construct the CI
- usual choices for C are 90%, 95%, and 99%
- C can be interpreted as the rate of success, in the long run, of the method used to construct the CI
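The "rate of success in the long run" reading of C can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true mean `MU = 21.0` is a hypothetical value chosen for illustration (in practice µ is unknown), the population is assumed normal, and σ = 5.9 and n = 76 are taken from the ACT example.

```python
import math
import random

random.seed(226)  # fixed seed so the run is repeatable

MU = 21.0            # hypothetical "true" mean (assumption; unknown in practice)
SIGMA = 5.9          # population standard deviation from the ACT example
N = 76               # sample size from the ACT example
Z = 1.96             # critical value for 95% confidence
NUM_SAMPLES = 10_000 # number of simulated samples (arbitrary choice)

margin = Z * SIGMA / math.sqrt(N)

hits = 0
for _ in range(NUM_SAMPLES):
    # Draw one random sample and compute its mean.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    # Count the sample as a success if its interval captures MU.
    if xbar - margin <= MU <= xbar + margin:
        hits += 1

coverage = hits / NUM_SAMPLES
print(f"margin of error: {margin:.3f}")
print(f"share of intervals containing mu: {coverage:.3f}")
```

The share of intervals that capture `MU` should come out close to 0.95, even though any single interval either contains the mean or does not.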
A level C confidence interval for population mean µ

For a sufficiently large sample size n (so that the CLT applies and $\bar{x}$ approximately follows a normal distribution), or for a population that is already normally distributed, the general formula for a level C confidence interval for the population mean µ when σ is known is given by

$\left(\bar{x} - z\frac{\sigma}{\sqrt{n}}\ ;\ \bar{x} + z\frac{\sigma}{\sqrt{n}}\right)$, i.e. in short notation $\bar{x} \pm z\frac{\sigma}{\sqrt{n}}$

The desired level of confidence C determines which critical value z is used. The three most commonly used confidence levels, 90%, 95%, and 99%, use critical values 1.645, 1.96, and 2.575, respectively. Use Table A to find z.

Example: 99% level of confidence
A 99% confidence interval is constructed such that, in the long run, it is successful in capturing the true unknown population mean 99% of the time.

Finding the critical value z for a level C confidence interval:
More precisely, we have that C = (1 − α) · 100%. The relevant number is called α; it measures the difference between the desired level of confidence and certainty (i.e. 100%).

Example:

Example: A random sample of size n = 25 from last semester's heights data yielded a sample mean of $\bar{x}$ = 69.36. We know the population standard deviation is σ = 4.004. Find a 90% confidence interval for the unknown population mean µ.
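The heights example can be worked through in a few lines of code. The sketch below assumes the critical values quoted on the slide (1.645, 1.96, 2.575) rather than exact quantiles; the function name `z_interval` and the dictionary `Z_TABLE` are illustrative, not part of the course material.

```python
import math

# Critical values as listed on the slide (Table A values).
Z_TABLE = {90: 1.645, 95: 1.96, 99: 2.575}

def z_interval(xbar, sigma, n, level):
    """Return (lower, upper, margin) for a z confidence interval with known sigma."""
    z = Z_TABLE[level]
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

lower, upper, margin = z_interval(xbar=69.36, sigma=4.004, n=25, level=90)
print(f"90% CI: ({lower:.4f}, {upper:.4f}), margin {margin:.4f}")
# 90% CI: (68.0427, 70.6773), margin 1.3173
```

Calling `z_interval(69.36, 4.004, 25, 95)` or with `level=99` gives the corresponding 95% and 99% intervals for the same sample.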
What about a 95% confidence interval?

Why settle for a 90% CI or 95% CI when we can construct 99% CIs? The higher level of confidence comes with a price tag: the resulting interval is wider than the 90% or 95% confidence interval.

99% CI: z = 2.575

$69.36 \pm 2.575 \cdot \frac{4.004}{\sqrt{25}} = 69.36 \pm 2.06206 = (67.29794,\ 71.42206)$

The width of any confidence interval is given by

$\text{width} = \text{upper bound} - \text{lower bound} = 2z\frac{\sigma}{\sqrt{n}}$

In the previous 3 examples, the width of the corresponding CIs was
90%: 2 · 1.317316 = 2.634632
95%: 2 · 1.569568 = 3.139136
99%: 2 · 2.06206 = 4.12412

Handout on simulated confidence intervals
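The three widths for the heights example can be computed side by side to see how the interval grows with the confidence level. This is a minimal sketch using the slide's critical values; the names `CRITICAL_VALUES` and `widths` are illustrative.

```python
import math

XBAR, SIGMA, N = 69.36, 4.004, 25   # heights example from the slides
CRITICAL_VALUES = {"90%": 1.645, "95%": 1.96, "99%": 2.575}

widths = {}
for level, z in CRITICAL_VALUES.items():
    margin = z * SIGMA / math.sqrt(N)
    widths[level] = 2 * margin       # width = upper bound - lower bound
    print(f"{level}: ({XBAR - margin:.5f}, {XBAR + margin:.5f}), "
          f"width {widths[level]:.5f}")
```

With the sample fixed, only z changes between the three rows, so the width ordering follows directly from the ordering of the critical values.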
Interpretation of Confidence Intervals

Referring to the handout on the 100 simulated confidence intervals, we can take away the following facts:

1. We can be C% confident that the unknown population mean µ falls in the constructed level C confidence interval, i.e. between the lower and upper CI bound for a specific calculated example.
2. If we took repeated samples, approximately C% of the confidence intervals constructed from those samples would include the unknown population mean µ in the long run.
3. The interpretation of a CI is always in terms of the unknown population mean µ and never in terms of the sample mean $\bar{x}$. The sample mean $\bar{x}$, the center of every CI, is always included in the CI by default.

Be careful: Before we take a sample from a population, we can say there is a C% chance (e.g. a 95% chance) that our confidence interval will include the population parameter µ if we plan on constructing C% confidence intervals (e.g. 95% CIs). Once we have taken the sample, this decision is made. Our interval either does contain µ or it does not. We just don't know which. There is no longer a C% chance; all we can say is that we are C% confident (e.g. 95% confident).

We saw that the two properties of a high level of confidence and a narrow (precise) CI work against each other. The higher the level of confidence, the wider the confidence interval, and therefore the less precision we have in estimating the unknown µ.

Remedy: If we need a certain level of confidence but also a specific precision, we can increase the sample size n. If n goes up, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ will go down!
As a result, we get a narrower interval with more precision.

Margin of error

The quantity $m = z\frac{\sigma}{\sqrt{n}}$ is referred to as the margin of error. Changing one of the three components z, σ, or n in the margin of error has the following impact on the width of the confidence interval:

1. level of confidence C = (1 − α) · 100%: will change z
2. sample size n: will change the standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
3. population standard deviation σ

Sample size calculations

If we want both a high level of confidence and a small margin of error (i.e. a narrow confidence interval), we need to take a sample of size

$n = \left(\frac{z\,\sigma}{m}\right)^2$

n rarely corresponds to an integer, so we always need to round up to the next largest integer. Why next largest? If we rounded down, the corresponding confidence interval would no longer have the desired margin of error, but a slightly larger one!

Example: What sample size should be used to estimate the mean age of workers in a large factory to within 1 year at a 95% level of confidence if the standard deviation for the variable age is known to be 3.5?

Assumptions for Confidence Intervals

Necessary Assumptions for Constructing CIs

1. The sampling distribution of $\bar{x}$ has to follow at least approximately a normal distribution, i.e. either the sample size is large enough for the CLT to apply if the population we sample from does not follow a normal distribution, or the population we sample from follows a normal distribution.
2. The sample taken has to be a simple random sample.
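Returning to the sample size example above (mean age of factory workers, σ = 3.5, margin of error 1 year, 95% confidence), the calculation can be sketched as follows. The function name `required_sample_size` is illustrative; the rounding uses `math.ceil` so that the result is always rounded up.

```python
import math

def required_sample_size(z, sigma, margin):
    """Return (unrounded n, n rounded up) for a desired margin of error."""
    raw = (z * sigma / margin) ** 2
    return raw, math.ceil(raw)

raw_n, n = required_sample_size(z=1.96, sigma=3.5, margin=1.0)
print(f"unrounded n = {raw_n:.4f}, use n = {n}")

# Compare the margin of error achieved when rounding up versus rounding down.
margin_up = 1.96 * 3.5 / math.sqrt(n)
margin_down = 1.96 * 3.5 / math.sqrt(n - 1)
print(f"margin with n = {n}: {margin_up:.4f}")
print(f"margin with n = {n - 1}: {margin_down:.4f}")
```

Rounding down to 47 workers would give a margin of error slightly above 1 year, which is why the slide insists on rounding up to 48.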
Worksheets