Chapter 8 Statistical Intervals for a Single Sample


Part 1: Confidence intervals (CI) for population mean $\mu$
Section 8-1: CI for $\mu$ when $\sigma^2$ known & drawing from normal distribution
Section 8-1.2: Sample size calculation for estimating $\mu$ with specified error, $\sigma^2$ known
Section 8-2: CI for $\mu$ when $\sigma^2$ unknown & drawing from normal distribution

Confidence Intervals
We ended the last chapter with a phrase: "Moving beyond point estimates".
Point estimates are a good start, but we should also give the client some idea of the confidence in our estimate.
More data gives more information. We will have more confidence in an estimate for $\mu$ from an $n = 50$ sample than in an estimate from an $n = 3$ sample.
We express this confidence with an interval around the point estimate. The confidence in an estimate is related to the size (or width) of such an interval.

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
We use the observed $\bar{x}$ as the point estimate for $\mu$.
We provide a two-sided CI for $\mu$ as a window or interval in which we are fairly confident the unknown population mean $\mu$ lies.
$\bar{x}$ will be at the center of our two-sided CIs: $[\bar{x} - \text{cushion},\ \bar{x} + \text{cushion}]$.
For example, suppose $\bar{x} = 8$ and our cushion is 3. The interval is then $[5, 11]$, centered at $\bar{x} = 8$.

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
We want to have high confidence that our interval contains $\mu$.
How do we choose this $\pm$ cushion (that is, the length of our interval) so that we have high confidence that it contains $\mu$?
We use the behavior (or probability distribution) of $\bar{X}$,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{for any sample size } n,$$
to form our CI in such a way that we can say something very powerful, like: "We are 95% confident that the true mean $\mu$ falls in this interval."

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
Right now, we are estimating $\mu$ and we say that we know $\sigma^2$. This is perhaps not terribly realistic, but we will loosen this up later.
Let $X_1, X_2, \ldots, X_n$ be a random sample drawn from a normal distribution with $X_i \sim N(\mu, \sigma^2)$ for all $i$. Then
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
Using this probability distribution, we have
$$P\left(-z_{0.025} \le Z \le z_{0.025}\right) = 0.95$$
$$P\left(-z_{0.025} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{0.025}\right) = 0.95$$
where $z_{0.025}$ is the 97.5th percentile of the standard normal (next slide).

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
$$P\left(-z_{0.025} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{0.025}\right) = 0.95$$
Manipulating what's inside the parentheses gives us the upper and lower end-points for our 95% CI for $\mu$:
$$P\left(\bar{X} - z_{0.025}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{0.025}\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
NOTATION: $z_{0.025}$ is the z-value such that 97.5% of the distribution is below it and 2.5% is above it (an upper-tail z-value).
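
The manipulation inside the parentheses can be spelled out step by step. The following is a sketch of the algebra, using only the symbols already defined on this slide; the one fact it relies on is that $\sigma/\sqrt{n} > 0$, so multiplying by it keeps the inequality directions, while multiplying by $-1$ reverses them.

```latex
\begin{align*}
-z_{0.025} \;\le\; \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \;&\le\; z_{0.025} \\
-z_{0.025}\frac{\sigma}{\sqrt{n}} \;\le\; \bar{X} - \mu \;&\le\; z_{0.025}\frac{\sigma}{\sqrt{n}}
  && \text{(multiply by } \sigma/\sqrt{n} > 0\text{)} \\
-\bar{X} - z_{0.025}\frac{\sigma}{\sqrt{n}} \;\le\; -\mu \;&\le\; -\bar{X} + z_{0.025}\frac{\sigma}{\sqrt{n}}
  && \text{(subtract } \bar{X}\text{)} \\
\bar{X} - z_{0.025}\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;&\le\; \bar{X} + z_{0.025}\frac{\sigma}{\sqrt{n}}
  && \text{(multiply by } -1 \text{ and reorder)}
\end{align*}
```

Each line describes the same event, so the probability stays at 0.95 throughout.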

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
We can state the lower and upper end-points of the 95% CI for $\mu$ from a random sample of size $n$ drawn from a normally distributed population with variance $\sigma^2$ and sample mean $\bar{x}$ as:
Lower end-point: $L = \bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}}$
Upper end-point: $U = \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}}$
NOTE: $\bar{x}$ lies in the center of the two-sided confidence interval.
95% CI for $\mu$ when $\sigma^2$ known and drawing from a normally distributed population:
$$\bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}}$$
Or...
$$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$$

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
Example (Fill weights of boxes)
The sample mean for the fill weights of 100 boxes is $\bar{x} = 12.050$. The population variance of the fill weights is known to be $(0.100)^2$. Find a 95% confidence interval for the population mean fill weight $\mu$ of the boxes.
ANS:
$$L = \bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}} = 12.050 - 1.96 \cdot \frac{0.100}{\sqrt{100}} = 12.030$$
$$U = \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}} = 12.050 + 1.96 \cdot \frac{0.100}{\sqrt{100}} = 12.070$$
The 95% confidence interval for $\mu$ is $[12.030, 12.070]$. We are 95% confident that the true parameter value lies in this interval.
NOTE: Because $\sigma^2$ was very small and $n$ was fairly large, we have a very narrow confidence interval for $\mu$ (which is good).
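
The arithmetic in this example can be checked with a short Python sketch. The function name `z_confidence_interval` and its parameters are illustrative choices, not part of the slides; the critical value comes from the standard library's `statistics.NormalDist` rather than a printed z-table, so it is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, conf=0.95):
    """Two-sided CI for the mean of a normal population with known sigma.

    Hypothetical helper for illustration: returns (lower, upper).
    """
    alpha = 1.0 - conf
    # z_{alpha/2}: the z-value with alpha/2 in the upper tail of N(0, 1).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)  # the "cushion" on each side of xbar
    return xbar - margin, xbar + margin


# Fill-weight example: xbar = 12.050, sigma = 0.100, n = 100.
lower, upper = z_confidence_interval(12.050, 0.100, 100, conf=0.95)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Rounded to three decimals, the end-points agree with the hand calculation.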

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
CI for any choice of confidence level, or $100(1-\alpha)\%$ confidence
The confidence level of choice is stated as $100(1-\alpha)\%$. For a 95% confidence interval, $\alpha = 0.05$. For an 80% confidence interval, $\alpha = 0.20$.
We can re-write the earlier Z probability as
$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
and this leads to the $100(1-\alpha)\%$ confidence interval for $\mu$:
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
In a two-sided confidence interval, the $\alpha$ amount is split between the two tails, thus we see $\alpha/2$, or specifically $z_{\alpha/2}$, in the formula.

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
$100(1-\alpha)\%$ Confidence interval on the mean, variance known
If $\bar{x}$ is the sample mean of a random sample of size $n$ from a normal population with known variance $\sigma^2$, a $100(1-\alpha)\%$ confidence interval for $\mu$ is given by
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ represents the z-value from the standard normal distribution with $\alpha/2$ in the upper tail (e.g. if $\alpha = 0.05$, $z_{\alpha/2} = z_{0.025} = 1.96$).
Commonly used z scores:
Conf. Level   $\alpha$   $\alpha/2$   $z_{\alpha/2}$
90%           0.10       0.05         1.645
95%           0.05       0.025        1.96
99%           0.01       0.005        2.576
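
The table of commonly used z scores can be reproduced numerically. This is a minimal sketch; `z_upper` is a hypothetical helper name, and it relies only on `statistics.NormalDist.inv_cdf` from the Python standard library.

```python
from statistics import NormalDist


def z_upper(tail_area):
    """z-value with `tail_area` probability in the upper tail of N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - tail_area)


# For each confidence level, alpha is split evenly between the two tails.
for conf in (0.90, 0.95, 0.99):
    alpha = 1.0 - conf
    print(f"{conf:.0%}  alpha={alpha:.2f}  alpha/2={alpha / 2:.3f}  "
          f"z={z_upper(alpha / 2):.3f}")
```

The same helper gives $z_{0.10} \approx 1.28$ for an 80% interval via `z_upper(0.10)`.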

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
Example (Fill weights of boxes)
The sample mean for the fill weights of 100 boxes is $\bar{x} = 12.050$. The population variance of the fill weights is known to be $(0.100)^2$. Find an 80% confidence interval for the population mean fill weight $\mu$ of the boxes.
ANS:
$$L = \bar{x} - z_{0.10}\frac{\sigma}{\sqrt{n}} = 12.050 - 1.28 \cdot \frac{0.100}{\sqrt{100}} = 12.037$$
$$U = \bar{x} + z_{0.10}\frac{\sigma}{\sqrt{n}} = 12.050 + 1.28 \cdot \frac{0.100}{\sqrt{100}} = 12.063$$
The 80% confidence interval for $\mu$ is $[12.037, 12.063]$. We are 80% confident that the true parameter value lies in this interval.
NOTE: Because $\sigma^2$ was very small and $n$ was fairly large, we have a very narrow confidence interval for $\mu$ (which is good).

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
Compare the 80% and 95% confidence intervals:
The 80% confidence interval for $\mu$ is $[12.037, 12.063]$ (the width of this interval is 0.026).
The 95% confidence interval for $\mu$ is $[12.030, 12.070]$ (the width of this interval is 0.040).
The 95% CI is wider, i.e. all else being held constant, if you want to be more confident you capture $\mu$, you'll have to make your net bigger.

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
Looking at the form of the confidence interval:
$$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
$\bar{x}$: sample mean
$z_{\alpha/2}$: multiplier based on confidence level (changes for different % CI)
$\sigma/\sqrt{n}$: value based on $\sigma$ and sample size (standard error of the sample mean)

Confidence Interval for $\mu$ - Normal parent population, known $\sigma^2$
More narrow CIs are desirable. How can this be achieved?
$$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
$\bar{x}$: sample mean
$z_{\alpha/2}$: multiplier based on confidence level
$\sigma/\sqrt{n}$: value based on $\sigma$ and sample size
Increase your sample size? (Good idea if possible.)
Decrease $\sigma$? (Not an option, it's fixed by the original distribution.)
Decrease your confidence level? (Not a great idea. You reduce the CI width, but you're less likely to capture $\mu$.)
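
To make the trade-offs concrete, the sketch below computes the full CI width $2\,z_{\alpha/2}\,\sigma/\sqrt{n}$ for a few sample sizes and confidence levels. The helper name `ci_width` and the particular sample sizes are assumptions chosen for illustration; $\sigma = 0.100$ is borrowed from the fill-weight example.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, conf=0.95):
    """Full width (upper minus lower) of the known-sigma z-interval."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


# Width shrinks like 1/sqrt(n): quadrupling n halves the width.
for n in (25, 100, 400):
    print(f"n={n:4d}  95% width={ci_width(0.100, n):.4f}")

# Lowering the confidence level also narrows the interval,
# at the cost of capturing mu less often.
for conf in (0.80, 0.95, 0.99):
    print(f"conf={conf:.0%}  width at n=100: {ci_width(0.100, 100, conf):.4f}")
```

Because the width depends on $n$ only through $1/\sqrt{n}$, halving the width requires four times as many observations.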

Confidence Interval Interpretation
Once the confidence interval is formed (based on the observed $\bar{x}$), it either does or does not contain the fixed unknown value $\mu$.
For example, the 95% CI for box fill weights was $[12.030, 12.070]$, and the true population mean either is or isn't in this interval.
The confidence level arises from the randomness of the interval. BEFORE we collect the data, the CI is a random interval and it could take on many different values due to the randomness of $\bar{X}$.

Confidence Interval Interpretation
For a 95% CI, we are 95% confident that the true $\mu$ lies in the interval. This statement of confidence reflects the following:
If we repeated this process many times (i.e. collect a sample, compute $\bar{x}$, compute the CI), then in the long run about 95 out of every 100 intervals would capture the true $\mu$.
The confidence relates to the method used to calculate the CI. We don't know if our CI captured $\mu$ or not ($\mu$ is unknown), but using the same method, 95 out of 100 times we will capture it (on average).
See confidence interval applet website: http://www.rossmanchance.com/applets/confsim.html
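
The long-run interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" mean is set to 12.0 purely for illustration (in practice $\mu$ is unknown), $\sigma = 0.100$ is treated as known, and the function name `coverage`, the sample size, the number of trials and the seed are all arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, conf=0.95, trials=10_000, seed=0):
    """Fraction of simulated known-sigma z-intervals that contain mu."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Collect a sample, compute xbar, then check whether the CI captures mu.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage(mu=12.0, sigma=0.100, n=25)
print(f"Fraction of 95% intervals that captured mu: {rate:.3f}")
```

Any single interval either contains $\mu$ or does not; it is the proportion across repetitions that should settle near 0.95.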

Confidence Interval Interpretation & Simulation
[Figure not reproduced.]

Sample Size Calculation for $\mu$
The length of the CI is a measure of precision of estimation.
Precision is related to sample size $n$. Higher precision coincides with a larger sample size (all else being held constant).
What sample size should you choose (when you CAN choose)?
Let $E$ be the error in estimating $\mu$, i.e. the distance of the observed $\bar{x}$ from the target:
$$E = |\bar{x} - \mu|$$
Other books may state this error $E$ as the Margin of Error.
Choose a sample size that gives you a pre-specified level of precision.

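Sample Size Calculation for $\mu$
Choose $n$ to provide a certain bound on the error $E$ with confidence $100(1-\alpha)\%$.
The CI is $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, where $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the CI half-width, or $E$.
Pre-specified error:
$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
Sample size for estimating $\mu$ with $100(1-\alpha)\%$ confidence and error $E$:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$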

Sample Size Calculation for $\mu$
Example (The fill weight example)
In the fill weight example, how many boxes must be sampled to obtain a 99% confidence interval of full width 0.024 oz. (i.e. $E = 0.012$)?
ANS: $\sigma = 0.100$ from before, and we want a 99% CI, so $\alpha = 0.01$ and $z_{0.005} = 2.576$. Error $E$ is set at 0.012 (half-width of CI).
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{z_{0.01/2} \cdot 0.100}{0.012}\right)^2 = \left(\frac{2.576 \cdot 0.1}{0.012}\right)^2 = 460.8$$
We can't sample a fraction of a box, so we round up to ensure our confidence level is at least 99%. Thus the required sample size is $n = 461$.
NOTE: Read sample size problems closely to determine if they are giving precision as a half-width of a CI, which is $E$ (the cushion up or down), or the full width of the CI, which is $2E$.
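
The sample-size formula, including the round-up step, can be sketched as follows. The function name `sample_size_for_mean` and its `half_width` parameter are illustrative; note that the function expects $E$ (the half-width), so a problem stated in terms of full width must be halved first.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, half_width, conf=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= half_width (known sigma)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    # Round up: a fractional observation is impossible, and rounding down
    # would give an interval wider than requested.
    return ceil((z * sigma / half_width) ** 2)


# Fill-weight example: full width 0.024 oz, so E = 0.024 / 2 = 0.012.
n_required = sample_size_for_mean(sigma=0.100, half_width=0.024 / 2, conf=0.99)
print(n_required)
```

With these inputs the result matches the slide's answer of 461 boxes.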

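One-sided Confidence Bounds for $\mu$
Occasionally, you may be interested in finding a bound for $\mu$ on only one side. Because all of $\alpha$ is placed in a single tail, the multiplier is $z_{\alpha}$ rather than $z_{\alpha/2}$.
A $100(1-\alpha)\%$ upper-confidence bound for $\mu$ (an upper bound on $\mu$) is
$$\mu \le \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$
and this gives an interval $\left(-\infty,\ \bar{x} + z_{\alpha}\sigma/\sqrt{n}\right)$.
A $100(1-\alpha)\%$ lower-confidence bound for $\mu$ (a lower bound on $\mu$) is
$$\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu$$
and this gives an interval $\left(\bar{x} - z_{\alpha}\sigma/\sqrt{n},\ \infty\right)$.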
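
As an illustration, the sketch below computes one-sided 95% bounds. Reusing the fill-weight numbers ($\bar{x} = 12.050$, $\sigma = 0.100$, $n = 100$) here is an assumption made for the sake of the example, since the slides do not work a one-sided problem; the function names are likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_confidence_bound(xbar, sigma, n, conf=0.95):
    """Upper bound U such that mu <= U with the stated confidence."""
    z_alpha = NormalDist().inv_cdf(conf)  # all of alpha sits in one tail
    return xbar + z_alpha * sigma / sqrt(n)


def lower_confidence_bound(xbar, sigma, n, conf=0.95):
    """Lower bound L such that L <= mu with the stated confidence."""
    z_alpha = NormalDist().inv_cdf(conf)
    return xbar - z_alpha * sigma / sqrt(n)


ucb = upper_confidence_bound(12.050, 0.100, 100)
lcb = lower_confidence_bound(12.050, 0.100, 100)
print(f"95% upper bound: mu <= {ucb:.4f}")
print(f"95% lower bound: {lcb:.4f} <= mu")
```

Since $z_{0.05} \approx 1.645$ is smaller than $z_{0.025} \approx 1.96$, each one-sided bound sits closer to $\bar{x}$ than the corresponding end-point of the two-sided 95% interval.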

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
What if we don't know $\sigma$? Can I just plug in my estimator for $\sigma$ (that is, $s$) and again have the same 95% CI?
$$\hat{\sigma}^2 = s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
This was a 95% CI for $\mu$ when $\sigma$ was known:
$$\bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}}$$
Is this a 95% CI for $\mu$?
$$\bar{x} - z_{0.025}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{s}{\sqrt{n}}$$
HINT: This feels like cheating. If I don't know $\sigma$, I must have more uncertainty in trying to capture $\mu$ than when I do know $\sigma$.
So, how do we incorporate this extra uncertainty (for not knowing $\sigma$)?
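
A quick simulation suggests why the plug-in interval is not a genuine 95% CI for small samples. This is a sketch only: the true values $\mu = 0$ and $\sigma = 1$, the sample size $n = 5$, the seed and the function name `plug_in_z_coverage` are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def plug_in_z_coverage(n, trials=20_000, seed=1):
    """Coverage of xbar +/- z_{0.025} * s / sqrt(n), with s replacing sigma."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    mu, sigma = 0.0, 1.0  # assumed "true" values, unknown in practice
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Sample standard deviation with the n - 1 divisor.
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        margin = z * s / sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


small_n_rate = plug_in_z_coverage(n=5)
print(f"Coverage with s plugged into the z formula, n=5: {small_n_rate:.3f}")
```

The observed coverage falls noticeably short of 0.95, which is the extra uncertainty the hint refers to: the z multiplier is too small once $s$ must also be estimated from the data.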

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
The answer comes from the t-distribution.
The 95% CI for $\mu$ when $\sigma$ is unknown:
$$\bar{x} - t_{0.025,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.025,\,n-1}\frac{s}{\sqrt{n}}$$
where...
$t_{0.025,\,n-1}$ is the 97.5th percentile of the t-distribution with $n - 1$ degrees of freedom (next slide).
$s$ is the sample standard deviation, $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$.
$n$ is the number of observations (the sample size).

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
What does the t-distribution look like?
There is only one Z-distribution, but there are many t-distributions (distinguished by their degrees of freedom df, written $t_{df}$).
They look a lot like the $N(0, 1)$, except they have heavier tails.
For estimating a single parameter $\mu$, the degrees of freedom is $n - 1$.
The heaviness of the tails depends on the degrees of freedom (the subscript on the $t$), so it depends on the sample size $n$: the fewer the degrees of freedom, the heavier the tails.
[Figure: t-distributions with different degrees of freedom, df = k; not reproduced.]

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
For a large sample size $n$, df $= n - 1$ is very large, and the $t_{n-1}$ looks just like the $N(0, 1)$.
So, $Z \sim N(0, 1)$ is the limiting distribution for $t_{n-1}$ as $n \to \infty$.

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
$100(1-\alpha)\%$ Confidence interval for mean, variance unknown
If $\bar{x}$ is the sample mean and $s$ is the sample standard deviation of a random sample of size $n$ from a normal population, a $100(1-\alpha)\%$ confidence interval for $\mu$ is given by
$$\bar{x} - t_{\alpha/2,\,df}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,df}\frac{s}{\sqrt{n}}, \qquad df = n - 1.$$
How do I get the $t_{\alpha/2,\,n-1}$ value? (next slide)

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
How do I get the $t_{\alpha/2,\,n-1}$ value? Similar to getting a z-value. A t-table can be found in your book p.745.
When $\alpha = 0.05$ (for a 95% CI) and the sample size is $n = 10$,
$$t_{\alpha/2,\,n-1} = t_{0.025,\,9}$$
This is the t-value for a $t_9$ distribution with 2.5% above and 97.5% below.
Looking at the table: $t_{0.025,\,9} = 2.262$.

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
Example (CI for $\mu$ using t-distribution)
Suppose a sample of size $n = 10$ is taken from a normal population and $\bar{x} = 8.94$ and $s = 4.3$. Construct a 95% CI for the population mean.
Upper end-point:
$$\bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = \bar{x} + t_{0.025,\,9}\frac{s}{\sqrt{n}} = 8.94 + 2.262\left(\frac{4.3}{\sqrt{10}}\right) = 12.02$$
Lower end-point:
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = 8.94 - 2.262\left(\frac{4.3}{\sqrt{10}}\right) = 5.86$$
The 95% confidence interval for $\mu$ is $[5.86, 12.02]$. We are 95% confident that the true mean $\mu$ is between 5.86 and 12.02.
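
The arithmetic of the t-based interval can be checked with the sketch below. The Python standard library has no t-distribution quantile function, so the critical value $t_{0.025,9} = 2.262$ is taken from the t-table as on the slide and passed in by hand; the function name `t_confidence_interval` and the constant name are illustrative. The margin works out to $2.262 \times 4.3/\sqrt{10} \approx 3.076$, giving end-points of about 5.86 and 12.02.

```python
from math import sqrt


def t_confidence_interval(xbar, s, n, t_crit):
    """CI for mu with unknown sigma, given a table value t_{alpha/2, n-1}."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


T_0025_DF9 = 2.262  # t_{0.025, 9} read from a t-table (value quoted on the slide)

t_lower, t_upper = t_confidence_interval(xbar=8.94, s=4.3, n=10, t_crit=T_0025_DF9)
print(f"95% CI: [{t_lower:.2f}, {t_upper:.2f}]")
```

Passing the critical value as an argument keeps the sketch dependency-free; with a statistics package available, the table lookup could be replaced by a quantile call.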

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
Normality assumption for these t-based confidence intervals:
When $\sigma^2$ is unknown and we have a rather small sample, we need the parent population to be normally distributed (or nearly normal) to truly achieve our $100(1-\alpha)\%$ confidence level.
After we collect our data, we can check this assumption of normality by creating a normal probability plot (recall section 6-6).
If the data are not normally distributed, we have to use a different approach, one that doesn't depend on this normality assumption. Such methods are called nonparametric methods (which we won't cover in this class).

Confidence Interval for $\mu$ - Normal parent population, unknown $\sigma^2$
Connection to the Z-distribution...
The Z random variable follows a $N(0, 1)$ distribution:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
A T random variable follows a t-distribution:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$
where $t_{n-1}$ is a t-distribution with $n - 1$ degrees of freedom.

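$100(1-\alpha)\%$ Confidence Interval for $\mu$
RULE OF THUMB
When $\sigma$ is known, use
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
When $\sigma$ is unknown, use
$$\bar{x} - t_{\alpha/2,\,df}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,df}\frac{s}{\sqrt{n}}$$
When $n$ is REALLY LARGE ($n > 60$), a 95% CI for $\mu$ can be
$$\bar{x} - z_{0.025}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{s}{\sqrt{n}}$$
NOTE: At $n = 60$ the Z-table and $t_{60}$-table are very, very similar. But just use the rule of thumb, which says $s$ goes with $t$ and $\sigma$ goes with $z$.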