Elementary Statistics Triola, Elementary Statistics 11/e Unit 14 The Confidence Interval for Means, σ Unknown


We are now ready to begin our exploration of how we make estimates of the population mean, $\mu$. Before we get started, I want to emphasize the importance of having collected a representative sample, i.e. one that is a simple random sample. Without that, our estimates are useless.

The best estimate of the mean that is available to us is the mean of our sample, $\bar{x}$. However, we do not expect $\bar{x}$ to equal $\mu$. This single estimate, while a good start, is therefore of limited use on its own, because we do not know how far off we are from $\mu$. What we need is a Lower Bound and an Upper Bound such that we can have some confidence that $\mu$ falls between these two limits. In other words, we would like to find some value E such that we are 95% confident that, given the average $\bar{x}$ of any sample, $\mu$ lies somewhere between

$$\bar{x} - E < \mu < \bar{x} + E$$

Even this definition is a bit vague, because what do we mean by confident? To sharpen things up a bit, suppose you were to repeatedly take samples of the same size from the population. For each sample, you would get an average, $\bar{x}_i$ for the $i$th sample. Now, we do not expect any of the $\bar{x}_i$ to equal each other, but we want a single value for E such that 95% of the $\bar{x}_i$ will have the following property: $\mu$ will lie in the following interval,

$$\bar{x}_i - E < \mu < \bar{x}_i + E$$

So if 95% of the samples we can take have this property, we can be 95% confident that the sample we did take has this property, i.e. $\mu$ will lie in the interval,

$$\bar{x} - E < \mu < \bar{x} + E$$

First note that we are working with averages, $\bar{x}$ and $\mu$. That means that the probability distribution we will be working with is the sampling distribution of the mean. The mean of this distribution is $\mu_{\bar{x}}$ and the standard deviation is $\sigma_{\bar{x}}$. According to the Central Limit Theorem, $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, and so we will be working with these values.

Now picture the sampling distribution with $\mu$ at its center. All possible $\bar{x}_i$ are in the sampling distribution somewhere, and so if we find a value E such that the interval $\mu - E < \bar{x} < \mu + E$, which is centered on $\mu$, captures 95% of the area under the curve, it will also capture 95% of all the possible $\bar{x}_i$. Take a look at the chart below. It is a chart of the Standard Normal Curve, and hence its center is 0.

[Chart: Standard Normal Curve with a central area of 0.95 and a red zone in each tail]

There's a lot going on here, so let's take things one step at a time. The area of 0.95 is centered under the curve. The critical value, $z_{\alpha/2}$, is the boundary between the 0.95 area and the red zone to the right of it. Since we are looking at the graph of a Standard Normal Distribution, that value of $z_{\alpha/2}$ equals 1.96.

$\alpha$ is called the significance, and it simply equals 1.0 minus the Confidence Level (expressed as a decimal). Hence, in this case, $\alpha = 1 - 0.95 = 0.05$ and $\alpha/2 = 0.025$. In other words, the area of each red zone is 0.025 and together they sum to 0.05.

How did we find that $z_{\alpha/2} = 1.96$ for $\alpha = 0.05$? Look at the chart above. Notice that the total area to the left of $z_{\alpha/2}$ is 0.95 + 0.025 (the area of the red zone on the left). Hence the total area to the left of $z_{\alpha/2}$ is 0.975, and NORM.S.INV(0.975) = 1.96.

Question #1 Now, you try one. Find the value of $z_{\alpha/2}$ for an 80% confidence interval. First find $\alpha$. Then, divide it by two. Add that value, $\alpha/2$, to 0.80 and use NORM.S.INV to find $z_{\alpha/2}$ such that the area (called Probability in the dialog) to the left of $z_{\alpha/2}$ is $0.80 + \alpha/2$. Write your answer on the Answer Sheet.

Now, recall the formula for translating from the real world to the z-world, i.e. the axis of the Standard Normal Distribution,

$$z = \frac{x - \mu}{\sigma}$$

First, recognizing that we are working with the sampling distribution, we rework the formula to reflect this,

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$$

If we then use the Central Limit Theorem, we have that $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, and so we get,

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

Question #2 Let's try one. Let [...] and write your answer on the Answer Sheet.

The $z$ is the translated value of $\bar{x}$. Since 95% of the area under the Standard Normal Curve lies between -1.96 and +1.96, that means that 95% of the translated $\bar{x}_i$ must lie within this range as well. In other words, there is a 95% chance that the translated average of any given sample will lie between -1.96 and +1.96.

Now we're getting somewhere, because we have just objectively stated what we mean by 95% confidence. All that remains to do is to translate z = 1.96 back into the real world, and we'll have our upper and lower bound on $\mu$. Remember, our goal is to find an E such that,

$$\bar{x} - E < \mu < \bar{x} + E$$
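As an aside for readers working outside Excel, the NORM.S.INV lookup of the critical value described above can be sketched in Python. This is only a minimal sketch: it assumes Python 3.8 or later, it uses the standard library's `statistics.NormalDist().inv_cdf` in the role of NORM.S.INV, and the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    alpha = 1.0 - confidence_level                 # the significance
    area_to_left = confidence_level + alpha / 2.0  # e.g. 0.95 + 0.025 = 0.975
    # inv_cdf on the standard normal plays the role of Excel's NORM.S.INV
    return NormalDist().inv_cdf(area_to_left)

z95 = z_critical(0.95)
print(round(z95, 2))  # 1.96
```

The function takes the confidence level as a decimal, so other confidence levels only require changing the argument.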

We start with the fact that there's a 95% chance that, given any sample, we'll have,

$$-1.96 < z < 1.96$$

Using the formula above, we translate and get,

$$-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96$$

A little math and we have,

$$-1.96\,\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.96\,\frac{\sigma}{\sqrt{n}}$$

and then,

$$-\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}$$

Multiplying through by $-1$ (which reverses the inequalities) and writing it in standard form, we get

$$\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}$$

We have found our E, namely $E = 1.96\,\sigma/\sqrt{n}$, and in general, for any critical value $z_{\alpha/2}$, i.e. any confidence interval, we have,

$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

This E is called the margin of error.

Question #3 Find $z_{\alpha/2}$ for a 99% confidence interval, and then assuming that [...]

Unfortunately, we are not much better off than before we started, because the value of E that we derived depends on knowing $\sigma$, and if we don't know the value of $\mu$ (that is, after all, what we are trying to estimate) then why would we know $\sigma$? This problem wasn't solved until around the turn of the 20th century, when William Gosset, working for the Guinness Brewery company, worked out a probability distribution that could be used to perform quality control tests using small samples. It became known as the Student t distribution, after the pseudonym "Student" under which Gosset published his work. The value of using this distribution in place of the Standard Normal Distribution used in the derivation above is that now we can use s, the sample standard deviation, which we do know, instead of $\sigma$. This was really a big deal.
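Before moving on to the t distribution, the $\sigma$-known margin of error, $E = z_{\alpha/2}\,\sigma/\sqrt{n}$, can be checked by simulation, where (unlike in practice) we get to choose $\mu$ and $\sigma$ ourselves. The following is a rough sketch rather than part of the unit: the population values, the sample size, the seed, and the helper name `margin_of_error_z` are all made up for illustration, and the population is assumed to be normal.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def margin_of_error_z(sigma, n, confidence_level=0.95):
    """E = z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(n)

# Made-up population and sampling plan, for illustration only.
MU, SIGMA, N, TRIALS = 50.0, 10.0, 25, 2000
rng = random.Random(14)  # fixed seed so the run is repeatable

E = margin_of_error_z(SIGMA, N)

hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    # Does this sample's interval x_bar - E < mu < x_bar + E capture mu?
    if x_bar - E < MU < x_bar + E:
        hits += 1

coverage = hits / TRIALS
print(f"E = {E:.3f}, fraction of intervals containing mu = {coverage:.3f}")
```

The printed fraction should land close to 0.95, which is exactly the "95% of the samples have this property" reading of confidence used in the derivation.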

This Student t distribution is very similar in shape to the Standard Normal distribution, except that it is wider, i.e. it has a larger standard deviation. Furthermore, the size of the standard deviation, and hence the width of the shape, depends on the sample size. The smaller the sample, the larger the standard deviation. Take a look at the following figure, which shows different shapes for the t-distribution as a function of sample size, and compares them to the Standard Normal distribution.

[Figure: t-distribution curves for several sample sizes, compared with the Standard Normal distribution]

Here are a few more rules for working with the t-distribution. If you know for a fact, or you strongly suspect (because you carefully examined the histogram of your sample), that the underlying population is normally distributed, then the sample size is not that important other than its effect on the shape of the t-curve. However, if you suspect that the underlying population is not all that normally shaped, then your sample size should be a minimum of 30. Otherwise, your results will not be reliable.

Take a look at the Excel dialog for T.INV.2T. The entry for Probability is the significance, $\alpha$ (another bad label). The dialog uses a value of 0.05 for $\alpha$ because we are trying to find the 95% Confidence Interval. If we wanted to find the 99% Confidence Interval we would use a value of 0.01 ($\alpha = 1 - 0.99 = 0.01$). Notice how this differs significantly from NORM.S.INV. Instead of inputting $1 - \alpha/2$ for Probability, we just input $\alpha$.

Now, note that there's something brand new here called Deg_freedom, the degrees of freedom. No, this has nothing to do with the Tea Party. The degrees of freedom is simply one less than the sample size,

$$\text{Deg\_freedom} = n - 1$$

So if the sample size, n, is 20 then the degrees of freedom (df) is 19.

We use the T.INV.2T function to find the critical value $t_{\alpha/2}$ that we use in place of the value 1.96 in the formula for E above, and now we can use s instead of $\sigma$,

$$E = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

And the confidence interval is,

$$\bar{x} - E < \mu < \bar{x} + E$$

Remarkably, the formula above uses just information from our sample: the size n, the mean $\bar{x}$, and the standard deviation, s.

Worked Example

We receive a batch of 50,000 washers, and we wish to estimate the average inside diameter of the washers. We carefully select a simple random sample of size 20 and find that the average inside diameter is 24.78 mm with a standard deviation of 1.62 mm. We want to calculate a 95% confidence interval for our estimate of the batch mean.

First we calculate $t_{\alpha/2}$ using T.INV.2T with Probability = 0.05 and Deg_freedom = 19. We see that $t_{\alpha/2} = 2.093$ and we proceed to calculate E,

$$E = t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 2.093\left(\frac{1.62}{\sqrt{20}}\right) = 0.758$$

Finally, the confidence interval is,

$$24.78 - 0.758 < \mu < 24.78 + 0.758, \quad \text{i.e.} \quad 24.02 < \mu < 25.54 \ \text{(mm)}$$

Below is the Excel spreadsheet that you can use to calculate these values. If you double click on the table, you will bring up a copy of Excel. Then if you select any of the cells, such as the value for t, you will see the Excel formula in the formula bar, toward the top. Also, try clicking on the value for E to see its Excel formula.
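For readers working outside Excel, the washer calculation can also be sketched in Python. This is a minimal sketch under stated assumptions: the critical value 2.093024 is simply copied from the T.INV.2T result for $\alpha = 0.05$ and 19 degrees of freedom (the standard library has no inverse t function), and the helper name `t_interval` is made up for illustration.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Return (E, lower, upper) for a mean when sigma is unknown.

    t_crit is the two-tailed critical value for n - 1 degrees of freedom,
    supplied by the caller (e.g. read off Excel's T.INV.2T).
    """
    E = t_crit * s / sqrt(n)  # margin of error
    return E, x_bar - E, x_bar + E

# Washer example: n = 20, x_bar = 24.78 mm, s = 1.62 mm, 95% confidence.
E, lower, upper = t_interval(x_bar=24.78, s=1.62, n=20, t_crit=2.093024)
print(f"E = {E:.3f}, interval = ({lower:.2f}, {upper:.2f})")
# E = 0.758, interval = (24.02, 25.54)
```

If SciPy happens to be installed, `scipy.stats.t.ppf(1 - alpha / 2, n - 1)` should give the same critical value without a trip to Excel. The spreadsheet that follows carries the same numbers to more decimal places.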

Finding a confidence interval for a mean, σ unknown.

| $\bar{x}$ | s | n | α | t | E | $\bar{x} - E$ | $\bar{x} + E$ |
|---|---|---|---|---|---|---|---|
| 24.78 | 1.62 | 20 | 0.05 | 2.093024 | 0.758183 | 24.02182 | 25.53818 |

The cells in blue can't be changed by you. That's so you can't accidentally break the formulas. Also, I've noticed that after you have been working with these embedded Excel sheets for a while, things can start to go wrong. I believe that's a problem with the operating system. If you suspect that something weird is going on, just close the unit notes and download a fresh copy.

Question #4 One last note. Suppose that the manufacturer of the washers had claimed that the average inside diameter was 25.00 mm. On the basis of this sample, could you refute the claim?

Question #5 Now it's your turn to have some fun. Use the embedded Excel sheet above. Assume the population is normally distributed. For a sample size of 61, the average weight loss was 4.0 kg with a standard deviation of 6.4 kg. Find a 99% confidence interval for the mean of the population and enter your answer on the Answer Sheet.

This is the end of Unit 14. Now turn to Unit 14 homework in your MyMathLab to get more practice with these concepts.

Answer Sheet

Name:

1.
2.
3.
4.
5.