Lecture 2 INTERVAL ESTIMATION II

Recap
We have a population of interest and want to say something about it, perhaps about the population mean $\mu$. We take a random sample...

When our random sample comes from a normal distribution, or indeed from any distribution (if the sample size is large), the sample mean satisfies $\bar{X} \sim N(\mu, \sigma^2/n)$, at least approximately.
Slide/squash (i.e. subtract the mean and divide by the standard deviation) to give
$$Z = \frac{\bar{X} - \mu}{\sqrt{\sigma^2/n}},$$
where $Z$ has the standard normal distribution, i.e. $Z \sim N(0,1)$.
Consequently, $\Pr(-1.96 < Z < 1.96) = 0.95$, so that
$$\Pr\left(-1.96 < \frac{\bar{X} - \mu}{\sqrt{\sigma^2/n}} < 1.96\right) = 0.95.$$
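As a quick numerical check of the 0.95 figure, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative and not part of the lecture notes.

```python
from statistics import NormalDist

# Standard normal distribution, Z ~ N(0, 1)
Z = NormalDist(mu=0.0, sigma=1.0)

# Pr(-1.96 < Z < 1.96) written as a difference of two CDF values
coverage = Z.cdf(1.96) - Z.cdf(-1.96)
print(round(coverage, 4))
```

The printed probability agrees with 0.95 to the precision used in the slides, which is why 1.96 is the multiplier for a 95% interval.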

Rearranging the inequality for $\mu$ gives the 95% confidence interval as
$$\left(\bar{x} - 1.96\sqrt{\sigma^2/n},\; \bar{x} + 1.96\sqrt{\sigma^2/n}\right),$$
which we can write concisely as $\bar{x} \pm 1.96\sqrt{\sigma^2/n}$.
We have assumed that the population variance $\sigma^2$ is known! What if it isn't? (It typically isn't known in practice!)
Grrrr!
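The rearrangement for $\mu$ can be spelled out step by step. The following is a sketch of the algebra, using only the symbols already defined in the recap:

```latex
\begin{align*}
-1.96 < \frac{\bar{X} - \mu}{\sqrt{\sigma^2/n}} < 1.96
  &\iff -1.96\sqrt{\sigma^2/n} < \bar{X} - \mu < 1.96\sqrt{\sigma^2/n} \\
  &\iff -\bar{X} - 1.96\sqrt{\sigma^2/n} < -\mu < -\bar{X} + 1.96\sqrt{\sigma^2/n} \\
  &\iff \bar{X} - 1.96\sqrt{\sigma^2/n} < \mu < \bar{X} + 1.96\sqrt{\sigma^2/n}.
\end{align*}
```

The last step multiplies through by $-1$, which reverses the inequalities. Replacing $\bar{X}$ by the observed sample mean $\bar{x}$ gives the interval quoted above.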

How can we proceed?
We could calculate the sample variance, which we denote by $s^2$.
We could then estimate $\sigma^2$ with $s^2$.
Thus, we can think about the quantity
$$\frac{\bar{X} - \mu}{\sqrt{S^2/n}}$$
and what distribution this quantity might follow.
We will call this quantity $T$, for reasons that will become clear from the next slide!

Case 2: Unknown variance $\sigma^2$
If the population variance is unknown (which is usually the case), the quantity
$$T = \frac{\bar{X} - \mu}{\sqrt{S^2/n}}$$
does not have a $N(0,1)$ distribution, but (for a sample from a normal population) a Student's t distribution.
This is similar to the normal distribution (i.e. symmetric and bell shaped), but is more "heavily tailed".
The t distribution has one parameter, called the degrees of freedom ($\nu = n - 1$).
A picture (using the space on page 7) will help!

[Figure: comparison of Normal and t distributions; horizontal axis from -10 to 10]
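The "more heavily tailed" claim can be checked numerically. The sketch below compares the two densities at a few points; the helper name `t_pdf` and the choice of $\nu = 4$ are assumptions made purely for illustration, and the density formula used is the standard one for Student's t.

```python
import math
from statistics import NormalDist

def t_pdf(x: float, nu: int) -> float:
    """Density of Student's t distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(x * x / nu))

normal = NormalDist()  # standard normal, N(0, 1)

# nu = 4 is an arbitrary small value chosen for illustration
for x in (0.0, 2.0, 4.0):
    print(x, round(normal.pdf(x), 5), round(t_pdf(x, 4), 5))
```

Near zero the normal density is slightly higher, while far from zero the t density is much larger: the t distribution puts more probability in its tails.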

Student's t distribution: a brief history
The distribution takes its name from William Sealy Gosset's 1908 paper in Biometrika, published under the pseudonym "Student". Gosset worked at the Guinness Brewery in Dublin, Ireland, and was interested in the chemical properties of barley, where sample sizes might be small. One version of the origin of the pseudonym is that Gosset's employer forbade members of its staff from publishing scientific papers, so he had to hide his identity. Another is that Guinness did not want their competitors to know that they were using the t distribution to test the quality of raw material.

William Sealy Gosset (1876-1937)

Back to our problem...
So if we don't know $\sigma^2$, the formula for the confidence interval becomes
$$\bar{x} \pm t_{p/2}\sqrt{s^2/n},$$
where $t_{p/2}$ is the value such that $\Pr(-t_{p/2} < T < t_{p/2}) = 100(1-p)\%$.
We find $t_{p/2}$ from statistical tables (table 1.1 in the notes). We read across to the $p$ column and down to the $\nu$ row.
For a 90% confidence interval, $p = 10\%$.
For a 95% confidence interval, $p = 5\%$.
For a 99% confidence interval, $p = 1\%$.
The degrees of freedom, $\nu = n - 1$.
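Table 1.1 is not reproduced here, so as a rough sketch the critical value can instead be computed numerically. The helper names `t_pdf`, `t_cdf` and `t_critical` are hypothetical, and Simpson's rule plus bisection is just one simple standard-library approach (a statistics package would normally be used instead). Note that `p` is passed as a proportion, e.g. 0.05 for 5%.

```python
import math

def t_pdf(x: float, nu: int) -> float:
    """Density of Student's t distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x: float, nu: int, steps: int = 2000) -> float:
    """Pr(T <= x) for x >= 0: 0.5 by symmetry plus Simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

def t_critical(p: float, nu: int) -> float:
    """Value t_{p/2} such that Pr(-t_{p/2} < T < t_{p/2}) = 1 - p (bisection)."""
    target = 1 - p / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for p in (0.10, 0.05, 0.01):
    print(p, round(t_critical(p, 14), 3))
```

For $\nu = 14$ and $p = 5\%$ this gives a value that rounds to 2.145. As $\nu$ grows, the 5% value falls towards the normal multiplier 1.96.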

Example (page 8)
A sample of size 15 is taken from a larger population; the sample mean is calculated as 12 and the sample variance as 25. What is the 95% confidence interval for the population mean $\mu$?

We know that the confidence interval is given by $\bar{x} \pm t_{p/2}\sqrt{s^2/n}$, where $n = 15$, $\bar{x} = 12$ and $s^2 = 25$.
Also, to find $t$, we know that $\nu = n - 1 = 15 - 1 = 14$ and $p = 5\%$.

We can find our t value by looking in the $p = 5\%$ column and the $\nu = 14$ row, giving a value of 2.145. Putting what we know into our expression, we get
$$12 \pm t_{2.5\%}\sqrt{25/15},$$
i.e.
$$12 \pm 2.145\sqrt{25/15},$$
i.e.
$$12 \pm 2.77.$$
Hence, the confidence interval is (9.23, 14.77).

Write this down! (Bottom page 8)
It is claimed that $\mu = 9$. Is this justified?
No! The claimed value of 9 does NOT lie within the 95% confidence interval (9.23, 14.77).
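The arithmetic of the page 8 example, and the check of the claimed value, can be reproduced with a short sketch. The variable names are illustrative; the t value 2.145 is the table value quoted in the example rather than something computed here.

```python
import math

n = 15            # sample size
xbar = 12.0       # sample mean
s2 = 25.0         # sample variance
t_value = 2.145   # table value for p = 5%, nu = 14 (taken from the example)

half_width = t_value * math.sqrt(s2 / n)
lower, upper = xbar - half_width, xbar + half_width
print(round(half_width, 2), round(lower, 2), round(upper, 2))

# Does the claimed value of mu fall inside the interval?
claimed_mu = 9.0
print(lower < claimed_mu < upper)
```

The half-width rounds to 2.77 and the claimed value 9 falls just below the lower limit, matching the conclusion above.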

Confidence intervals: a general approach
We now summarise the general procedure for calculating a confidence interval for the population mean $\mu$.

Case 1: Known population variance $\sigma^2$
(i) Calculate the sample mean $\bar{x}$ from the data;
(ii) Calculate your interval! For example,
    for a 90% confidence interval, use the formula $\bar{x} \pm 1.645\sqrt{\sigma^2/n}$;
    for a 95% confidence interval, use the formula $\bar{x} \pm 1.96\sqrt{\sigma^2/n}$;
    for a 99% confidence interval, use the formula $\bar{x} \pm 2.576\sqrt{\sigma^2/n}$.
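The three Case 1 multipliers come from the standard normal distribution. As a sketch, they can be recovered with the standard-library `statistics.NormalDist`; the function name `z_interval` and the input values in the usage line are hypothetical, chosen only to illustrate the formula.

```python
import math
from statistics import NormalDist

def z_multiplier(level: float) -> float:
    """Multiplier z such that Pr(-z < Z < z) = level for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def z_interval(xbar: float, sigma2: float, n: int, level: float):
    """Confidence interval for mu when the population variance sigma2 is known."""
    half_width = z_multiplier(level) * math.sqrt(sigma2 / n)
    return xbar - half_width, xbar + half_width

for level in (0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))

# Hypothetical inputs: sample mean 50, known variance 16, sample size 40
print(z_interval(50.0, 16.0, 40, 0.95))
```

Rounded to three decimal places, the multipliers match the 1.645, 1.96 and 2.576 used in the formulas above.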

Case 2: Unknown population variance $\sigma^2$
(i) Calculate the sample mean $\bar{x}$ and the sample variance $s^2$ from the data;
(ii) For a $100(1-p)\%$ confidence interval, look up the value of $t$ under column $p$, row $\nu$ of table 1.1, remembering that $\nu = n - 1$. Note that, for a 90% confidence interval, $p = 10\%$, for a 95% confidence interval, $p = 5\%$ and for a 99% confidence interval, $p = 1\%$;
(iii) Calculate your interval, using $\bar{x} \pm t_{p/2}\sqrt{s^2/n}$.
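The three Case 2 steps can be sketched starting from raw data. The data set below is invented for illustration and is not from the lecture, the function name `t_interval_from_data` is hypothetical, and the t value is supplied by hand as it would be after a table lookup.

```python
import math
import statistics

def t_interval_from_data(data, t_value):
    """Steps (i) and (iii): summarise the data, then form xbar +/- t * sqrt(s2/n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s2 = statistics.variance(data)  # sample variance, divisor n - 1
    half_width = t_value * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width

# Hypothetical data: n = 5, so nu = n - 1 = 4
data = [10.0, 12.0, 11.0, 13.0, 14.0]

# Step (ii): standard 95% table value (p = 5%) for nu = 4
t_value = 2.776

lower, upper = t_interval_from_data(data, t_value)
print(round(lower, 3), round(upper, 3))
```

Note that `statistics.variance` uses the divisor $n - 1$, which is the sample variance $s^2$ needed here, whereas `statistics.pvariance` divides by $n$.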

Application of Confidence Intervals
You might be asking: why do we bother calculating confidence intervals?
Calculating a confidence interval for the population mean allows us to see how confident we can be in the point estimate we have calculated. The wider the interval, the less precise we can be about the population value.
If we have a known (or target) value for a population and this does not fall within the confidence interval from our sample, this could suggest that there is something different about this sample.
It allows us to start looking at differences between groups. If the confidence intervals for two samples do not overlap, this could suggest that they are from separate populations.

Example (page 9)
A credit card company wants to determine the mean income of its card holders. It also wants to find out if there are any differences in mean income between males and females.

A random sample of 225 male card holders and 190 female card holders was drawn, and the following results obtained:

          Mean      Standard deviation
Males     16 450    3675
Females   13 220    3050

Calculate 95% confidence intervals for the mean income for males and females. Is there any evidence to suggest that, on average, males' and females' incomes differ? If so, describe this difference.

95% confidence interval for male income
The true population variance, $\sigma^2$, is unknown, and so we have case 2 and need to use the t distribution. Thus,
$$\bar{x} \pm t_{p/2}\sqrt{s^2/n}.$$
Here, $\bar{x} = 16450$, $s^2 = 3675^2 = 13505625$ and $n = 225$.

The value $t_{p/2}$ must be found from table 1.1.
Recall that the degrees of freedom, $\nu = n - 1$, and so here we have $\nu = 225 - 1 = 224$;
But table 1.1 only gives values of $\nu$ up to 29; for higher values, we use the $\infty$ row;
Since we require a 95% confidence interval, we read down the 5% column, giving a t value of 1.96.

Thus, the 95% confidence interval for $\mu$ is found as
$$16450 \pm 1.96\sqrt{13505625/225},$$
i.e.
$$16450 \pm 480.2.$$
So, the 95% confidence interval is (15969.80, 16930.20).

Example (page 10)
95% confidence interval for female income
Again, the true population variance, $\sigma^2$, is unknown, and so we have case 2. Thus,
$$\bar{x} \pm t_{p/2}\sqrt{s^2/n}.$$
Now, $\bar{x} = 13220$, $s^2 = 3050^2 = 9302500$, and $n = 190$.

Again, since the sample size is large, we use the $\infty$ row of table 1.1 to obtain the value of $t_{p/2}$, giving:
$$13220 \pm 1.96\sqrt{9302500/190},$$
i.e.
$$13220 \pm 1.96 \times 221.27,$$
i.e.
$$13220 \pm 433.69.$$
So, the 95% confidence interval is (12786.31, 13653.69).

Since the 95% confidence intervals for males and females do not overlap, there is evidence to suggest that males' and females' incomes, on average, are different. Further, it appears that male card holders earn more than female card holders. But note that the dataset is rather old...
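Both income intervals and the overlap check can be reproduced with a few lines. This is a sketch with hypothetical names (`large_sample_interval`, `overlap`); it uses the multiplier 1.96 from the example and writes $\sqrt{s^2/n}$ in the equivalent form $s/\sqrt{n}$.

```python
import math

def large_sample_interval(xbar, s, n, t_value=1.96):
    """xbar +/- t * s / sqrt(n); 1.96 is the 5% value from the infinity row."""
    half_width = t_value * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Summary statistics from the credit card example
males = large_sample_interval(16450, 3675, 225)
females = large_sample_interval(13220, 3050, 190)

# Two intervals overlap if each one starts before the other ends
overlap = males[0] <= females[1] and females[0] <= males[1]

print([round(v, 2) for v in males])
print([round(v, 2) for v in females])
print(overlap)
```

The lower limit for males lies well above the upper limit for females, so the overlap check is false, in line with the conclusion drawn above.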