7 Statistical Intervals (One sample) (Chs 8.1-8.3)

Confidence Intervals

The CLT tells us that as the sample size n increases, the sample mean $\bar{X}$ is close to normally distributed with expected value µ and standard deviation $\sigma/\sqrt{n}$. Standardizing $\bar{X}$ by first subtracting its expected value and then dividing by its standard deviation yields the standard normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

How big does our sample need to be if the underlying population is normally distributed?

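Because the area under the standard normal curve between −1.96 and 1.96 is .95, we know:

$$P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = .95$$

This is equivalent to:

$$P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = .95$$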

The interval

$$\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$

is called the 95% confidence interval for the mean. This interval varies from sample to sample, as the sample mean varies. So, the interval itself is a random interval.

The CI is centered at the sample mean $\bar{X}$ and extends $1.96\,\sigma/\sqrt{n}$ to each side of $\bar{X}$. The interval's width is $2 \cdot 1.96\,\sigma/\sqrt{n}$, which is not random; only the location of the interval (its midpoint $\bar{X}$) is random.

As we showed, for a given sample, the CI can be expressed as

$$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$

A concise expression for the interval is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$, where the left endpoint is the lower limit and the right endpoint is the upper limit.

Interpreting a Confidence Level

"We are 95% confident that the true parameter is in this interval." What does that mean?

A correct interpretation of 95% confidence relies on the long-run relative frequency interpretation of probability. In repeated sampling, 95% of the confidence intervals obtained from all samples will actually contain µ. The other 5% of the intervals will not. The confidence level is not a statement about any particular interval; instead, it pertains to what would happen if a very large number of like intervals were to be constructed using the same CI formula.
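The long-run interpretation can be checked with a small simulation. The sketch below is in Python rather than R, and every number in it (µ = 10, σ = 2, n = 25, 2000 repetitions, the seed) is a hypothetical choice for illustration. It draws many samples from a normal population, builds the known-σ interval for each, and reports the fraction of intervals that contain µ.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(reps=2000, n=25, mu=10.0, sigma=2.0, level=0.95, seed=0):
    """Fraction of simulated known-sigma z intervals that contain mu."""
    rng = random.Random(seed)                       # fixed seed: reproducible run
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 when level = 0.95
    half_width = z * sigma / math.sqrt(n)           # identical for every sample
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)                        # only the midpoint is random
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

print(coverage())
```

The printed fraction should land close to 0.95, though not exactly on it, since the simulation uses a finite number of repetitions.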

Other Levels of Confidence

A probability of 1 − α is achieved by using $z_{\alpha/2}$ in place of $z_{.025} = 1.96$:

$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha, \quad \text{where } Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

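A 100(1 − α)% confidence interval for the mean µ when the value of σ is known is given by

$$\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$$

or, equivalently, by $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.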

Example

A sample of 40 units is selected and the diameter is measured for each one. The sample mean diameter is 5.426 mm, and the standard deviation of measurements is 0.1 mm. Let's calculate a confidence interval for true average hole diameter using a confidence level of 90%. What is the width of the interval? What about the 99% confidence interval? What are the advantages and disadvantages to a wider confidence interval?
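One way to work this example numerically is sketched below in Python. It assumes the stated 0.1 mm can be treated as the known σ, and the helper name `z_interval` is made up for this illustration. The critical value $z_{\alpha/2}$ comes from the standard library's `NormalDist.inv_cdf` rather than a printed table.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Known-sigma CI for the mean: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

# Values from the example; sigma = 0.1 mm is treated as known (assumption).
for level in (0.90, 0.99):
    lo, hi = z_interval(5.426, 0.1, 40, level)
    print(f"{level:.0%} CI: ({lo:.3f}, {hi:.3f}), width {hi - lo:.3f}")
```

The 99% interval comes out wider than the 90% interval. That is the trade-off the last question points at: more confidence costs precision.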

Sample size computation

For each desired confidence level and interval width, we can determine the necessary sample size.

Example: A response time is normally distributed with standard deviation of 25 milliseconds. A new system has been installed, and we wish to estimate the true average response time µ for the new environment. Assuming that response times are still normally distributed with σ = 25, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10?
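The width of the known-σ interval is $w = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$, so requiring the width to be at most w gives $n \ge (2 z_{\alpha/2}\,\sigma / w)^2$, rounded up to the next integer. A minimal Python sketch of that calculation follows; the function name `sample_size` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(sigma, width, level=0.95):
    """Smallest n for which the known-sigma CI has total width <= width."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)   # round up: n must be an integer

n_needed = sample_size(sigma=25, width=10)
print(n_needed)
```

With σ = 25 and a target width of 10, the unrounded value is about 96.04, so the calculation gives n = 97.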

Unknown variance

A difficulty in using our previous equation for confidence intervals is that it uses the value of σ, which will rarely be known. In this instance, we need to work with the sample standard deviation s. Remember from our first lesson that the standard deviation is calculated as:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

With this, we instead work with the standardized random variable:

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

Unknown mean and variance

Previously, there was randomness only in the numerator of Z by virtue of $\bar{X}$, the estimator. In the new standardized variable, both $\bar{X}$ and $S$ vary in value from one sample to another. When n is large, the substitution of s for σ adds little extra variability, so nothing needs to change. When n is smaller, the distribution of this new variable should be wider than the normal to reflect the extra uncertainty. (We talk more about this in a bit.)

A Large-Sample Interval for µ

Proposition: If n is sufficiently large (n ≥ 30), the standardized random variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. This implies that

$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

is a large-sample confidence interval for µ with confidence level approximately 100(1 − α)%. This formula is valid regardless of the population distribution for sufficiently large n.
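As a rough illustration of the large-sample interval, the Python sketch below computes $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$ directly from raw data. The function name and the data are hypothetical: the 40 values are generated by an arbitrary formula only so that the example has something to run on.

```python
import math
from statistics import NormalDist, mean, stdev

def large_sample_interval(data, level=0.95):
    """Approximate CI for the mean using s in place of sigma (needs n >= 30)."""
    n = len(data)
    if n < 30:
        raise ValueError("large-sample interval assumes n >= 30")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * stdev(data) / math.sqrt(n)   # stdev uses the n - 1 divisor
    center = mean(data)
    return center - half, center + half

# Hypothetical data, made up for illustration only.
sample = [20 + (7 * i) % 11 for i in range(40)]
lo, hi = large_sample_interval(sample)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
```

The guard on n reflects the n ≥ 30 rule of thumb from the proposition; for smaller samples the function refuses rather than returning a misleading interval.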

[Table: choice of interval by sample size (n ≥ 30 vs. n < 30) and by underlying distribution (normal vs. non-normal)]

A Small-Sample Interval for µ

The CLT cannot be invoked when n is small, and we need to do something else when n < 30. When n < 30 and the underlying distribution is normal, we have a solution!

t Distribution

The result on which small-sample inferences are based introduces a new family of probability distributions called t distributions. When $\bar{X}$ is the mean of a random sample of size n from a normal distribution with mean µ, the random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a probability distribution called a t distribution with n − 1 degrees of freedom (df).

Properties of t Distributions

[Figure: some members of the t family of curves]

Let $t_\nu$ denote the t distribution with ν df.
1. Each $t_\nu$ curve is bell-shaped and centered at 0.
2. Each $t_\nu$ curve is more spread out than the standard normal (z) curve.
3. As ν increases, the spread of the corresponding $t_\nu$ curve decreases.
4. As ν → ∞, the sequence of $t_\nu$ curves approaches the standard normal curve (so the z curve is the t curve with df = ∞).

Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the t curve with ν df to the right of $t_{\alpha,\nu}$ is α; $t_{\alpha,\nu}$ is called a t critical value. For example, $t_{.05,6}$ is the t critical value that captures an upper-tail area of .05 under the t curve with 6 df.

Tables of t Distributions

Probabilities for t curves are found from tables in much the same way as for the normal curve. Example: obtain $t_{.05,15}$.
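A table lookup can be cross-checked numerically. The Python sketch below is not the table method from the slides: it is a stand-in that integrates the t density with Simpson's rule and then bisects for the point whose upper-tail area is α. The function names are hypothetical, and it assumes 0 < α < 0.5 and a moderate df.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t >= 0: 0.5 minus Simpson's-rule integral on [0, t]."""
    h = t / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Bisection for the value whose upper-tail area equals alpha."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid                   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.05, 15), 3))
```

For α = .05 and 15 df this agrees with the tabled value $t_{.05,15} = 1.753$ to three decimals.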

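The t Confidence Interval

Let $\bar{x}$ and s be the sample mean and sample standard deviation computed from the results of a random sample from a normal population with mean µ. Then a 100(1 − α)% t-confidence interval for the mean µ is

$$\left(\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right)$$

or, more compactly: $\bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$.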

Example (cont'd)

GPA measurements for 23 students have a histogram that looks like this:

[Figure: histogram of GPAs; x-axis GPA from 2.6 to 4.0, y-axis Frequency from 0 to 7]

The sample mean is 3.146. The sample standard deviation is 0.308. Calculate a 90% CI for the population mean.
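The calculation for this example can be sketched in a few lines of Python. With n = 23 there are 22 df, and a 90% level needs $t_{.05,22}$; the value 1.717 below is copied from a standard t table rather than computed. Normality of the GPAs is an assumption of the method.

```python
import math

n, xbar, s = 23, 3.146, 0.308
t_crit = 1.717                       # table value t_{.05,22} (90% level, 22 df)

half = t_crit * s / math.sqrt(n)     # half-width of the t interval
lo, hi = xbar - half, xbar + half
print(f"90% CI: ({lo:.3f}, {hi:.3f})")
```

This gives an interval of roughly (3.036, 3.256) for the population mean GPA.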

Confidence Intervals for µ

[Table: choice of interval by sample size (n ≥ 30 vs. n < 30) and by underlying distribution (normal vs. non-normal); one case is labeled "Weirdos"]

When the t-distribution doesn't apply

When n < 30 and the underlying distribution is unknown, we have to:
- Make a specific assumption about the form of the population distribution and derive a CI based on that assumption.
- Use other methods (such as bootstrapping) to make reasonable confidence intervals.
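The slides name bootstrapping without giving a recipe. One common variant, the percentile bootstrap, is sketched below in Python as an illustration only. The data, the function name, the number of resamples, and the seed are all hypothetical. The idea is to resample the data with replacement many times, compute the mean of each resample, and take the middle 90% of those means as the interval.

```python
import random
from statistics import fmean

def bootstrap_mean_interval(data, level=0.90, reps=5000, seed=0):
    """Percentile bootstrap CI for the mean (one of several bootstrap variants)."""
    rng = random.Random(seed)
    n = len(data)
    # Mean of each resample drawn with replacement from the observed data.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(reps))
    alpha = 1 - level
    lo_idx = int((alpha / 2) * reps)
    hi_idx = int((1 - alpha / 2) * reps) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical small, skewed sample (n = 12), made up for illustration.
sample = [1.2, 0.4, 3.1, 0.9, 2.2, 0.3, 5.6, 1.1, 0.7, 1.8, 0.5, 2.9]
lo, hi = bootstrap_mean_interval(sample)
print(f"90% bootstrap CI: ({lo:.2f}, {hi:.2f})")
```

With very small samples the percentile bootstrap can undercover, so the result is better read as a reasonable interval than as an exact 90% one.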

A Confidence Interval for a Population Proportion

Let p denote the proportion of successes in a population (e.g., individuals who graduated from college, computers that do not need warranty service, etc.). A random sample of n individuals is selected, and X is the number of successes in the sample. Then X can be regarded as a binomial rv with mean np and standard deviation $\sigma_X = \sqrt{np(1-p)}$. If both np ≥ 10 and n(1 − p) ≥ 10, X has approximately a normal distribution.

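A Confidence Interval for p

The estimator of p is $\hat{p} = X/n$ (the fraction of successes). $\hat{p}$ has approximately a normal distribution, with $E(\hat{p}) = p$ and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. Standardizing $\hat{p}$ by subtracting p and dividing by $\sigma_{\hat{p}}$ then implies that

$$P\left(-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le z_{\alpha/2}\right) \approx 1 - \alpha$$

Rearranging this statement to isolate p gives the CI for p.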

The EPA considers indoor radon levels above 4 picocuries per liter (pCi/L) of air to be high enough to warrant amelioration efforts. Tests in a sample of 200 homes found 127 (63.5%) of these sampled households to have indoor radon levels above 4 pCi/L. Calculate the 99% confidence interval for the proportion of homes with indoor radon levels above 4 pCi/L.
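The exact CI formula intended for this example did not survive in the transcript, so the Python sketch below computes two standard candidates and should be read as an assumption about the intended method. The first is the traditional large-sample interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The second is the score interval obtained by solving the standardized inequality for p.

```python
import math
from statistics import NormalDist

x, n, level = 127, 200, 0.99
p_hat = x / n
z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{.005}, about 2.576

# Traditional large-sample interval: plug p_hat into the standard error.
half = z * math.sqrt(p_hat * (1 - p_hat) / n)
traditional = (p_hat - half, p_hat + half)

# Score interval: solve the standardized inequality for p.
denom = 1 + z**2 / n
center = (p_hat + z**2 / (2 * n)) / denom
half_s = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
score = (center - half_s, center + half_s)

print(f"traditional: ({traditional[0]:.3f}, {traditional[1]:.3f})")
print(f"score:       ({score[0]:.3f}, {score[1]:.3f})")
```

With n = 200 the two intervals nearly coincide; the traditional one is about (0.547, 0.723). Either way, the data suggest that well over half of homes in the sampled population exceed 4 pCi/L.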

CIs for the Variance

Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters µ and $\sigma^2$. Then the r.v.

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with n − 1 df. (In this class, we don't consider the case where the data is not normally distributed.)

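The Chi-Squared Distribution

Definition: Let ν be a positive integer. The random variable X has a chi-squared distribution with parameter ν if the pdf of X is

$$f(x;\nu) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{(\nu/2)-1} e^{-x/2} \text{ for } x > 0, \quad f(x;\nu) = 0 \text{ for } x \le 0$$

The parameter ν is called the number of degrees of freedom (df) of X. The symbol $\chi^2$ is often used in place of "chi-squared."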

CIs for the Variance

[Figure: graphs of several chi-squared probability density functions]


The chi-squared distribution is not symmetric, so the chi-squared tables contain critical values $\chi^2_{\alpha,\nu}$ both for α near 0 and for α near 1.

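As a consequence,

$$P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$$

Or equivalently,

$$P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) = 1 - \alpha$$

Thus we have a confidence interval for the variance $\sigma^2$. Taking square roots gives a CI for the standard deviation σ.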

Example

For data on the breakdown voltage of electrically stressed circuits, where breakdown voltage is approximately normally distributed: $s^2 = 137{,}324.3$, n = 17.
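The example does not state a confidence level, so the Python sketch below assumes 95% purely for illustration. With n = 17 there are 16 df, and the two critical values are copied from a standard chi-squared table rather than computed.

```python
import math

n, s2 = 17, 137_324.3
chi2_upper = 28.845   # table value chi^2_{.025,16} (assumed 95% level)
chi2_lower = 6.908    # table value chi^2_{.975,16}

# The larger critical value gives the lower endpoint, and vice versa.
var_lo = (n - 1) * s2 / chi2_upper
var_hi = (n - 1) * s2 / chi2_lower
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"95% CI for variance: ({var_lo:,.1f}, {var_hi:,.1f})")
print(f"95% CI for std dev:  ({sd_lo:.1f}, {sd_hi:.1f})")
```

Under that assumption the interval for $\sigma^2$ is roughly (76,172, 318,064) and the interval for σ is roughly (276, 564). Neither is symmetric about its point estimate, which reflects the asymmetry of the chi-squared distribution.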

Confidence Intervals in R