$\chi^2$ distributions and confidence intervals for population variance

22S:39 class notes

Let $Z$ be a standard Normal random variable, i.e., $Z \sim N(0, 1)$. Define $Y = Z^2$. $Y$ is a non-negative random variable. Its distribution is known as the Chi-square distribution with one d.f., written $\chi^2(1)$, whose shape is depicted as follows.

[Figure: density of the $\chi^2(1)$ distribution]

$\chi^2(0.05; 1)$ denotes the upper 5 percentile of a chi-square distribution with 1 d.f.

The sum of $m$ independent squared standard Normal random variables has a Chi-square distribution with $m$ d.f., written $\chi^2(m)$, i.e.,

$$Y = Z_1^2 + Z_2^2 + \cdots + Z_m^2 \sim \chi^2(m).$$

The mean and variance of a $\chi^2(m)$ random variable equal $m$ and $2m$ respectively.

Let $Z_1$, $Z_2$ and $Z_3$ be three independent $N(0, 1)$ random variables. Find $c$ such that $P(Z_1^2 + Z_2^2 + Z_3^2 \ge c) = 0.01$.
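The facts about the chi-square distribution stated above can be checked by simulation. The following is only a rough sketch using Python's standard library; the function names, the number of draws and the seeds are illustrative choices, not part of the notes.

```python
import random
import statistics


def simulate_chi_square(m, n_draws=20000, seed=0):
    """Draw n_draws values of Z_1^2 + ... + Z_m^2 with independent N(0, 1) Z's."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m)) for _ in range(n_draws)]


def upper_percentile(draws, alpha):
    """Empirical value exceeded by roughly a fraction alpha of the draws."""
    ordered = sorted(draws)
    return ordered[int((1.0 - alpha) * len(ordered))]


# Sum of three squared standard Normals: theory says mean 3 and variance 6.
draws_3 = simulate_chi_square(m=3)
mean_3 = statistics.fmean(draws_3)
var_3 = statistics.variance(draws_3)

# One squared standard Normal: the upper 5 percentile chi^2(0.05; 1)
# is tabulated as about 3.84.
draws_1 = simulate_chi_square(m=1, seed=1)
q_1 = upper_percentile(draws_1, 0.05)

print(mean_3, var_3, q_1)
```

The simulated values only approximate the theoretical ones; increasing `n_draws` should tighten the agreement.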

Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu_X, \sigma_X^2)$ population. It can be shown that

$$\sum_{i=1}^{n} (X_i - \bar X)^2 / \sigma_X^2 \sim \chi^2(n-1).$$

Recall that the sample variance is $S^2 = \sum_{i=1}^{n} (X_i - \bar X)^2/(n-1)$. What is $E(S^2)$?

The above distribution result implies that a 95% confidence interval for $\sigma_X^2$ equals

$$\left[(n-1)S^2/\chi^2(0.025; n-1),\ (n-1)S^2/\chi^2(0.975; n-1)\right].$$

Why? (Since the notation refers to upper percentiles, $\chi^2(0.025; n-1)$ is the larger of the two critical values, so it gives the lower endpoint.)

Example. Speed of light.

Student t distributions

For the case of unknown $\sigma_X^2$, it is estimated by the sample variance $S^2 = \{(X_1 - \bar X)^2 + \ldots + (X_n - \bar X)^2\}/(n-1)$. In this case, instead of standardizing the sample mean, we consider studentizing it:

$$T = \frac{\bar X - \mu}{S/\sqrt{n}}.$$

If $\sigma_X$ is known, then $\dfrac{\bar X - \mu}{\sigma_X/\sqrt{n}} \sim N(0, 1)$. The estimation of the population variance implies that $T$ has a more variable distribution than the standard normal. In fact, its distribution is called a t-distribution with $n-1$ degrees of freedom. A t-distribution has zero mean and a pdf that is symmetric about 0. Its tails are thicker than those of the standard normal. A t-distribution approaches the standard normal as its df becomes very large.

Example. The upper 2.5 percentile of a t-distribution with 19 d.f. is denoted as $t(0.025; 19)$. It equals 2.093.

When $\sigma_X^2$ is estimated by $S^2$, the 95% C.I. of $\mu$ becomes

$$\left(\bar X - t(0.025; n-1)\,\hat\sigma_{\bar X},\ \bar X + t(0.025; n-1)\,\hat\sigma_{\bar X}\right), \qquad \hat\sigma_{\bar X} = S/\sqrt{n}.$$

Example. Speed of light.
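As a minimal sketch of the two one-sample intervals above (for the variance and for the mean), the code below takes the critical values from tables as arguments. The data are hypothetical, not the speed-of-light measurements, and the function and variable names are illustrative. The constants are the standard table values for 19 d.f. in the upper-percentile notation of these notes.

```python
import math
import statistics


def variance_ci(sample, chi2_025, chi2_975):
    """95% C.I. for the population variance.

    chi2_025 = chi^2(0.025; n-1) and chi2_975 = chi^2(0.975; n-1),
    both upper percentiles, so chi2_025 is the larger value.
    """
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance, divides by n - 1
    return (n - 1) * s2 / chi2_025, (n - 1) * s2 / chi2_975


def mean_ci_t(sample, t_025):
    """95% C.I. for the mean, with t_025 = t(0.025; n-1)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # estimated sd of the sample mean
    return xbar - t_025 * se, xbar + t_025 * se


# Hypothetical sample with n = 20, so the intervals use 19 d.f.
sample = [9.8, 10.4, 10.1, 9.6, 10.9, 10.3, 9.9, 10.0, 10.7, 9.5,
          10.2, 10.6, 9.7, 10.5, 10.0, 9.9, 10.8, 10.1, 9.4, 10.3]

CHI2_025_19 = 32.852  # upper 2.5 percentile of chi-square, 19 d.f.
CHI2_975_19 = 8.907   # upper 97.5 percentile of chi-square, 19 d.f.
T_025_19 = 2.093      # upper 2.5 percentile of t, 19 d.f.

var_lo, var_hi = variance_ci(sample, CHI2_025_19, CHI2_975_19)
mean_lo, mean_hi = mean_ci_t(sample, T_025_19)
print((var_lo, var_hi), (mean_lo, mean_hi))
```

The variance interval is not symmetric about $S^2$, whereas the interval for the mean is symmetric about $\bar X$.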

It can be shown that if $Z$ is $N(0, 1)$ and independent of $W \sim \chi^2(m)$, then $Z/\sqrt{W/m}$ has a t-distribution with $m$ d.f. Using this result and the result that, for Normal populations, $\bar X$ is independent of $S^2$, we can prove that the studentized ratio

$$\frac{\bar X - \mu_X}{\sqrt{S^2/n}}$$

has a t-distribution with $n-1$ d.f., which forms the basis of the above confidence intervals for the mean $\mu_X$.

Comparing the variances of two populations

Let $X_1, \ldots, X_n \sim N(\mu_X, \sigma_X^2)$ and $Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma_Y^2)$ be two independent random samples. We want to compare the two population variances and estimate the ratio $\sigma_X^2/\sigma_Y^2$ by $S_X^2/S_Y^2$. To construct a C.I. (confidence interval), we need to introduce a new distribution. Let $U$ and $V$ be two independent $\chi^2$ random variables with $a$ and $b$ df. Then the ratio

$$F = \frac{U/a}{V/b}$$

has an $F(a, b)$ distribution. Writing $F(\alpha; a, b)$ for its upper $100\alpha$ percentile, note that $F(1-\alpha; a, b) = 1/F(\alpha; b, a)$, with the two df interchanged. Applying the distribution result, we see that

$$\frac{(m-1)S_Y^2/\{(m-1)\sigma_Y^2\}}{(n-1)S_X^2/\{(n-1)\sigma_X^2\}} = \frac{S_Y^2\,\sigma_X^2}{S_X^2\,\sigma_Y^2} \sim F(m-1, n-1).$$

Therefore, with 95% probability,

$$F(0.975; m-1, n-1) \le \frac{S_Y^2\,\sigma_X^2}{S_X^2\,\sigma_Y^2} \le F(0.025; m-1, n-1).$$

After doing some algebra, the above is equivalent to

$$\frac{S_X^2}{F(0.025; n-1, m-1)\,S_Y^2} \le \frac{\sigma_X^2}{\sigma_Y^2} \le \frac{S_X^2\,F(0.025; m-1, n-1)}{S_Y^2}.$$

The latter is the formula for a 95% C.I. of $\sigma_X^2/\sigma_Y^2$.

Example. Speed of light.

Comparing two population means

Let $X_1, \ldots, X_n \sim N(\mu_X, \sigma_X^2)$ and $Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma_Y^2)$ be two independent random samples. The difference $\mu_X - \mu_Y$ can be estimated by $\bar X - \bar Y$. By independence,

$$\sigma^2_{\bar X - \bar Y} = \sigma^2_{\bar X} + \sigma^2_{\bar Y} = \sigma_X^2/n + \sigma_Y^2/m.$$

If $\sigma_X^2$ and $\sigma_Y^2$ are unknown, they can be estimated respectively by $S_X^2$ and $S_Y^2$. There are three cases to consider in constructing a confidence interval of $\mu_X - \mu_Y$.

Case 1: $\sigma_X^2$ and $\sigma_Y^2$ are known. (If $n > 30$ and $m > 30$, $S_X^2$ and $S_Y^2$ are effectively treated as if they were the true population variances.) Then a 95% C.I. of $\mu_X - \mu_Y$ is given by

$$\left(\bar X - \bar Y - 1.96\,\sigma_{\bar X - \bar Y},\ \bar X - \bar Y + 1.96\,\sigma_{\bar X - \bar Y}\right).$$
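The two intervals derived in this part, for the variance ratio and for the mean difference with known variances, can be sketched as follows. This is an illustration under stated assumptions: the summary statistics are made up, the function names are hypothetical, and the F critical value 4.03 is the usual table value for the upper 2.5 percentile with (9, 9) d.f. The Normal multiplier comes from the standard library and is about 1.96 when `alpha=0.05`.

```python
import math
from statistics import NormalDist


def variance_ratio_ci(s2_x, s2_y, f_025_nm, f_025_mn):
    """95% C.I. for sigma_X^2 / sigma_Y^2.

    f_025_nm = F(0.025; n-1, m-1) and f_025_mn = F(0.025; m-1, n-1),
    both upper 2.5 percentiles read from an F table.
    """
    ratio = s2_x / s2_y
    return ratio / f_025_nm, ratio * f_025_mn


def mean_diff_ci_known(xbar, ybar, var_x, var_y, n, m, alpha=0.05):
    """C.I. for mu_X - mu_Y when both population variances are known (Case 1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = math.sqrt(var_x / n + var_y / m)    # sd of Xbar - Ybar
    diff = xbar - ybar
    return diff - z * se, diff + z * se


# Hypothetical sample variances from n = m = 10 observations each.
ratio_lo, ratio_hi = variance_ratio_ci(s2_x=4.0, s2_y=2.0,
                                       f_025_nm=4.03, f_025_mn=4.03)

# Hypothetical large samples, variances treated as known.
diff_lo, diff_hi = mean_diff_ci_known(xbar=12.0, ybar=10.0,
                                      var_x=4.0, var_y=2.0, n=40, m=50)
print((ratio_lo, ratio_hi), (diff_lo, diff_hi))
```

With small samples the variance-ratio interval is typically very wide, which the (9, 9) d.f. example makes visible.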

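More generally, the $(1-\alpha) \times 100\%$ C.I. of $\mu_X - \mu_Y$ is given by

$$\left(\bar X - \bar Y - z_{\alpha/2}\,\sigma_{\bar X - \bar Y},\ \bar X - \bar Y + z_{\alpha/2}\,\sigma_{\bar X - \bar Y}\right).$$

Case 2: $\sigma_X^2$ and $\sigma_Y^2$ are unknown, but assumed to be identical, i.e., $\sigma_X^2 = \sigma_Y^2 = \sigma^2$. In this case, both $S_X^2$ and $S_Y^2$ are estimators of $\sigma^2$, which can be pooled to form a better estimator:

$$S^2 = S_p^2 = \frac{\{(X_1 - \bar X)^2 + \ldots + (X_n - \bar X)^2\} + \{(Y_1 - \bar Y)^2 + \ldots + (Y_m - \bar Y)^2\}}{n - 1 + m - 1}.$$

It can be shown that $(n - 1 + m - 1)S_p^2/\sigma^2 \sim \chi^2(n + m - 2)$. Estimate $\sigma_{\bar X - \bar Y}$, where

$$\sigma^2_{\bar X - \bar Y} = \sigma^2_{\bar X} + \sigma^2_{\bar Y} = \sigma_X^2/n + \sigma_Y^2/m,$$

by $\hat\sigma_{\bar X - \bar Y} = S_p\sqrt{1/n + 1/m}$. Let the degrees of freedom (df) be $r = n + m - 2$. Then a 95% C.I. of $\mu_X - \mu_Y$ is given by

$$\left(\bar X - \bar Y - t(0.025; r)\,\hat\sigma_{\bar X - \bar Y},\ \bar X - \bar Y + t(0.025; r)\,\hat\sigma_{\bar X - \bar Y}\right).$$

More generally, the $(1-\alpha) \times 100\%$ C.I. of $\mu_X - \mu_Y$ is given by

$$\left(\bar X - \bar Y - t(\alpha/2; r)\,\hat\sigma_{\bar X - \bar Y},\ \bar X - \bar Y + t(\alpha/2; r)\,\hat\sigma_{\bar X - \bar Y}\right).$$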
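The pooled (Case 2) interval can be sketched as below. The two samples are hypothetical and the function name is illustrative. The critical value 2.101 is the standard table value of $t(0.025; 18)$, matching $r = 10 + 10 - 2 = 18$.

```python
import math
import statistics


def pooled_t_ci(x, y, t_crit):
    """C.I. for mu_X - mu_Y assuming equal population variances (Case 2).

    t_crit = t(alpha/2; n + m - 2), read from a t table.
    Returns (lower, upper, pooled variance).
    """
    n, m = len(x), len(y)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    ss_x = sum((v - xbar) ** 2 for v in x)  # sum of squared deviations, X sample
    ss_y = sum((v - ybar) ** 2 for v in y)  # sum of squared deviations, Y sample
    sp2 = (ss_x + ss_y) / (n - 1 + m - 1)   # pooled variance estimate
    se = math.sqrt(sp2) * math.sqrt(1 / n + 1 / m)
    diff = xbar - ybar
    return diff - t_crit * se, diff + t_crit * se, sp2


# Hypothetical samples with n = m = 10.
x = [5.1, 4.8, 5.6, 5.3, 4.9, 5.7, 5.2, 5.0, 5.4, 5.5]
y = [4.6, 4.9, 4.4, 5.0, 4.7, 4.3, 4.8, 4.5, 5.1, 4.2]

T_025_18 = 2.101  # upper 2.5 percentile of t, 18 d.f.
lo, hi, sp2 = pooled_t_ci(x, y, T_025_18)
print(lo, hi, sp2)
```

When the two samples have the same size, the pooled variance is simply the average of the two sample variances.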

Example: Speed of light.

Case 3: $\sigma_X^2$ and $\sigma_Y^2$ are unknown, and they need not be equal. In this case, we can't pool the two variance estimates. Instead, estimate $\sigma_{\bar X - \bar Y}$ by

$$\hat\sigma_{\bar X - \bar Y} = \sqrt{S_X^2/n + S_Y^2/m}.$$

The formula of the C.I. derived in the preceding case continues to hold, except that the df is given by the formula

$$r = \frac{(S_X^2/n + S_Y^2/m)^2}{\frac{1}{n-1}(S_X^2/n)^2 + \frac{1}{m-1}(S_Y^2/m)^2}.$$

Example: Speed of light.

Estimating proportions

Suppose that members of a population can be classified into two types: type I and non-type I. The population proportion of type I's is $p$. Let $X = 1$ if a randomly selected member from the population is of type I and 0 otherwise. The distribution of $X$ is as follows:

    x         0        1
    P(X = x)  1 - p    p
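The Case 3 degrees-of-freedom formula for two means with unequal variances can be sketched as follows. The function names and the summary statistics are hypothetical, chosen only to show the arithmetic.

```python
import math


def case3_df(s2_x, n, s2_y, m):
    """Approximate df r for the unequal-variance (Case 3) interval."""
    a = s2_x / n  # estimated variance of Xbar
    b = s2_y / m  # estimated variance of Ybar
    return (a + b) ** 2 / (a ** 2 / (n - 1) + b ** 2 / (m - 1))


def case3_se(s2_x, n, s2_y, m):
    """Estimated sd of Xbar - Ybar without pooling."""
    return math.sqrt(s2_x / n + s2_y / m)


# Hypothetical summaries: unequal variances and unequal sample sizes.
r = case3_df(s2_x=9.0, n=12, s2_y=1.0, m=8)
se = case3_se(s2_x=9.0, n=12, s2_y=1.0, m=8)

# With equal sample variances and equal sample sizes the formula
# reduces to the pooled df, n + m - 2.
r_equal = case3_df(s2_x=4.0, n=10, s2_y=4.0, m=10)
print(r, se, r_equal)
```

The resulting $r$ is generally not an integer, and it always lies between $\min(n-1, m-1)$ and $n + m - 2$.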

Then, $\mu_X = p$ and $\sigma_X^2 = p(1-p)$. Hence, the population variance is a function of the mean. Let $X_1, \ldots, X_n$ be a random sample from the above population. The sum of the $X$'s equals the number of type I's in the sample. What is the distribution of this sum? The sample mean $\hat p = \bar X = \{X_1 + \ldots + X_n\}/n$ is the sample proportion of type I's. For large sample size, the CLT implies that, approximately, $\bar X \sim N(p, p(1-p)/n)$. When the sample size is large, as in most sample surveys, we can estimate the standard deviation of $\hat p$ by $\hat\sigma_{\hat p} = \sqrt{\hat p(1-\hat p)/n}$. Hence, we have the approximate 95% C.I. of $p$:

$$\left(\hat p - 1.96\,\hat\sigma_{\hat p},\ \hat p + 1.96\,\hat\sigma_{\hat p}\right).$$

Example. It is found that there are 40 defective parts in a sample of 100 parts from a lot of parts. Construct a 95% confidence interval for the true defective rate.

Example. The management introduced a new manufacturing procedure which is supposed to reduce the defective rate of the computer parts. A second sample was obtained which yields 30 defective parts in a random sample of 100 parts. Does the new manufacturing process represent an improvement over the old process?
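As a minimal sketch, the approximate 95% interval for a proportion can be computed for the counts in the two examples above. The function name is illustrative, and the large-sample Normal approximation is assumed to be adequate for n = 100.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Approximate C.I. for a proportion using the large-sample Normal result."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated sd of p_hat
    return p_hat - z * se, p_hat + z * se


# 40 defective parts out of 100 (old process).
old_lo, old_hi = proportion_ci(40, 100)

# 30 defective parts out of 100 (new process).
new_lo, new_hi = proportion_ci(30, 100)

print(round(old_lo, 3), round(old_hi, 3))
print(round(new_lo, 3), round(new_hi, 3))
```

The intervals come out at roughly (0.304, 0.496) and (0.210, 0.390). These are one-sample intervals only; comparing the two processes calls for an interval for the difference of the two proportions.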