An approximate sampling distribution for the t-ratio. Caution: comparing population means when $\sigma_1 \neq \sigma_2$.


Stat 529 (Winter 2011)
Non-pooled t procedures (The Welch test)
Reading: Section 4.3.2

The sampling distribution of $\bar{Y}_1 - \bar{Y}_2$.
An approximate sampling distribution for the t-ratio.
The Sri Lankan analysis.
Pooled or non-pooled?
Caution: comparing population means when $\sigma_1 \neq \sigma_2$.
Performing a good analysis.
Preliminary tests for normality and equal variances.
The modeling game plan.

The non-pooled t procedures

The non-pooled t procedures, or Welch test, are available for performing statistical inference on $\mu_1 - \mu_2$ when $\sigma_1 \neq \sigma_2$. We assume:

1. $Y_{11}, \ldots, Y_{1n_1}$ form a random sample from some population. $Y_{21}, \ldots, Y_{2n_2}$ form a random sample from some other population.
2. The two samples are independent of one another.
3. The population distributions are normal with unknown means $\mu_1$ and $\mu_2$, and with unknown standard deviations $\sigma_1$ and $\sigma_2$.

As for the pooled setting, $\bar{Y}_1 - \bar{Y}_2$ is an estimate of $\mu_1 - \mu_2$.

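The sampling distribution of $\bar{Y}_1 - \bar{Y}_2$

From earlier we know that $\bar{Y}_1 - \bar{Y}_2$ has a normal distribution with mean $\mu_1 - \mu_2$ and S.D.

$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

When $\sigma_1^2 \neq \sigma_2^2$, the t-ratio,

$$t = \frac{\bar{Y}_1 - \bar{Y}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},$$

does not exactly have a t distribution.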
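The S.D. formula above can be checked by simulation. The sketch below is only an illustration: the population means, standard deviations, number of replications, and seed are hypothetical choices, not values from the lecture (only the sample sizes 15 and 11 echo the Sri Lankan example).

```python
import math
import random
import statistics

# Hypothetical population settings chosen for illustration only.
MU1, SIGMA1, N1 = 0.0, 1.0, 15
MU2, SIGMA2, N2 = 0.0, 3.0, 11


def simulate_differences(reps, seed=529):
    """Draw `reps` pairs of independent normal samples and return
    the difference of sample means for each pair."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        sample1 = [rng.gauss(MU1, SIGMA1) for _ in range(N1)]
        sample2 = [rng.gauss(MU2, SIGMA2) for _ in range(N2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs


# S.D. of the difference of sample means from the formula.
theoretical_sd = math.sqrt(SIGMA1**2 / N1 + SIGMA2**2 / N2)

diffs = simulate_differences(20000)
empirical_sd = statistics.stdev(diffs)

print(f"theoretical S.D. = {theoretical_sd:.4f}")
print(f"simulated S.D.   = {empirical_sd:.4f}")
```

With this many replications the simulated S.D. should land close to the theoretical value, up to Monte Carlo error.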

An approximate sampling distribution for the t-ratio

However, the t-ratio does have an approximate t distribution with

$$\text{df} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

degrees of freedom (this is called the Satterthwaite approximation).

We use this as the basis of tests and confidence intervals for statistical inference on $\mu_1 - \mu_2$. See the two-sample non-pooled t/Welch procedures handout.
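As a worked example of the Satterthwaite formula, the sketch below plugs in the sample standard deviations and sample sizes reported for the log10 zinc data (rural: s = 0.3438, n = 15; urban: s = 0.339, n = 11). The function name `welch_df` is a hypothetical helper, not part of any library.

```python
import math


def welch_df(s1, n1, s2, n2):
    """Satterthwaite approximate degrees of freedom for the
    non-pooled (Welch) t-ratio, from sample S.D.s and sizes."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Summary statistics from the log10(zinc content) table.
df = welch_df(0.3438, 15, 0.339, 11)

print(f"Satterthwaite df = {df:.2f}")
print(f"truncated df     = {math.floor(df)}")
```

The result is a little under 22. Truncating it to an integer gives 21, which agrees with the DF = 21 reported in the non-pooled MINITAB output.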

The Sri Lankan analysis

For the log10 data, let us test $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$, where $\mu_1$ is the population mean log10 zinc content found in rural Sri Lankan hair, and $\mu_2$ is the population mean log10 zinc content found in urban Sri Lankan hair.

Here is part of the summaries of the log10 data again:

    Descriptive Statistics: log10(zinc content)

    Population   N  N*    Mean  SE Mean   StDev
    rural       15   0  2.7447   0.0888  0.3438
    urban       11   0   2.966    0.102   0.339

The Sri Lankan analysis, continued

[Slide content not captured in the transcription.]

Pooled or non-pooled?

We use Stat > Basic Statistics > 2-Sample t in MINITAB to compare pooled and non-pooled tests.

When we do not select "Assume equal variance":

    Two-Sample T-Test and CI: log10(zinc content), Population

    Difference = mu (rural) - mu (urban)
    Estimate for difference:  -0.221543
    95% CI for difference:  (-0.503048, 0.059962)
    T-Test of difference = 0 (vs not =): T-Value = -1.64  P-Value = 0.117  DF = 21

When we select "Assume equal variance":

    Two-Sample T-Test and CI: log10(zinc content), Population

    Difference = mu (rural) - mu (urban)
    Estimate for difference:  -0.221543
    95% CI for difference:  (-0.501565, 0.058479)
    T-Test of difference = 0 (vs not =): T-Value = -1.63  P-Value = 0.116  DF = 24
    Both use Pooled StDev = 0.3418
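The two T-Values in the MINITAB output can be reproduced by hand from the summary statistics. The sketch below is an illustration under stated assumptions: it takes the estimated difference (-0.221543) directly from the MINITAB output, because the tabulated means are rounded, and the variable names are hypothetical. It computes only the t-ratios and degrees of freedom, not the P-values.

```python
import math

# Summary statistics from the log10(zinc content) table.
s1, n1 = 0.3438, 15  # rural
s2, n2 = 0.339, 11   # urban
diff = -0.221543     # estimate for mu(rural) - mu(urban), as reported by MINITAB

# Non-pooled (Welch): each sample keeps its own variance estimate.
se_welch = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_welch = diff / se_welch

# Pooled: a single variance estimate on n1 + n2 - 2 degrees of freedom.
df_pooled = n1 + n2 - 2
s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df_pooled)
se_pooled = s_pooled * math.sqrt(1 / n1 + 1 / n2)
t_pooled = diff / se_pooled

print(f"non-pooled: SE = {se_welch:.4f}, t = {t_welch:.2f}")
print(f"pooled:     s_p = {s_pooled:.4f}, SE = {se_pooled:.4f}, "
      f"t = {t_pooled:.2f}, df = {df_pooled}")
```

Because the two sample standard deviations are nearly equal here, the two standard errors are almost the same, so the pooled and non-pooled t-ratios differ only in the second decimal place.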

Pooled or non-pooled, continued

Advantages of pooled:
1. The pooled test is slightly more powerful than the non-pooled test when $\sigma_1 = \sigma_2$ (especially when $n_1$ is much smaller than $n_2$, or vice versa).
2. The model for pooled procedures is commonly used in other statistical procedures such as analysis of variance (ANOVA) or regression.

Advantages of non-pooled:
1. Robust to departures from the $\sigma_1 = \sigma_2$ assumption.

A recommendation:

Caution: comparing population means when $\sigma_1 \neq \sigma_2$

If two distributions have the same shape and spread, then a difference of means is an adequate summary of the difference in the distributions.

When the spreads differ, $\mu_1 - \mu_2$ may be a bad summary of the difference in the distributions.

Performing a good analysis

Bring all your subject matter knowledge to the table when you perform an analysis. Should $\sigma_1 = \sigma_2$? Should you expect the populations to be normal? Or are they normal after some transformation?

Bring statistical knowledge to the table too! What weaknesses does the statistical procedure you are using have? Is it robust to departures from its assumptions? Is it resistant?

Does your experiment/study suggest weaknesses that may invalidate the analysis/model?

Preliminary tests for normality and equal variances

A number of the t-procedures assume that the data are drawn from a normal population. The pooled t-procedures also assume equality of variances.

There are a number of tests in the literature for testing normality and equality of variance (i.e., $\sigma_1 = \sigma_2$). For example, in testing for equality (homogeneity) of variance:
1. Bartlett's test relies heavily on the assumption of normality: even with large samples, the S.E. used in the test is wrong if the populations are not normal.
2. Levene's test is more robust to departures from normality.

Some advice:

The modeling game plan

1. Look at your data (via graphical/numerical summaries).
2. Transform as necessary (for analysis or interpretation).
3. Analyze with or without outliers.
4. Report conclusions.

NOTE: Outliers are local features of the data. A transformation (a global feature) can remove the problem of outliers. Always do 2. before 3.
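The note about transformations and outliers can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are hypothetical (not the zinc measurements), and the 1.5 x IQR fence is one common rule of thumb for flagging outliers, not a rule stated in the lecture.

```python
import math
import statistics


def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences
    (a common rule of thumb, assumed here for illustration)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical right-skewed data with one very large value.
raw = [10, 20, 40, 80, 160, 320, 640, 1280, 20480]
logged = [math.log10(v) for v in raw]

raw_outliers = iqr_outliers(raw)     # flagged on the original scale
log_outliers = iqr_outliers(logged)  # flagged after the log10 transform

print("outliers on raw scale:  ", raw_outliers)
print("outliers on log10 scale:", log_outliers)
```

On the original scale the largest value falls outside the fences, while on the log10 scale no value is flagged. This is why the plan transforms (step 2) before deciding how to handle outliers (step 3).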