The Two Sample T-test with One Variance Unknown


Arnab Maity
Department of Statistics, Texas A&M University, College Station TX, U.S.A.

Michael Sherman
Department of Statistics, Texas A&M University, College Station TX, U.S.A.

Summary

We consider the situation in two sample testing when one variance is assumed to be known while the other variance is considered unknown. This situation arises, for example, when one is interested in comparing a standard treatment with a new treatment. Although this situation occurs relatively infrequently, our example discusses the important tool of moment matching and makes the classic two sample Satterthwaite t-approximation transparent.

Key words: Two sample t-test; Moments; Satterthwaite; Behrens-Fisher.

1 Introduction

The two sample t-test is a mainstay in statistical practice and is introduced in most, if not all, introductory statistics classes. A common assumption for the two samples is that they come from normally distributed populations with different means. When the two variances are known, this leads directly to a test statistic with a standard normal distribution. If the variances are unknown, then under the assumption of equal variances the two sample test uses a pooled estimate of variance. In this case the resulting test statistic has an exact t-distribution. In the case where the two variances may differ, the two variances are estimated separately. The resulting test statistic does not have an exact t-distribution. Nevertheless, it is common to approximate its distribution by a t-distribution with random degrees of freedom based on the data.

In this note we consider the case where one of the variances is assumed to be known and the other is treated as unknown. This situation arises in practice when a standard treatment is being compared to a new treatment. The standard treatment leads to a known variance, while the new treatment may have a different mean (which is our main interest) and may well have a different variance. We sketch the usual distribution theory for the known cases and use the methodology to derive an approximate t-test in the two sample case with one variance known and one unknown. An interesting observation is that the method of deriving Satterthwaite's approximate degrees of freedom for the unequal variance case also works when one variance is assumed known and the other one is estimated. We derive the approximate t-test in the case of one unknown variance, illustrate with an example, and carry out a small simulation to evaluate the accuracy of the approximate t-distribution. This example illustrates an important approximation technique of moment matching and makes the Satterthwaite approximation transparent.

This note was motivated by a question from a graduate student in an introductory class for engineering students.

2 Hypothesis Tests for a Difference in Means of Two Normal Distributions

Let $Y_{11},\dots,Y_{1n} \overset{iid}{\sim} \mathrm{Normal}(\mu_1,\sigma_1^2)$ and $Y_{21},\dots,Y_{2m} \overset{iid}{\sim} \mathrm{Normal}(\mu_2,\sigma_2^2)$, where the notation $\overset{iid}{\sim}$ means "independent and identically distributed as". Throughout, we are interested in the quantity $\mu_1-\mu_2$ and we want to test $H_0: \mu_1-\mu_2=\Delta$, for $\Delta$ a fixed difference of interest.

2.1 An Overview of Existing Tests: Both Variances are Known or Both are Unknown

Assume that the sample size from the first population is $n$ and from the second $m$. The simplest case arises when the variances of both populations, i.e., $\sigma_1^2$ and $\sigma_2^2$, are known. In this case, the z-statistic

$$Z=\frac{\bar{Y}_1-\bar{Y}_2-\Delta}{\sqrt{\dfrac{\sigma_1^2}{n}+\dfrac{\sigma_2^2}{m}}}$$

is used. Under $H_0$, $Z$ has a $\mathrm{Normal}(0,1)$ distribution and the test can be conducted based on this fact.

A more complicated situation occurs when the population variances are unknown. In this case the above z-test no longer applies, since the variances need to be estimated. If we assume $\sigma_1^2=\sigma_2^2=\sigma^2$, it is reasonable to combine the two samples to estimate $\sigma^2$, rather than estimating it separately for the two samples. The usual pooled estimator of $\sigma^2$ is given by

$$S_{pooled}^2=\frac{(n-1)S_1^2+(m-1)S_2^2}{n+m-2},$$

where $S_1^2=\sum_{i=1}^{n}(Y_{1i}-\bar{Y}_1)^2/(n-1)$ and $S_2^2=\sum_{i=1}^{m}(Y_{2i}-\bar{Y}_2)^2/(m-1)$. Then the test statistic

$$T_1=\frac{\bar{Y}_1-\bar{Y}_2-\Delta}{\sqrt{S_{pooled}^2\left(\dfrac{1}{n}+\dfrac{1}{m}\right)}}$$

is used and has an exact t-distribution with $\gamma_1=n+m-2$ degrees of freedom under $H_0$.

A more realistic case arises when $\sigma_1^2\neq\sigma_2^2$. This is known as the Behrens-Fisher problem. In this case the test statistic

$$T_2=\frac{\bar{Y}_1-\bar{Y}_2-\Delta}{\sqrt{\dfrac{S_1^2}{n}+\dfrac{S_2^2}{m}}} \tag{1}$$

does not have an exact t-distribution. The difficulty lies in the fact that $\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-1}\left(\frac{S_1^2}{n}+\frac{S_2^2}{m}\right)$ does not have an appropriate $\chi^2$ distribution. However, Satterthwaite (1941, 1946) proposed a method which approximates the exact distribution by a suitable $\chi^2$ distribution. Thus, using Satterthwaite's approximation,

$$\gamma_2\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-1}\left(\frac{S_1^2}{n}+\frac{S_2^2}{m}\right)\sim\chi^2_{\gamma_2},\quad\text{approximately.}$$
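The three statistics defined so far ($Z$, the pooled statistic, and the unequal-variance statistic) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not code from the authors; the function names, the `delta` argument (the hypothesized difference under the null), and the toy data are assumptions made for the example.

```python
import math
from statistics import mean, variance


def z_statistic(y1, y2, sigma1_sq, sigma2_sq, delta=0.0):
    """Z for H0: mu1 - mu2 = delta when both variances are known."""
    n, m = len(y1), len(y2)
    return (mean(y1) - mean(y2) - delta) / math.sqrt(sigma1_sq / n + sigma2_sq / m)


def pooled_t_statistic(y1, y2, delta=0.0):
    """Pooled statistic and its exact degrees of freedom n + m - 2."""
    n, m = len(y1), len(y2)
    # statistics.variance uses the (k - 1) denominator, matching S^2 in the text.
    s_pooled_sq = ((n - 1) * variance(y1) + (m - 1) * variance(y2)) / (n + m - 2)
    t1 = (mean(y1) - mean(y2) - delta) / math.sqrt(s_pooled_sq * (1 / n + 1 / m))
    return t1, n + m - 2


def unequal_variance_t_statistic(y1, y2, delta=0.0):
    """Behrens-Fisher setting: each variance is estimated separately."""
    n, m = len(y1), len(y2)
    return (mean(y1) - mean(y2) - delta) / math.sqrt(variance(y1) / n + variance(y2) / m)


# Toy data, invented purely for illustration.
y1 = [1.0, 2.0, 3.0, 4.0]
y2 = [2.0, 4.0, 6.0]
print("Z  =", z_statistic(y1, y2, sigma1_sq=1.0, sigma2_sq=1.0))
print("T1 =", pooled_t_statistic(y1, y2))
print("T2 =", unequal_variance_t_statistic(y1, y2))
```

Only the denominators differ between the three functions, which is exactly where the distributional difficulty of the unequal-variance case comes from.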

The value of $\gamma_2$ is obtained by matching the variances on both sides of the above equation. This gives

$$\gamma_2=\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{2}\left\{\frac{(\sigma_1^2/n)^2}{n-1}+\frac{(\sigma_2^2/m)^2}{m-1}\right\}^{-1}.$$

$\gamma_2$ is estimated by $\hat{\gamma}_2$, where we substitute $S_1^2$ and $S_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$. Thus, $T_2$ has an approximate t-distribution with $\hat{\gamma}_2$ degrees of freedom. We use an entirely analogous method in Section 2.2 in the case of one variance assumed to be known. See, e.g., Scheffé (1970) for more background on the Behrens-Fisher problem. For further information on estimation of the degrees of freedom see, e.g., Ames and Webster (1991) or Moser and Stevens (1992).

The two sample t-test is a special case of the general linear model $Y=X\beta+\epsilon$, where $\epsilon$ is distributed as $\mathrm{Normal}[0,V(\theta)]$ and $\beta$ is a column vector of fixed effect parameters. The test statistic used to test a linear hypothesis, $H_0: l^T\beta=0$, is

$$t=\frac{l^T\hat{\beta}}{\sqrt{l^T\hat{C}l}}, \tag{2}$$

where $\hat{C}$ is an estimate of the covariance matrix of $\hat{\beta}$. The general Satterthwaite approximation for the denominator degrees of freedom is computed as

$$\gamma_{gen}=\frac{2\,(l^T\hat{C}l)^2}{g^TAg}, \tag{3}$$

where $g$ is the gradient of $l^TCl$ (with respect to $\theta$), evaluated at $\hat{\theta}$, and $A$ is the asymptotic variance-covariance matrix of $\hat{\theta}$ obtained from the second derivative matrix of the likelihood equations. In this general situation, $t$ in (2) has an approximate t-distribution with $\gamma_{gen}$ degrees of freedom. See Giesbrecht and Burns (1985) and the SAS/STAT 9.1 User's Guide (2004) for details.

2.2 The Variance of One Population is Known but the Other is Unknown ($\sigma_1$ known but $\sigma_2$ unknown)

The situation where the variance of one population is known but the other is unknown arises when a new treatment is compared to an old or standard treatment. The variance for the standard treatment may be considered known from historical data, while the variance under the new treatment may not be the same as the old one. In this case, pooling the samples may not be a good idea since the variances may be very different from each other. On the other hand, using $T_2$ as in (1) requires an estimate for the variance of the standard treatment, which is assumed to be known. Since $\sigma_1$ is known, a reasonable test statistic is

$$T_3=\frac{\bar{Y}_1-\bar{Y}_2-\Delta}{\sqrt{\dfrac{\sigma_1^2}{n}+\dfrac{S_2^2}{m}}}. \tag{4}$$

However, $T_3$ does not have an exact t-distribution since $\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-1}\left(\frac{\sigma_1^2}{n}+\frac{S_2^2}{m}\right)$ does not have an appropriate $\chi^2$ distribution. Hence, our goal is to find the appropriate degrees of freedom, $\gamma_3$, such that

$$\gamma_3\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-1}\left(\frac{\sigma_1^2}{n}+\frac{S_2^2}{m}\right)\sim\chi^2_{\gamma_3},\quad\text{approximately.} \tag{5}$$

One way to achieve this is to use the method of moments. The two sides in (5) clearly have the same expectation, $\gamma_3$, so we seek to match variances:

$$\gamma_3^2\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-2}\operatorname{var}\left(\frac{\sigma_1^2}{n}+\frac{S_2^2}{m}\right)=2\gamma_3
\;\Longrightarrow\;
\gamma_3=\left\{\frac{(\sigma_2^2/m)^2}{m-1}\right\}^{-1}\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{2},$$

since $\operatorname{var}(S_2^2)=2\sigma_2^4/(m-1)$. Since $\sigma_2$ is unknown, and not assumed to be the known value of $\sigma_1$, we estimate $\sigma_2^2$ by $S_2^2$ and obtain

$$\hat{\gamma}_3=\frac{\left(\dfrac{\sigma_1^2}{n}+\dfrac{S_2^2}{m}\right)^{2}}{\dfrac{(S_2^2/m)^2}{m-1}}. \tag{6}$$

Hence we can say that

$$\hat{\gamma}_3\left(\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}\right)^{-1}\left(\frac{\sigma_1^2}{n}+\frac{S_2^2}{m}\right)\sim\chi^2_{\hat{\gamma}_3},\quad\text{approximately.}$$

Hence, under $H_0$,

$$T_3=\frac{\bar{Y}_1-\bar{Y}_2-\Delta}{\sqrt{\dfrac{\sigma_1^2}{n}+\dfrac{S_2^2}{m}}}\sim t_{\hat{\gamma}_3},\quad\text{approximately.}$$

Based on this we can now conduct level-$\alpha$ t-tests:

Alternative hypothesis            Rejection criterion
$H_a: \mu_1-\mu_2\neq\Delta$      $T_3>t_{\alpha/2,\hat{\gamma}_3}$ or $T_3<-t_{\alpha/2,\hat{\gamma}_3}$
$H_a: \mu_1-\mu_2>\Delta$         $T_3>t_{\alpha,\hat{\gamma}_3}$
$H_a: \mu_1-\mu_2<\Delta$         $T_3<-t_{\alpha,\hat{\gamma}_3}$
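The estimated degrees of freedom in (6) and the statistic in (4) translate directly into code. The sketch below is one possible Python rendering, not the authors' implementation; the function names, the `delta` argument for the hypothesized difference, and the numbers in the usage lines are hypothetical. The critical value is taken as an input because the Python standard library has no t quantile function, so it has to come from a table or a statistics package.

```python
import math


def gamma3_hat(sigma1_sq, n, s2_sq, m):
    """Moment-matched degrees of freedom from (6)."""
    known_part = sigma1_sq / n    # sigma_1^2 / n: known, no sampling variability
    estimated_part = s2_sq / m    # S_2^2 / m: the only estimated component
    return (m - 1) * (known_part + estimated_part) ** 2 / estimated_part ** 2


def t3_statistic(ybar1, ybar2, sigma1_sq, n, s2_sq, m, delta=0.0):
    """Test statistic from (4) for H0: mu1 - mu2 = delta."""
    return (ybar1 - ybar2 - delta) / math.sqrt(sigma1_sq / n + s2_sq / m)


def reject_two_sided(t3, t_crit):
    """Two-sided rule: reject when |T3| exceeds the supplied upper alpha/2 quantile."""
    return abs(t3) > t_crit


# Hypothetical summary statistics, for illustration only.
df = gamma3_hat(sigma1_sq=4.0, n=10, s2_sq=9.0, m=6)
t3 = t3_statistic(ybar1=10.0, ybar2=11.0, sigma1_sq=4.0, n=10, s2_sq=9.0, m=6)
print("gamma3_hat =", df)
print("T3 =", t3)
```

Because the known component adds to the numerator of (6) but not to its denominator, the estimated degrees of freedom are never smaller than $m-1$, and they equal $m-1$ only when the known component vanishes.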

Note that $\hat{\gamma}_3$ in (6) can be derived as a special case of $\gamma_{gen}$ in (3) by considering a model where $Y=(Y_{11},\dots,Y_{1n},Y_{21},\dots,Y_{2m})^T$, $\beta=(\mu_1,\mu_2)^T$, $l=(1,-1)^T$, $X$ is an $(n+m)\times 2$ matrix whose first $n$ rows are $(1,0)$ and last $m$ rows are $(0,1)$, and $V$ is an $(m+n)\times(m+n)$ diagonal matrix with its first $n$ diagonal entries being the known variance $\sigma_1^2$ and the others being $\sigma_2^2$ [mathematical details are available from the authors].

3 A Data Example

We consider data from a problem in the textbook of Montgomery and Runger (2003). We quote the problem:

A polymer is manufactured in a batch chemical process. Viscosity measurements are normally made on each batch, and long experience with the process has indicated that the variability of the process is fairly stable with σ = 20. Fifteen batch viscosity measurements are given as follows: 724, 718, 776, 760, 745, 759, 795, 756, 742, 740, 761, 749, 739, 747, 742. A process change is made which involves switching the type of catalyst involved in the process. Following the process change, eight batch viscosity measurements are taken: 735, 775, 729, 755, 783, 760, 738, 780. Assume that process variability is unaffected by the catalyst change. Find a 90% confidence interval on the difference in mean batch viscosity resulting from the process change.

Note that in the problem it is assumed that switching the type of catalyst does not affect the variance. We are interested in a possible shift of the mean and hence it seems we should allow for a shift in the variances as well.

Analysis: We proceed by assuming that $\sigma_1=20$ is known but that $\sigma_2$ is unknown, i.e., that the process variability is affected by the catalyst change. Then we use the approximate t-test developed in Section 2.2 with degrees of freedom

$$\hat{\gamma}_3=\frac{\left(\dfrac{\sigma_1^2}{n}+\dfrac{S_2^2}{m}\right)^{2}}{\dfrac{(S_2^2/m)^2}{m-1}}=15.15.$$

Also, $n=15$ and $m=8$. Hence a 90% CI is given by

$$\left(\bar{Y}_1-\bar{Y}_2-t_{0.05,\hat{\gamma}_3}\sqrt{\sigma_1^2/n+S_2^2/m},\;\; \bar{Y}_1-\bar{Y}_2+t_{0.05,\hat{\gamma}_3}\sqrt{\sigma_1^2/n+S_2^2/m}\right)=(-22.6638,\ 9.3138).$$
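The calculation in this example can be retraced with a short script. The sketch below is a standard-library-only illustration under stated assumptions: the viscosity readings are entered as I understand the textbook problem (treat the two lists as an assumption), the known standard deviation of the first process is taken to be 20, and the helpers `t_cdf` and `t_quantile` are ad hoc numerical routines written for this example (Simpson's rule plus bisection), not library functions. The printed degrees of freedom and interval can be compared with the values reported above.

```python
import math
from statistics import mean, variance

# Readings as assumed for this sketch (before and after the catalyst change).
before = [724, 718, 776, 760, 745, 759, 795, 756, 742, 740, 761, 749, 739, 747, 742]
after = [735, 775, 729, 755, 783, 760, 738, 780]
sigma1_sq = 20.0 ** 2  # assumed known variance of the standard process


def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on the density (steps must be even)."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density over [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df, lo=0.0, hi=50.0):
    """Quantile for p > 0.5 by bisection on t_cdf (df may be non-integer)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n, m = len(before), len(after)
s2_sq = variance(after)                  # S_2^2 with the (m - 1) denominator
var_diff = sigma1_sq / n + s2_sq / m     # squared denominator of the statistic
gamma3 = (m - 1) * var_diff ** 2 / (s2_sq / m) ** 2
diff = mean(before) - mean(after)
half_width = t_quantile(0.95, gamma3) * math.sqrt(var_diff)
lower, upper = diff - half_width, diff + half_width

print("estimated degrees of freedom:", round(gamma3, 2))
print("90% CI:", (round(lower, 4), round(upper, 4)))
```

In practice a statistics package would supply the t quantile; the hand-rolled version is used here only to keep the sketch free of third-party dependencies.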

For comparison's sake we consider the assumption that the process variability is unaffected by the catalyst change. In this situation, $\sigma_1=\sigma_2=20$. So a 90% CI for $\mu_1-\mu_2$ is given by

$$\left(\bar{Y}_1-\bar{Y}_2-z_{0.05}\sqrt{\sigma_1^2/n+\sigma_2^2/m},\;\; \bar{Y}_1-\bar{Y}_2+z_{0.05}\sqrt{\sigma_1^2/n+\sigma_2^2/m}\right)=(-21.0773,\ 7.7273).$$

We see that the two intervals are similar but that the first interval is wider than the second. The second is narrower because it assumes that the variance is known in the post-process-change group. The rough similarity between the two intervals is compatible with the usual F-test for equality of variances. Testing $H_0: \sigma_2=20$ vs $H_a: \sigma_2\neq 20$, we find that the p-value is 0.678.

It is interesting to see how the intervals compare in a case where the variances are more different. We take the same example as in the previous section but change the second-sample values to $Y_{2,new}$ = [76, 83, 73, 755, 8, 786, 74, 79], so that its sample variance differs more from $\sigma_1^2$. In this case, the results are given below.

Assumption                                          90% CI for $\mu_1-\mu_2$    $H_0: \mu_1-\mu_2=0$
$\sigma_1$ and $\sigma_2$ are known                 (−33.73, )                  Reject $H_0$
$\sigma_1$ is known and $\sigma_2$ is unknown       (−44.78, 5.478)             Fail to reject $H_0$

We see that the results are very different. If we perform a test of $H_0: \mu_1-\mu_2=0$ based on the intervals, these two methods yield different conclusions at $\alpha=0.1$.

A simulation, based on the preceding data values, was conducted with $n=15$ and $m=8$, where we generate data sets with $Y_{11},\dots,Y_{1n}\sim\mathrm{Normal}(\mu_1=750.2,\ \sigma_1=20)$ and $Y_{21},\dots,Y_{2m}\sim\mathrm{Normal}(\mu_2=750.2,\ \sigma_2=4)$. Tests for $H_0: \mu_1-\mu_2=0$ vs $H_a: \mu_1-\mu_2\neq 0$ are performed at α = . for each of the data sets and type I errors are estimated. The test based on $T_3$ gives a type I error of . whereas tests based on $T_1$ and $T_2$ produce type I errors of .6 and ., respectively.

Next, we compare the empirical distribution of our test statistic with the approximate t-distribution. We take n = and m = 5 with $\sigma_1$ = 4 and $\sigma_2$ = . We generate data sets and for each data set we compute the two sample t-statistic, $T_3$. It is interesting to compare the empirical distributions of $T_1$, $T_2$ and $T_3$ with their corresponding hypothesized t-distributions.
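A simulation in the spirit of the one described above can be sketched with the standard library alone. The code below is an illustrative sketch, not the authors' simulation: the settings (n = 15, m = 8, standard deviations 20 and 40, 20,000 replications, the seed) are assumptions chosen for the example. It checks two things: that the scaled denominator has mean close to the matched degrees of freedom and variance close to twice that value, which is what moment matching enforces, and how the empirical variance of the one-known-variance statistic compares with the variance of the matched t-distribution.

```python
import math
import random


def sample_mean_var(values):
    """Sample mean and variance with the (k - 1) denominator."""
    k = len(values)
    avg = sum(values) / k
    return avg, sum((v - avg) ** 2 for v in values) / (k - 1)


def simulate_t3(n, m, sigma1, sigma2, reps, seed=0):
    """Draw the statistic under H0 (equal means) with sigma1 treated as known."""
    rng = random.Random(seed)
    known = sigma1 ** 2 / n
    target = known + sigma2 ** 2 / m  # true variance of the difference in means
    gamma3 = (m - 1) * target ** 2 / (sigma2 ** 2 / m) ** 2
    t3_values, w_values = [], []
    for _ in range(reps):
        # Only the mean of sample 1 enters the statistic, so draw it directly.
        ybar1 = rng.gauss(0.0, sigma1 / math.sqrt(n))
        ybar2, s2_sq = sample_mean_var([rng.gauss(0.0, sigma2) for _ in range(m)])
        denom_sq = known + s2_sq / m
        t3_values.append((ybar1 - ybar2) / math.sqrt(denom_sq))
        w_values.append(gamma3 * denom_sq / target)  # left-hand side of the chi-square approximation
    return gamma3, t3_values, w_values


gamma3, t3_draws, w_draws = simulate_t3(n=15, m=8, sigma1=20.0, sigma2=40.0, reps=20000, seed=1)
w_mean, w_var = sample_mean_var(w_draws)
_, t3_var = sample_mean_var(t3_draws)

print("matched degrees of freedom:", round(gamma3, 3))
print("scaled denominator mean, variance:", round(w_mean, 3), round(w_var, 3))
print("targets for mean, variance:", round(gamma3, 3), round(2 * gamma3, 3))
print("empirical var of statistic vs t variance:", round(t3_var, 3), round(gamma3 / (gamma3 - 2), 3))
```

The first two moments of the scaled denominator agree with the chi-square targets by construction; any remaining gap between the statistic and the matched t-distribution comes from higher moments, which the method does not match.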
Kernel density estimates of $T_1$, $T_2$ and $T_3$ (using an Epanechnikov kernel) are shown in Figures 1, 2 and 3, respectively. It is clear the agreement is very close for $T_2$ and $T_3$. However, for $T_1$ the discrepancy is clearly visible. We see that the assumption of equal variances is not desirable in this case.

Finally, we compare the power curves of the tests based on $T_2$ and $T_3$. We consider two configurations:

Setup 1. n = 5 and m = 5 with $\sigma_1$ = 4 and $\sigma_2$ =
Setup 2. n = and m = 5 with $\sigma_1$ = 4 and $\sigma_2$ =

We take $\mu_1$ = and $\mu_2=\mu_1+\Delta$ (for $\Delta$ between 5 and 5) and estimate the power of the tests for different values of $\Delta$. Results for Setup 1 and Setup 2, based on simulations, are given in Figures 4 and 5, respectively. It is evident that if the size of the sample with known variance is small, the test based on $T_3$ produces larger power than that based on $T_2$. This effect, however, diminishes as the sample size increases.

We reiterate that this example comes from a query from a student. Its practical importance is limited by the fact that typically both variances are unknown, and for this reason $T_2$ is more generally applicable than $T_3$.

References

Ames, M.H. and Webster, J.T. (1991). On Estimating Approximate Degrees of Freedom, The American Statistician, 45.

Giesbrecht, F.G. and Burns, J.C. (1985). Two-Stage Analysis Based on a Mixed Model: Large-Sample Asymptotic Theory and Small-Sample Simulation Results, Biometrics, 41.

Montgomery, D.C. and Runger, G.C. (2003). Applied Statistics and Probability for Engineers, 3rd edition, John Wiley and Sons.

Moser, B.K. and Stevens, G.R. (1992). Homogeneity of Variance in the Two-Sample Means Test, The American Statistician, 46, 19-21.

SAS Institute Inc. (2004). SAS/STAT 9.1 User's Guide. Cary, NC: SAS Institute Inc.

Satterthwaite, F.E. (1941). Synthesis of Variance, Psychometrika, 6.

Satterthwaite, F.E. (1946). An Approximate Distribution of Estimates of Variance Components, Biometrics Bulletin, 2, 110-114.

Scheffé, H. (1970). Practical Solutions of the Behrens-Fisher Problem, Journal of the American Statistical Association, 65.

Figure 1: Plots of the density estimate of $T_1$ (solid line) and actual t-density with $\gamma_1=n+m-2$ degrees of freedom (dashed line), along with the critical values (vertical lines) for two-sided 5% tests for $H_0: \mu_1-\mu_2=0$.

Figure 2: Plots of the density estimate of $T_2$ (solid line) and actual t-density with $\gamma_2$ degrees of freedom (dashed line), along with the critical values (vertical lines) for two-sided 5% tests for $H_0: \mu_1-\mu_2=0$.

Figure 3: Plots of the density estimate of $T_3$ (solid line) and actual t-density with $\gamma_3$ degrees of freedom (dashed line), along with the critical values (vertical lines) for two-sided 5% tests for $H_0: \mu_1-\mu_2=0$.

Figure 4: Power curves of the t-tests based on $T_3$ (solid line) and $T_2$ (dashed line) when n = 5 and m = 5.

Figure 5: Power curves of the t-tests based on $T_3$ (solid line) and $T_2$ (dashed line) when n = and m = 5.
