Chapter 11: Inference for Distributions
11.1 Inference for Means of a Population
11.2 Comparing Two Means

Population Standard Deviation
In the previous chapter, we computed confidence intervals and performed significance tests for means under the assumption that we knew the population standard deviation σ. This is almost never the case. In practice, we estimate σ with the sample standard deviation s when we take our sample from the population of interest. Estimating σ forces us to alter, ever so slightly, the calculations we did in Chapter 10, but the interpretation of confidence intervals and significance tests stays the same.

Conditions for Inference about a Mean (p. 617)
The data are an SRS from the population of interest.
Observations from the population have a normal distribution with mean µ and standard deviation σ. A symmetric, single-peaked shape is the essential requirement.

Moving away from z
In Chapter 10, when we knew σ, we calculated a z-score for a particular sample mean as follows:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Now we do not know σ, so we calculate a t-score, which provides somewhat of a cushion because we must estimate σ from the sample:
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
The denominator $s / \sqrt{n}$ is the standard error of the mean.
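
A minimal Python sketch of the t-score calculation, using only the standard library. The data values and the hypothesized mean are made up for illustration; they do not come from the textbook.

```python
import math
import statistics

def t_score(sample, mu):
    """Return t = (x_bar - mu) / (s / sqrt(n)) for one sample."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)         # sample standard deviation (n - 1 in the denominator)
    standard_error = s / math.sqrt(n)    # standard error of the mean
    return (x_bar - mu) / standard_error

# Hypothetical data: 8 observations, compared against mu = 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t = t_score(sample, mu=4)
print(f"t = {t:.3f} with {len(sample) - 1} degrees of freedom")
```

The only change from the z-score is that s replaces σ, which is why the result is compared with a t critical value rather than z*.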

Turn to Table C (back cover)
For a two-sided significance test at alpha = 0.05 (95% confidence), our z* is 1.960. That means that to reject the null, our calculated z must satisfy:
$|z| > 1.960$
However, we must move up that column to a larger critical value if we estimate σ from the sample, which is almost always the case.

Table C
If we had a sample of n = 25, we would have n - 1 = 24 degrees of freedom (p. 51). Our critical value would then be 2.064, so to reject the null we need:
$|t| = \left| \frac{\bar{x} - \mu}{s / \sqrt{n}} \right| > 2.064$
What happens to our critical values as we move up the columns (smaller sample sizes)?
What happens to our critical values as we move down the columns (larger sample sizes)?
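
To see how the critical value changes with the degrees of freedom, here is a rough sketch that approximates Table C numerically. It uses only the Python standard library; the helper names (`t_pdf`, `t_cdf`, `t_critical`) are our own, and the integration and bisection are a simple approximation, not how the table was built.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Find t* with P(T > t*) = upper_tail by bisection (assumes t* lies below 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid   # tail area still too large, so t* is further out
        else:
            hi = mid
    return (lo + hi) / 2

# Two-sided alpha = 0.05 puts 0.025 in each tail.
z_star = NormalDist().inv_cdf(0.975)
for df in (4, 9, 24, 100):
    print(f"df = {df:>3}: t* = {t_critical(0.025, df):.3f}")
print(f"normal:   z* = {z_star:.3f}")
```

The printed values shrink toward z* as the degrees of freedom grow, which answers the two questions on the slide.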

Figure 11.1, p. 618
For fewer degrees of freedom (df), the t-distribution has more probability in the tails, making it more difficult to reject the null. Shouldn't it be?

Practice
Exercises 11.2 and 11.4, p. 620

One-sample t-procedures (p. 622)
Confidence interval: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
Significance test: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
In both cases, σ is unknown.
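
Both one-sample procedures can be sketched in a few lines of Python. The sample below is hypothetical (n = 25, so df = 24), and the critical value 2.064 is the Table C entry for 95% confidence with 24 degrees of freedom; it is passed in by hand rather than computed.

```python
import math
import statistics

def one_sample_t(sample, mu_0, t_star):
    """Return (confidence interval, t statistic) for one sample.

    t_star is the critical value read from Table C for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = t_star * standard_error
    t = (x_bar - mu_0) / standard_error
    return (x_bar - margin, x_bar + margin), t

# Hypothetical sample of n = 25, so df = 24 and t* = 2.064 for 95% confidence.
sample = [98, 102, 101, 97, 105, 99, 100, 103, 96, 104, 101, 99, 102,
          98, 100, 106, 95, 101, 103, 97, 100, 102, 99, 104, 98]
T_STAR = 2.064
interval, t = one_sample_t(sample, mu_0=99, t_star=T_STAR)
reject = abs(t) > T_STAR   # two-sided test at alpha = 0.05
print(f"95% CI: ({interval[0]:.2f}, {interval[1]:.2f}), t = {t:.2f}, reject H0: {reject}")
```

The two-sided test at alpha = 0.05 rejects exactly when the hypothesized mean falls outside the 95% interval, so the two outputs always agree.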

Exercise 11.9, p. 628
Follow the Inference Toolbox.
Step 2: Check normality with a normal probability plot and a modified boxplot.
Notes: Confidence intervals and P-values from t-procedures are not very sensitive to a lack of normality, but outliers do cause problems!

Homework
Reading: through p. 642. Focus on matched pairs t-procedures and Example 11.4.
Exercises 11.10 and 11.11, p. 628. Use a modified boxplot instead of a stemplot.

Robustness of the t-procedures
A confidence interval or significance test is called robust if the confidence level or P-value does not change very much when the assumptions of the procedure are violated.

Using t-procedures: one sample
See Box, p. 636.
SRS: very important!
n < 15: do not use t-procedures if the data are clearly non-normal or if outliers are present.
n at least 15: t-procedures can be used except in the presence of outliers or quite strong skewness.
n at least 40: t-procedures can be used for even clearly skewed distributions, by the central limit theorem (CLT).
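
The sample-size guidelines can be written as a small decision function. This is only a sketch: the function name and the three flags are our own, and the flags stand for judgments we make by looking at plots of the data. It also assumes the data come from an SRS, which no code can check.

```python
def t_procedures_ok(n, clearly_non_normal=False, outliers=False, strong_skew=False):
    """Apply the sample-size guidelines for one-sample t-procedures.

    The flags are the analyst's own judgments from plots of the data.
    """
    if n < 15:
        # Small samples: the data must look close to normal, with no outliers.
        return not (clearly_non_normal or outliers or strong_skew)
    if n < 40:
        # Moderate samples: only outliers or strong skewness rule out t.
        return not (outliers or strong_skew)
    # Large samples: the guideline allows even clearly skewed data.
    return True

print(t_procedures_ok(10, outliers=True))      # small sample with outliers
print(t_procedures_ok(20, strong_skew=True))   # moderate sample, strongly skewed
print(t_procedures_ok(50, strong_skew=True))   # large sample, skewed
```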

Practice
Problem 11.33, p. 645

More Practice
Problems 11.17 and 11.19, p. 638

Matched Pairs t Procedures
Matched pairs designs: subjects are matched in pairs, and each treatment is given (at random) to one subject in the pair.
One type of matched pairs design is to have a group of subjects serve as their own pair-mates. Each subject then gets both treatments (randomize the order).
Apply one-sample t-procedures to the observed differences.
Example 11.4, p. 629. Note $H_0$.
Look at Figure 11.7, p. 631.
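
A short Python sketch of the matched pairs idea: take the difference within each pair, then run the one-sample t calculation on those differences. The scores below are hypothetical and are not the data from Example 11.4.

```python
import math
import statistics

def matched_pairs_t(first, second):
    """One-sample t statistic on the paired differences (second - first).

    Tests H0: mean difference = 0. Returns (t, degrees of freedom).
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    t = (d_bar - 0) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical before/after scores for 8 subjects (each subject is one pair).
pre = [10, 12, 9, 11, 13, 10, 12, 11]
post = [12, 13, 9, 14, 15, 11, 14, 12]
t, df = matched_pairs_t(pre, post)
print(f"t = {t:.3f} with df = {df}")
```

Note that the degrees of freedom come from the number of pairs, not the total number of measurements.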

Example 11.4

Figure 11.7, p. 631: Computer Printouts for Example 11.4

Practice
Exercise 11.13, p. 633
$H_0: \mu_{POST} - \mu_{PRE} = \mu_{DIFF} = 0$
$H_a: \mu_{POST} - \mu_{PRE} = \mu_{DIFF} > 0$

Exercise 11.13

Power
Power is the probability of rejecting a false null hypothesis, given a specific value of the parameter contained in the alternative hypothesis. Same idea as in Chapter 10.
See Example 11.8, p. 639.
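
One way to get a feel for power is to simulate it: draw many samples from a population where the alternative is true and count how often the test rejects. The sketch below does this for a one-sided one-sample t test. Every number in it is an assumption chosen for illustration (true mean 0.5, σ = 1, n = 25), the critical value 1.711 is the Table C entry for a one-sided test at alpha = 0.05 with df = 24, and this is not necessarily the method used in Example 11.8.

```python
import math
import random
import statistics

def simulated_power(mu_true, mu_0, sigma, n, t_star, reps=2000, seed=1):
    """Estimate P(reject H0: mu = mu_0 in favor of Ha: mu > mu_0)
    when the population is normal with mean mu_true and sd sigma."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        standard_error = statistics.stdev(sample) / math.sqrt(n)
        t = (statistics.mean(sample) - mu_0) / standard_error
        if t > t_star:
            rejections += 1
    return rejections / reps

T_STAR = 1.711   # one-sided, alpha = 0.05, df = 24 (Table C)
power = simulated_power(mu_true=0.5, mu_0=0.0, sigma=1.0, n=25, t_star=T_STAR)
print(f"Estimated power against mu = 0.5: {power:.2f}")
```

Setting `mu_true` equal to `mu_0` makes the null hypothesis true, and the rejection rate should then land near alpha.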

Homework
Exercises: 11.13, p. 633; 11.31 and 11.32, pp. 644-645. Use the Inference Toolbox.
Read Section 11.2.

Homework
Reading: pp. 648-667, for Monday.
11.1 Quiz tomorrow.

11.2 Comparing Two Means
The goal of two-sample inference problems is to compare the responses to two treatments or to compare the characteristics of two populations. We must have a separate sample from each treatment or each population, unlike in matched pairs designs. A two-sample problem can arise from a randomized comparative experiment that randomly divides subjects into two groups and exposes each group to a different treatment.

Exercises 11.37 and 11.38, p. 649
Single sample, matched pairs, or two-sample?

Conditions for Significance Tests Comparing Two Means (p. 650)
We have two SRSs, one from each population.
The samples are independent (matching violates this assumption).
We measure the same variable for each sample.
Both populations are normally distributed.
The means and standard deviations of both populations are unknown.

Example 11.10, p. 650
Step 1: Population, parameter, hypotheses. Write the hypotheses in both words and symbols.
Step 2: Check conditions with normal probability plots, stemplots, and boxplots.
Step 3: If the difference between the two group means is significantly different from 0, then we reject the null hypothesis. Therefore, we will need to understand the sampling distribution of $\bar{x}_1 - \bar{x}_2$.

Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
The sampling distribution of interest here has a mean and variance as follows:
Mean: $\mu_1 - \mu_2$
Variance: $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
Since we do not know the population standard deviations, we estimate the standard deviation of the sampling distribution with the standard error:
$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
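
The standard error of the difference is simple to compute from two samples. A minimal sketch with hypothetical data (the two lists are made up):

```python
import math
import statistics

def se_difference(sample_1, sample_2):
    """Standard error of x_bar_1 - x_bar_2: sqrt(s1^2 / n1 + s2^2 / n2)."""
    v1 = statistics.variance(sample_1) / len(sample_1)   # s1^2 / n1
    v2 = statistics.variance(sample_2) / len(sample_2)   # s2^2 / n2
    return math.sqrt(v1 + v2)

# Hypothetical samples of sizes 5 and 4.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8]
se = se_difference(group_1, group_2)
print(f"SE of the difference = {se:.3f}")
```

The variances add even though we are looking at a difference of means, because the two samples are independent.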

Two-sample t-test
The appropriate t-statistic is as follows. The degrees of freedom calculation is complex; we will use our calculators to provide this for us (the df are usually not whole numbers for two-sample tests).
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Here $\mu_1 - \mu_2 = 0$ under $H_0: \mu_1 = \mu_2$.
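
A sketch of the two-sample t statistic in Python, with hypothetical treatment and control values. It computes only the statistic; the degrees of freedom and P-value are left to the calculator, as on the slide.

```python
import math
import statistics

def two_sample_t(sample_1, sample_2, hypothesized_diff=0.0):
    """Unpooled two-sample t statistic.

    hypothesized_diff is mu_1 - mu_2 under H0 (0 when H0 says the means are equal).
    """
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    standard_error = math.sqrt(
        statistics.variance(sample_1) / len(sample_1)
        + statistics.variance(sample_2) / len(sample_2)
    )
    return (diff - hypothesized_diff) / standard_error

# Hypothetical responses for two independent groups.
treatment = [12, 15, 11, 18, 14, 16, 13, 17]
control = [10, 13, 9, 12, 11, 14, 10, 12, 11]
t = two_sample_t(treatment, control)
print(f"t = {t:.3f}")
```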

Let's finish the example
Example 11.11, p. 655
Step 4: There is some evidence, though not overwhelming, that there is a mean decrease in blood pressure for those taking calcium compared with those taking a placebo.

Two-sample confidence interval for $\mu_1 - \mu_2$
Draw an SRS of size $n_1$ from a normal population with unknown mean $\mu_1$, and draw an independent SRS of size $n_2$ from a normal population with unknown mean $\mu_2$. The confidence interval for $\mu_1 - \mu_2$ is given by the following:
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Again, we need the df for t*, but we will let the calculator do that for us.
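
A sketch of the interval in Python. Since the calculator's df is not available here, this example instead passes in a critical value by hand: with hypothetical samples of sizes 10 and 12, the smaller of the two n - 1 values is 9, and Table C gives t* = 2.262 for 95% confidence with df = 9. That choice gives a somewhat wider interval than the calculator would.

```python
import math
import statistics

def two_sample_ci(sample_1, sample_2, t_star):
    """Confidence interval for mu_1 - mu_2 using the unpooled standard error."""
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    standard_error = math.sqrt(
        statistics.variance(sample_1) / len(sample_1)
        + statistics.variance(sample_2) / len(sample_2)
    )
    margin = t_star * standard_error
    return diff - margin, diff + margin

# Hypothetical data: n1 = 10 and n2 = 12.
group_1 = [23, 19, 25, 21, 27, 22, 24, 20, 26, 23]
group_2 = [18, 21, 17, 20, 19, 22, 16, 21, 18, 20, 19, 17]
T_STAR = 2.262   # 95% confidence, df = min(10, 12) - 1 = 9 (Table C)
low, high = two_sample_ci(group_1, group_2, T_STAR)
print(f"95% CI for mu_1 - mu_2: ({low:.2f}, {high:.2f})")
```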

HW
Exercise 11.41, p. 657

Using t-procedures for two-sample analyses
See Box, p. 636.
SRS: very important!
$n_1 + n_2 < 15$: do not use t-procedures if the data are clearly non-normal or if outliers are present.
$n_1 + n_2$ at least 15: t-procedures can be used except in the presence of outliers or strong skewness.
$n_1 + n_2$ at least 40: t-procedures can be used for even clearly skewed distributions, by the central limit theorem (CLT).

Practice
Exercises 11.39-11.40, p. 657
Use the Inference Toolbox. Be sure to state your hypotheses in both symbols and words.

t-critical values (p. 654)
Option 1: Use a calculator or computer for degrees of freedom and P-value.
Option 2: Use the smaller of $n_1 - 1$ and $n_2 - 1$ for degrees of freedom. This is more conservative.
Option 1 is the desired method!
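
The two options can be compared side by side. The sketch below assumes that the calculator or computer in Option 1 uses the Welch-Satterthwaite approximation, which is the usual choice for the unpooled procedure; treat that as an assumption about your particular calculator. The summary statistics are hypothetical.

```python
def conservative_df(n1, n2):
    """Option 2: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate df (assumed to match Option 1)."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical summary statistics: s1 = 8 with n1 = 10, s2 = 6 with n2 = 12.
df_option_1 = welch_df(8.0, 10, 6.0, 12)
df_option_2 = conservative_df(10, 12)
print(f"Option 1 (approximate) df = {df_option_1:.2f}")
print(f"Option 2 (conservative) df = {df_option_2}")
```

The approximate df is never smaller than the conservative df, so Option 2 always uses a critical value at least as large. That is what makes it conservative.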

Pooled vs. Unpooled Variances (p. 666)
Variances:   Unequal    Equal
Procedure:   Unpooled   Pooled
Use the unequal/unpooled option with a calculator or computer. The equal/pooled options were used mostly when we did the t-calculations by hand.
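
For reference, here is a sketch of how the two standard errors differ. The pooled version combines both samples into one variance estimate and uses $n_1 + n_2 - 2$ degrees of freedom; the summary statistics are hypothetical.

```python
import math

def unpooled_se(s1, n1, s2, n2):
    """Standard error without assuming equal population variances."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def pooled_se(s1, n1, s2, n2):
    """Standard error assuming both populations share one variance."""
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

# Hypothetical summary statistics with unequal sample sizes.
print(f"unpooled: {unpooled_se(8.0, 10, 6.0, 12):.3f}")
print(f"pooled:   {pooled_se(8.0, 10, 6.0, 12):.3f}")

# With equal sample sizes the two standard errors coincide.
print(f"equal n:  {unpooled_se(8.0, 10, 6.0, 10):.3f} vs {pooled_se(8.0, 10, 6.0, 10):.3f}")
```

When the sample sizes are equal the two standard errors match and only the degrees of freedom differ, so little is lost by always choosing the unpooled option.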

Practice
Exercise 11.47, p. 665

More practice
Exercises: 11.50 and 11.53, pp. 668-671; 11.68, p. 677