Tests for Paired Means using Effect Size


Chapter 417

Introduction

This procedure provides sample size and power calculations for a one- or two-sided paired t-test when the effect size is specified rather than the means and variance. The details of the procedure are given in Cohen (1988).

In this design, a single population of paired, normally distributed data is sampled, and the mean difference is compared to zero by forming the difference scaled by the standard deviation of the differences.

Test Assumptions

When running a paired t-test, the basic assumptions are that the distribution of the paired differences is approximately normal and the subjects are independent.

Test Procedure

If we assume that $\mu_1$ and $\mu_2$ represent the population means of the two paired variables and the (unknown) standard deviation of the paired differences is $\sigma$, then the effect size is represented by d, where

$$d = \frac{\mu_1 - \mu_2}{\sigma}$$

The null hypothesis is $H_0: d = 0$, and the alternative hypothesis depends on the number of sides of the test:

Two-Sided: $H_1: d \neq 0$ or $H_1: \mu_1 - \mu_2 \neq 0$
Upper One-Sided: $H_1: d > 0$ or $H_1: \mu_1 - \mu_2 > 0$
Lower One-Sided: $H_1: d < 0$ or $H_1: \mu_1 - \mu_2 < 0$

A suitable Type I error probability (α) is chosen for the test, the data are collected, and a t-statistic is generated using the formula

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_d / \sqrt{N}}$$

where $s_d$ is the sample standard deviation of the paired differences and N is the number of pairs.
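The t-statistic defined above can be computed directly from raw paired data. The following is a minimal sketch using only the Python standard library; the function name `paired_t_statistic` and the pain scores are hypothetical and serve only as an illustration. It also returns the sample analogue of the effect size d (mean difference divided by the standard deviation of the differences).

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, df, d_hat) for two paired samples of equal length."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)    # equals x-bar_1 minus x-bar_2
    s_d = statistics.stdev(diffs)         # sample SD of the differences (n - 1 denominator)
    t = mean_diff / (s_d / math.sqrt(n))  # paired t-statistic
    d_hat = mean_diff / s_d               # sample analogue of the effect size d
    return t, n - 1, d_hat


# Hypothetical pain scores for five subjects under two treatments.
treatment_a = [5, 6, 4, 7, 5]
treatment_b = [4, 4, 3, 6, 5]
t, df, d_hat = paired_t_statistic(treatment_a, treatment_b)
print(f"t = {t:.3f}, df = {df}, d_hat = {d_hat:.3f}")
```

Note that t equals the sample effect size multiplied by the square root of N, which is why power depends on the means and standard deviation only through d.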

This t-statistic follows a t distribution with N - 1 degrees of freedom. The null hypothesis is rejected in favor of the alternative if,

for $H_1: d \neq 0$ or $H_1: \mu_1 - \mu_2 \neq 0$,

$$t < t_{\alpha/2} \quad \text{or} \quad t > t_{1-\alpha/2},$$

for $H_1: d > 0$ or $H_1: \mu_1 - \mu_2 > 0$,

$$t > t_{1-\alpha},$$

or, for $H_1: d < 0$ or $H_1: \mu_1 - \mu_2 < 0$,

$$t < t_{\alpha}.$$

Comparing the t-statistic to the cut-off t-value (as shown here) is equivalent to comparing the p-value to α.

Power Calculation

The power is calculated using the same formulation as in the Tests for Paired Means procedure, with the modification that the σ used in that procedure is implicitly set equal to one.

The Effect Size

Suppose we assume that $\mu_1 - \mu_2$ represents the mean of the paired differences in the population of interest. If the standard deviation of the paired differences is σ, the effect size is represented by d, where

$$d = \frac{\mu_1 - \mu_2}{\sigma}$$

Cohen (1988) proposed the following interpretation of the d values. A d near 0.2 is a small effect, a d near 0.5 is a medium effect, and a d near 0.8 is a large effect. These values for small, medium, and large effects are popular in the social sciences. However, this convention is less popular in the medical sciences because the scale of the effect is left unstated, which makes interpretation difficult.

Procedure Options

This section describes the options that are specific to this procedure. These are located on the Design tab. For more information about the options of other tabs, go to the Procedure Window chapter.

Design Tab

The Design tab contains most of the parameters and options that you will be concerned with.

Solve For

Solve For
This option specifies the parameter to be solved for from the other parameters. The parameters that may be selected are Power, Sample Size, Effect Size, and Alpha. In most situations, you will likely select either Power or Sample Size.

The Solve For parameter is the parameter that will be displayed on the vertical axis of any plots that are shown.
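To illustrate how power relates to the other three parameters (sample size, effect size, and alpha), here is a rough sketch that uses a normal approximation. This is an assumption made for illustration only: exact power for a t-test is based on the noncentral t distribution, which the Python standard library does not provide, so the sketch is not the calculation PASS performs. The function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(n, d, alpha=0.05, two_sided=True):
    """Normal-approximation power of a paired t-test with n pairs and effect size d.

    Assumption: the t-statistic is treated as normal with mean |d| * sqrt(n)
    and unit variance. This slightly overstates power when n is small.
    """
    z = NormalDist()
    shift = abs(d) * math.sqrt(n)  # approximate mean of the statistic under H1
    if two_sided:
        crit = z.inv_cdf(1 - alpha / 2)
        # Reject when the statistic falls in either tail.
        return z.cdf(shift - crit) + z.cdf(-shift - crit)
    # One-sided test in the direction of the effect.
    crit = z.inv_cdf(1 - alpha)
    return z.cdf(shift - crit)


for n in (20, 44, 100):
    print(f"N = {n}: approximate power = {approx_power(n, 0.5):.3f}")
```

For N = 44 and d = 0.5 the approximation gives a value of roughly 0.91, a little above the t-based power that an exact calculation would report, so it is best read as a quick plausibility check.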

Test Direction

Alternative Hypothesis
Specify whether the alternative hypothesis of the test is one-sided or two-sided. If a one-sided test is chosen, the hypothesis test direction is chosen based on whether the effect size is greater than or less than zero.

Two-Sided Hypothesis Test
H0: d = 0 vs. H1: d ≠ 0

One-Sided Hypothesis Tests
Upper: H0: d ≤ 0 vs. H1: d > 0
Lower: H0: d ≥ 0 vs. H1: d < 0

Power and Alpha

Power
Power is the probability of rejecting the null hypothesis when it is false. Power is equal to 1 - Beta, so specifying power implicitly specifies beta. Beta is the probability of obtaining a false negative with the statistical test. That is, it is the probability of accepting a false null hypothesis.

The valid range is 0 to 1. Different disciplines have different standards for setting power. The most common choice is 0.90, but 0.80 is also popular.

You can enter a single value, such as 0.90, or a series of values, such as .70 .80 .90, or .70 to .90 by .1. When a series of values is entered, PASS will generate a separate calculation result for each value of the series.

Alpha
Alpha is the probability of obtaining a false positive with the statistical test. That is, it is the probability of rejecting a true null hypothesis. The null hypothesis is usually that the parameters of interest (means, proportions, etc.) are equal.

Since Alpha is a probability, it is bounded by 0 and 1. Commonly, it is between 0.001 and 0.10. Alpha is often set to 0.05 for two-sided tests and to 0.025 for one-sided tests.

You can enter a single value, such as 0.05, or a series of values, such as .05 .10 .15, or .05 to .15 by .01. When a series of values is entered, PASS will generate a separate calculation result for each value of the series.

Sample Size

N (Sample Size)
Enter a value for the sample size (N), the number of individuals in the study. You may enter a single value such as 42, a range of values such as 10 to 100 by 10, or a list of values such as 10 30 80 90.

Effect Size

d
Enter one or more values for d, the effect size, that you wish to detect. This is the standardized difference between the means of the two paired variables. The effect size is calculated using

d = (μ1 - μ2) / σ

where μ1 is the mean assumed by the alternative hypothesis for the first paired variable, μ2 is the mean assumed by the alternative hypothesis for the second paired variable, and σ is your estimate of the population standard deviation of the differences.

The value of d can be any non-zero value (positive or negative). However, it is usually between -3 and 3, excluding 0.

You can enter a single value such as 0.5 or a series of values such as 0.2 0.5 0.8 or 0.2 to 0.8 by 0.1. When a series of values is entered, PASS will generate a separate calculation result for each value of the series.

Cohen's Effect Size Table
Cohen (1988) gave the following interpretation of d values that is still popular.

Effect    d      Relative to σ
Small     0.2    20% of σ
Medium    0.5    50% of σ
Large     0.8    80% of σ
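As a small illustration of the effect size formula and Cohen's benchmarks, the sketch below computes d from assumed means and a standard deviation of the differences, then reports the closest benchmark. The function names and the "closest benchmark" rule are assumptions made for this example; Cohen's values are conventions for "near 0.2", "near 0.5", and "near 0.8", not strict cut-offs.

```python
# Cohen's (1988) benchmark values for d, as listed in the table above.
COHEN_BENCHMARKS = {"small": 0.2, "medium": 0.5, "large": 0.8}


def effect_size(mu1, mu2, sigma):
    """Return d = (mu1 - mu2) / sigma, with sigma the SD of the paired differences."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (mu1 - mu2) / sigma


def nearest_cohen_label(d):
    """Return the benchmark label closest to |d| (an illustrative rule, not a standard)."""
    return min(COHEN_BENCHMARKS, key=lambda label: abs(COHEN_BENCHMARKS[label] - abs(d)))


# Hypothetical planning values: means of 3.5 and 3.0 on a pain scale, SD of differences 1.0.
d = effect_size(3.5, 3.0, 1.0)
print(f"d = {d:.2f} ({nearest_cohen_label(d)})")
```

Because the label depends only on the magnitude of d, a negative effect of the same size receives the same label.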

Example 1 Finding the Sample Size

Researchers wish to compare two treatments for chronic pain. Subjects suffering from chronic pain are given one treatment on one night and the other treatment on the following night. The treatment to be given first is selected by a coin toss. The subject's evaluation of pain intensity will be measured on a seven-point scale.

The researchers would like to determine the sample sizes required to detect a small, medium, and large effect size with a two-sided, paired t-test when the power is 80% or 90% and the significance level is 0.05.

Setup

This section presents the values of each of the parameters needed to run this example. First, from the PASS Home window, load the procedure. You may then make the appropriate entries as listed below, or open Example 1 by going to the File menu and choosing Open Example Template.

Option                       Value
Design Tab
Solve For .................. Sample Size
Alternative Hypothesis ..... Two-Sided
Power ...................... 0.8 0.9
Alpha ...................... 0.05
d .......................... 0.2 0.5 0.8

Output

Click the Calculate button to perform the calculations and generate the following output.

Numeric Results

Numeric Results for One-Sample T-Test
Alternative Hypothesis: H1: d ≠ 0

Target    Actual           Effect
Power     Power      N     Size d    Alpha
0.80      0.8017    199    0.20      0.050
0.90      0.9004    265    0.20      0.050
0.80      0.8078     34    0.50      0.050
0.90      0.9000     44    0.50      0.050
0.80      0.8213     15    0.80      0.050
0.90      0.9092     19    0.80      0.050

References
Cohen, Jacob. 1988. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates. Hillsdale, New Jersey.
Julious, S. A. 2010. Sample Sizes for Clinical Trials. Chapman & Hall/CRC. Boca Raton, FL.
Machin, D., Campbell, M., Tan, B. T., Tan, S. H. 2009. Sample Size Tables for Clinical Studies, 3rd Edition. Wiley-Blackwell.
Ryan, Thomas P. 2013. Sample Size Determination and Power. John Wiley & Sons. New Jersey.

Report Definitions
Target Power is the desired power. May not be achieved because of integer N.
Actual Power is the achieved power. Because N is an integer, this value is often (slightly) larger than the target power.
N is the number of items sampled from the population.
Effect Size: d = (μ1 - μ2) / σ is the effect size. Cohen recommended Low = 0.2, Medium = 0.5, and High = 0.8.
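The N column of the numeric results can be cross-checked with a closed-form approximation. The sketch below is not the algorithm PASS uses. It takes the usual normal-approximation sample size and adds a small-sample correction term (half the squared two-sided critical value) that is commonly applied for t-tests. Both the formula choice and the function name `approx_sample_size` are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def approx_sample_size(d, power, alpha=0.05):
    """Approximate number of pairs for a two-sided paired t-test.

    Assumption: normal-approximation formula plus a z^2 / 2 correction,
    rounded up to the next integer. Exact results require the noncentral t.
    """
    if d == 0:
        raise ValueError("effect size must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile matching the target power
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)


for d in (0.2, 0.5, 0.8):
    for power in (0.8, 0.9):
        print(f"d = {d:.1f}, power = {power:.1f}: N is about {approx_sample_size(d, power)}")
```

For the six scenarios in this example, the approximation should land on the same N values as the table; agreement in other settings, especially with very small N, is not guaranteed.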

Summary Statements
A sample size of 199 data pairs achieves 80.2% power to reject the null hypothesis of zero effect size when the population effect size is 0.20 and the significance level (alpha) is 0.050 using a two-sided paired t-test.

This report shows the values of each of the parameters, one scenario per row.

Plots Section

These plots show the relationship between effect size, power, and sample size.

Example 2 Validation using Another Procedure

This procedure should give identical results to the Tests for Paired Means procedure when the value of σ there is set to one. We will use this fact to provide a validation problem for this procedure.

If we run that procedure with power = 0.90, alpha = 0.05, Mean of Paired Differences = 0.5, and SD = 1, and solve for sample size, the result is N = 44.

Setup

This section presents the values of each of the parameters needed to run this example. First, from the PASS Home window, load the procedure. You may then make the appropriate entries as listed below, or open Example 2 by going to the File menu and choosing Open Example Template.

Option                       Value
Design Tab
Solve For .................. Sample Size
Alternative Hypothesis ..... Two-Sided
Power ...................... 0.9
Alpha ...................... 0.05
d .......................... 0.5

Output

Click the Calculate button to perform the calculations and generate the following output.

Numeric Results

Numeric Results for Paired T-Test
Alternative Hypothesis: H1: d ≠ 0

Target    Actual           Effect
Power     Power      N     Size d    Alpha
0.90      0.9000     44    0.50      0.050

This procedure also calculated N = 44, so the procedure is validated.