Supplement 16A: Sign Test

Purpose of the Sign Test

The sign test is a simple and versatile test that requires few assumptions. It is based on the binomial distribution. The test involves simply counting the number of positive or negative signs in a sequence of n signs. A common use is to compare two samples of paired data, as an alternative to the paired t-test (Chapter 10), using only the signs of the differences. The sign test may also be used to look for rising or falling trends in time series data, instead of fitting a trend (Chapter 14).

Illustration: Weight Loss Diet

Table 16A.1 shows the 30-day weight loss (or gain) for 15 individuals who are enrolled in a diet program. The sign of each individual's difference d_i (After minus Before) is either positive, negative, or zero.

TABLE 16A.1 Weight Loss (kg) for 15 Individuals Diet

Individual   After    Before   Difference   Sign
     1        83.1     87.6       -4.5        −
     2        88.8     88.8        0.0        0
     3        81.3     79.3        2.0        +
     4       107.7    113.6       -5.9        −
     5        73.8     80.9       -7.1        −
     6        81.0     79.3        1.7        +
     7        85.6     82.8        2.8        +
     8        91.6     94.1       -2.5        −
     9        91.0     97.0       -6.0        −
    10        87.7     87.8       -0.1        −
    11        89.6     85.3        4.3        +
    12       106.8    103.9        2.9        +
    13        89.2     92.7       -3.5        −
    14        77.2     84.4       -7.2        −
    15        95.3     97.3       -2.0        −
   Mean       88.65    90.32      -1.67

Note: Weights were measured to the nearest 0.1 kg.

Instead of using the paired t-test for mean differences, we look only at the sign (+, 0, −). Eliminating the one individual whose weight was unchanged, 9 of the remaining 14 individuals lost weight. Is this significantly different from what would be expected by chance? If weight losses and gains were random, the probability of a − sign would be .50, so the hypotheses are:

H0: π = .50
H1: π > .50

We choose a right-tailed test because we anticipate that the number of − signs will be greater than expected by chance. Under the null hypothesis, the number of − signs would follow a binomial distribution with n = 14, π = .50. From Appendix A or Excel, we obtain the binomial probabilities shown in Table 16A.2.

TABLE 16A.2 Binomial Probabilities for n = 14, π = .50

  x    P(X = x)   P(X <= x)   P(X >= x)
  0     0.0001     0.0001      1.0000
  1     0.0009     0.0009      0.9999
  2     0.0056     0.0065      0.9991
  3     0.0222     0.0287      0.9935
  4     0.0611     0.0898      0.9713
  5     0.1222     0.2120      0.9102
  6     0.1833     0.3953      0.7880
  7     0.2095     0.6047      0.6047
  8     0.1833     0.7880      0.3953
  9     0.1222     0.9102      0.2120
 10     0.0611     0.9713      0.0898
 11     0.0222     0.9935      0.0287
 12     0.0056     0.9991      0.0065
 13     0.0009     0.9999      0.0009
 14     0.0001     1.0000      0.0001

We can see that 9 or more − signs would occur about 21.2 percent of the time by chance alone, so we are not convinced to reject H0 at any of the usual levels of significance (e.g., α = .05). Based on the sign test, we cannot reject the possibility that π = .50, as illustrated in Figure 16A.1.

FIGURE 16A.1 Binomial PDF for n = 14, π = .50
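The sign counts and the right-tail probability used in this illustration can be checked with a short script. The following is a minimal Python sketch using only the standard library; the data are copied from Table 16A.1, and the helper names `binom_pmf` and `upper_tail` are illustrative choices, not part of the textbook or of Excel.

```python
from math import comb

# (After, Before) weights in kg for the 15 individuals in Table 16A.1.
weights = [
    (83.1, 87.6), (88.8, 88.8), (81.3, 79.3), (107.7, 113.6), (73.8, 80.9),
    (81.0, 79.3), (85.6, 82.8), (91.6, 94.1), (91.0, 97.0), (87.7, 87.8),
    (89.6, 85.3), (106.8, 103.9), (89.2, 92.7), (77.2, 84.4), (95.3, 97.3),
]

# Differences are After - Before, rounded to the 0.1 kg measurement precision
# so that floating-point noise cannot create a spurious sign.
diffs = [round(after - before, 1) for after, before in weights]

losses = sum(d < 0 for d in diffs)   # number of "-" signs
gains = sum(d > 0 for d in diffs)    # number of "+" signs
n = losses + gains                   # zero differences are excluded


def binom_pmf(k, n, p=0.5):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(x, n, p=0.5):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))


p_value = upper_tail(losses, n)
print(f"losses={losses}, gains={gains}, n={n}, P(X >= {losses}) = {p_value:.4f}")
```

Under these assumptions the script counts 9 losses and 5 gains among n = 14 nonzero differences, and its right-tail probability agrees with the P(X >= 9) entry in Table 16A.2.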

Interestingly, the paired t-test would come closer to finding a significant difference, as shown in the MegaStat output in Figure 16A.2.

FIGURE 16A.2 MegaStat's Paired Difference t-test

Hypothesis Test: Paired Observations

   0.0000   hypothesized value
  88.6467   mean After
  90.3200   mean Before
  -1.6733   mean difference (After - Before)
   3.9229   std. dev.
   1.0129   std. error
       15   n
       14   df

   -1.652   t
    .0604   p-value (one-tailed, lower)

However, the paired t-test is not directly comparable to the sign test. First, the hypotheses are not the same (see Chapter 10) because the paired t-test refers to the mean difference in weight μ_d:

H0: μ_d ≥ 0
H1: μ_d < 0

Moreover, the paired t-test includes the observation with zero difference (15 observations instead of the 14 useful observations in the sign test). The p-value for the t-test (.0604) is not small enough to reject H0 at α = .05, although a rejection would have been possible at α = .10. While the sign test offers a less sensitive measure of weight loss than the paired t-test, the sign test has the advantage of being unaffected by outliers and non-normality of the population.

Small Sample Sign Test

The sign test works the same way whether you are counting + or − signs, although you will most often see it used to count + signs. The event of interest (be it + or −) is simply called a success. The test statistic is x = the observed number of successes in the sample of n items, excluding observations with no sign difference. The decision rule is:

Left-tailed test: Reject H0: π ≥ .50 in favor of H1: π < .50 if P(X ≤ x) < α
Right-tailed test: Reject H0: π ≤ .50 in favor of H1: π > .50 if P(X ≥ x) < α
Two-tailed test: Reject H0: π = .50 in favor of H1: π ≠ .50 if P(X ≤ x) < α/2 or if P(X ≥ x) < α/2
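The three decision rules above can be expressed as one small function. This is a hedged Python sketch, not the textbook's or MegaStat's implementation: the names `binom_cdf` and `sign_test_reject` and the `tail` argument values are assumptions made for illustration, and the binomial probabilities are computed directly rather than read from Appendix A.

```python
from math import comb


def binom_cdf(x, n, p=0.5):
    """P(X <= x) for X ~ Binomial(n, p); returns 0.0 when x < 0."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def sign_test_reject(x, n, alpha=0.05, tail="right"):
    """Apply the small-sample sign test decision rule under pi = .50.

    x    : observed number of successes (signs of interest)
    n    : number of usable observations (zero differences already removed)
    tail : "left", "right", or "two" (hypothetical labels for this sketch)
    Returns True when H0 is rejected.
    """
    lower = binom_cdf(x, n)            # P(X <= x)
    upper = 1.0 - binom_cdf(x - 1, n)  # P(X >= x)
    if tail == "left":
        return lower < alpha
    if tail == "right":
        return upper < alpha
    if tail == "two":
        return lower < alpha / 2 or upper < alpha / 2
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Weight loss illustration: x = 9 losses out of n = 14, right-tailed, alpha = .05.
print(sign_test_reject(9, 14, alpha=0.05, tail="right"))
```

For the weight loss data the function does not reject H0, which matches the conclusion drawn from Table 16A.2. Comparing each tail with α/2 in the two-tailed case follows the rule as stated in the text rather than doubling a one-tailed p-value.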

Large Sample Sign Test

When the sample size is large, we can use the normal approximation to the binomial. To justify this approximation, the rule of thumb used in this textbook is nπ ≥ 10 or n ≥ 10/π, which would require n ≥ 20 because n = 10/π = 10/.50 = 20. However, that rule is overly conservative when π = .50 because the binomial distribution is symmetric. In fact, when π = .50, we can safely use the normal approximation as long as n ≥ 10 and still have an acceptably small approximation error (.01 or less).

For the normal approximation, we set the normal parameters μ and σ equal to their corresponding binomial parameters (see Chapter 6):

(16A.1)  $\mu = n\pi_0$  (assumed normal mean)
(16A.2)  $\sigma = \sqrt{n\pi_0(1-\pi_0)}$  (assumed normal standard deviation)

If there are x successes in the sample, the approximate test statistic is

(16A.3)  $z = \dfrac{x - n\pi_0}{\sqrt{n\pi_0(1-\pi_0)}}$  (general form)

or

(16A.4)  $z = \dfrac{x - .5n}{.5\sqrt{n}}$  (simplified form)

This simplification is possible because we are assuming a hypothesized value of π = .50, yielding these values for μ and σ:

(16A.5)  $\mu = n(.5) = .5n$  (assumed normal mean)
(16A.6)  $\sigma = \sqrt{n(.5)(1-.5)} = .5\sqrt{n}$  (assumed normal standard deviation)

A further refinement would be to use a continuity correction (see Chapter 7 for details) by replacing x with x + 0.5 (for a left-tailed test) or with x − 0.5 (for a right-tailed test).

(16A.4a)  $z = \dfrac{x + 0.5 - .5n}{.5\sqrt{n}}$  (left-tailed test with continuity correction)
(16A.4b)  $z = \dfrac{x - 0.5 - .5n}{.5\sqrt{n}}$  (right-tailed test with continuity correction)

This correction is not usually found in textbooks, but it can be important in some cases.

Illustration: Stock Prices GE

The sign test can be used to test for patterns over time. Table 16A.3 shows the adjusted closing price of stock for the General Electric Company over a period of 30 days. The sign of + indicates

that the stock's price rose from one day to the next, while a sign of − indicates that the stock's price fell. Overall, does the pattern of +'s and −'s indicate a departure from chance?

TABLE 16A.3 Adjusted Closing Price of General Electric Stock (30 days)

Date      Price   Sign      Date      Price   Sign
8-Aug     15.28             29-Aug    15.89    +
9-Aug     15.81    +        30-Aug    15.97    +
10-Aug    14.95    −        31-Aug    16.16    +
11-Aug    15.53    +        1-Sep     16.05    −
12-Aug    15.73    +        2-Sep     15.61    −
15-Aug    16.23    +        6-Sep     15.11    −
16-Aug    16.00    −        7-Sep     15.65    +
17-Aug    16.08    +        8-Sep     15.44    −
18-Aug    15.19    −        9-Sep     14.95    −
19-Aug    14.95    −        12-Sep    14.87    −
22-Aug    14.97    +        13-Sep    15.26    +
23-Aug    15.39    +        14-Sep    15.64    +
24-Aug    15.57    +        15-Sep    16.08    +
25-Aug    15.30    −        16-Sep    16.33    +
26-Aug    15.39    +        19-Sep    16.18    −

Source: http://finance.yahoo.com/q/hp?s=ge+historical+prices. Data are for Aug-Sep 2011.

There are 17 +'s and 12 −'s (note that we only have n = 29 because there is no sign for the first observation). If the pattern were random, then π = .50. For a two-tailed test, the hypotheses are:

H0: π = .50
H1: π ≠ .50

Letting x = 17 successes (the number of +'s) and using the continuity correction for a right-tailed test (because rejection, if it occurs, will be in the right tail), the test statistic is:

$z = \dfrac{x - 0.5 - .5n}{.5\sqrt{n}} = \dfrac{17 - 0.5 - .5(29)}{.5\sqrt{29}} = \dfrac{2.0000}{2.6926} = 0.74278$

From Excel (or from Appendix C) we find that P(Z > 0.74278) = 0.2288, so the two-tailed p-value is 0.4576 and we would not reject H0 at α = .05. Alternatively, using α = .05, the two-tailed decision rule is to reject H0 if z < −1.960 or if z > +1.960, as illustrated in Figure 16A.3. Because z = 0.7428, we cannot reject H0. The observed pattern does not differ significantly from a random pattern. The stock's price seems to be just as likely to rise as to fall.

We could investigate further by fitting a moving average or a trend (see Chapter 14) using parametric tests that focus on numerical data. The sign test is not as sensitive as some parametric tests might be. Its advantage is that it ignores outliers and does not assume normality.
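The stock price calculation can be reproduced from the prices in Table 16A.3. The sketch below is a minimal Python illustration using only the standard library: the function names `sign_test_z` and `upper_tail_normal` are hypothetical, the normal tail area is computed with `math.erfc` instead of Excel or Appendix C, and the continuity correction is passed as a signed offset (−0.5 for the right tail, +0.5 for the left tail).

```python
from math import erfc, sqrt

# Adjusted closing prices from Table 16A.3, in date order (8-Aug to 19-Sep).
prices = [
    15.28, 15.81, 14.95, 15.53, 15.73, 16.23, 16.00, 16.08, 15.19, 14.95,
    14.97, 15.39, 15.57, 15.30, 15.39, 15.89, 15.97, 16.16, 16.05, 15.61,
    15.11, 15.65, 15.44, 14.95, 14.87, 15.26, 15.64, 16.08, 16.33, 16.18,
]

# Day-to-day changes; the first observation has no sign.
changes = [today - yesterday for yesterday, today in zip(prices, prices[1:])]
ups = sum(c > 0 for c in changes)     # number of "+" signs
downs = sum(c < 0 for c in changes)   # number of "-" signs
n = ups + downs                       # unchanged days (none here) would be dropped


def sign_test_z(x, n, correction=0.0):
    """Normal-approximation statistic for H0: pi = .50, as in (16A.4).

    correction = -0.5 gives the right-tailed continuity correction,
    correction = +0.5 gives the left-tailed one.
    """
    return (x + correction - 0.5 * n) / (0.5 * sqrt(n))


def upper_tail_normal(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


z = sign_test_z(ups, n, correction=-0.5)
p_two_tailed = 2 * upper_tail_normal(z)
print(f"+: {ups}, -: {downs}, n = {n}, z = {z:.5f}, two-tailed p = {p_two_tailed:.4f}")
```

With these inputs the script finds 17 rises and 12 falls and a test statistic that agrees with the hand calculation above, so the same "do not reject" conclusion follows at α = .05.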

FIGURE 16A.3 Decision Rule for Large Sample Sign Test at α = .05

Large Sample Sign Test

The test statistic for H0: π = .50 is $z = \dfrac{x - .5n}{.5\sqrt{n}}$. Follow the usual rules for a z-test. The large sample approximation is justified when n ≥ 10 because we assume that π = .50 in this test.

Section Exercises

13A.1 We would like to know whether SAT coaching helps improve scores on practice SAT tests. Refer to the sample shown below. (a) State the parameters of the binomial distribution for X = number of +'s in the sample. (b) Use Excel or another computer tool to display the binomial distribution PDF and CDF for this small sample sign test. (c) Using α = .05 in a right-tailed test, does the sign test indicate that coaching is helpful? Explain. SAT

Name       After   Before   Diff   Sign
Bob         575     558      17     +
Frank       498     502      -4     −
Yvonne      573     550      23     +
Stephen     579     553      26     +
Gaspard     630     625       5     +
Jaime       487     474      13     +
Dennis      490     486       4     +
Philip      491     484       7     +
Consuela    492     472      20     +
Frieda      493     474      19     +
Dan         530     520      10     +
Miriam      515     521      -6     −

13A.2 Refer to the sequence of sign changes in the daily exchange rate between pounds sterling (£) and euros (€) over a 24-day period. In this data set + indicates an increase in the exchange rate ( / ) and conversely for −. (a) Does this sample size meet the criterion for a large-sample sign test using the normal approximation? Explain. (b) Calculate the

normal parameters μ and σ for the large sample sign test. Show your calculations. (c) Calculate the test statistic. Show your work. (d) State a decision rule for a two-sided test at α = .05. (e) State the conclusion.

+ + + + + + + + + + + + + + + + + +