Analysis of 2x2 Cross-Over Designs using T-Tests for Equivalence

Chapter 37

Introduction
This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design where the goal is to demonstrate equivalence between a treatment and a control or reference. The response is assumed to be a continuous random variable that follows the normal distribution. When the normality assumption is suspect, the nonparametric Mann-Whitney U (or Wilcoxon Rank-Sum) Test may be employed.

In the two-period cross-over design, subjects are randomly assigned to one of two groups. One group receives treatment R followed by treatment T. The other group receives treatment T followed by treatment R. Thus, the response is measured at least twice on each subject.

Cross-over designs are used when the treatments alleviate a condition, rather than effect a cure. After the response to one treatment is measured, the treatment is removed and the subject is allowed to return to a baseline response level. Next, the response to a second treatment is measured. Hence, each subject is measured twice, once with each treatment.

Examples of situations that might use a cross-over design are the comparison of anti-inflammatory drugs in arthritis and the comparison of hypotensive agents in essential hypertension. In both of these cases, symptoms are expected to return to their usual baseline level shortly after the treatment is stopped.

Equivalence
Cross-over designs are popular in the assessment of equivalence. In this case, the effectiveness of a new treatment formulation (drug) is to be compared against the effectiveness of the currently used (reference) formulation. When showing equivalence, it is not necessary to show that the new treatment is better than the current treatment. Rather, the new treatment need only be shown to be as good as the reference so that it can be used in its place.

Advantages of Cross-Over Designs
A comparison of treatments on the same subject is expected to be more precise.
The increased precision often translates into a smaller sample size. Also, patient enrollment into the study may be easier because each patient will receive both treatments.

Disadvantages of Cross-Over Designs
The statistical analysis of a cross-over experiment is more complex than that of a parallel-group experiment and requires additional assumptions. It may be difficult to separate the treatment effect from the time effect and the carry-over effect of the previous treatment.

The design cannot be used when the treatment (or the measurement of the response) alters the subject permanently. Hence, it cannot be used to compare treatments that are intended to effect a cure.

Because subjects must be measured at least twice, it may be more difficult to keep patients enrolled in the study. It is arguably simpler to measure a subject once than to obtain their measurement twice. This is particularly true when the measurement process is painful, uncomfortable, embarrassing, or time consuming.

Technical Details
Suppose you want to evaluate the equivalence of a treatment, T, as compared to a control or reference, R, using data on subjects in a 2x2 cross-over design, where a period effect may be present. This procedure allows you to perform this type of analysis.

Cross-Over Analysis
In the discussion that follows, we summarize the presentation of Chow and Liu (1999). We suggest that you review their book for a more detailed presentation. The general linear model for the standard 2x2 cross-over design is

$$Y_{ijk} = \mu + S_{ik} + P_j + F_{(j,k)} + C_{(j-1,k)} + e_{ijk}$$

where i represents a subject (1 to $n_k$), j represents the period (1 or 2), and k represents the sequence (1 or 2). The $S_{ik}$ represent the random effects of the subjects. The $P_j$ represent the effects of the two periods. The $F_{(j,k)}$ represent the effects of the two formulations (treatments). In the case of the 2x2 cross-over design

$$F_{(j,k)} = \begin{cases} F_R & \text{if } k = j \\ F_T & \text{if } k \ne j \end{cases}$$

where the subscripts R and T represent the reference and treatment formulations, respectively. The $C_{(j-1,k)}$ represent the carry-over effects.
In the case of the 2x2 cross-over design

$$C_{(j-1,k)} = \begin{cases} C_R & \text{if } j = 2,\ k = 1 \\ C_T & \text{if } j = 2,\ k = 2 \\ 0 & \text{otherwise} \end{cases}$$

where the subscripts R and T represent the reference and treatment formulations, respectively.

Assuming that the average effect of the subjects is zero, the four means from the 2x2 cross-over design can be summarized using the following table.

Sequence   Period 1                            Period 2
1 (RT)     $\mu_{11} = \mu + P_1 + F_R$        $\mu_{21} = \mu + P_2 + F_T + C_R$
2 (TR)     $\mu_{12} = \mu + P_1 + F_T$        $\mu_{22} = \mu + P_2 + F_R + C_T$

where $P_1 + P_2 = 0$, $F_T + F_R = 0$, and $C_T + C_R = 0$.
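The table of expected means can be checked numerically. The following is a minimal sketch, not part of the procedure itself: the function name `cell_means` and the parameter values are hypothetical, and the sum-to-zero constraints are used to derive the second level of each effect from the first.

```python
def cell_means(mu, p1, f_r, c_r):
    """Expected responses of the 2x2 cross-over model (illustrative helper).

    The sum-to-zero constraints give P2 = -P1, F_T = -F_R and C_T = -C_R.
    Keys are (sequence, period).
    """
    p2, f_t, c_t = -p1, -f_r, -c_r
    return {
        ("RT", 1): mu + p1 + f_r,        # sequence 1, period 1: reference
        ("RT", 2): mu + p2 + f_t + c_r,  # sequence 1, period 2: treatment + carryover of R
        ("TR", 1): mu + p1 + f_t,        # sequence 2, period 1: treatment
        ("TR", 2): mu + p2 + f_r + c_t,  # sequence 2, period 2: reference + carryover of T
    }


# Hypothetical parameter values with no carryover effect.
means = cell_means(mu=80.0, p1=1.0, f_r=2.0, c_r=0.0)

# Half the difference of the within-sequence period changes recovers F_T - F_R
# when there is no carryover effect.
contrast = ((means[("RT", 2)] - means[("RT", 1)])
            - (means[("TR", 2)] - means[("TR", 1)])) / 2
print(means, contrast)
```

With these inputs the contrast equals F_T - F_R = -4, which is why the analysis of the treatment effect works with within-subject period differences.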

Treatment Effect

Two-Sample T-Test for Treatment Effect
The presence of a treatment (drug) effect can be studied by testing whether $F_T - F_R = 0$ using a t test. This test is calculated as follows

$$T_d = \frac{\hat F}{\hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

where

$$\hat F = \bar d_{\cdot 1} - \bar d_{\cdot 2}, \qquad \bar d_{\cdot k} = \frac{1}{n_k} \sum_{i=1}^{n_k} d_{ik}, \qquad d_{ik} = \frac{Y_{i2k} - Y_{i1k}}{2}$$

$$\hat\sigma_d^2 = \frac{1}{n_1 + n_2 - 2} \sum_{k=1}^{2} \sum_{i=1}^{n_k} \left( d_{ik} - \bar d_{\cdot k} \right)^2$$

The null hypothesis of no drug effect is rejected at the α significance level if

$$|T_d| > t_{\alpha/2,\, n_1 + n_2 - 2}$$

A 100(1 - α)% confidence interval for $F = F_T - F_R$ is given by

$$\hat F \pm t_{\alpha/2,\, n_1 + n_2 - 2}\, \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Mann-Whitney U (or Wilcoxon Rank-Sum) Test for Treatment Effect
Senn (2002) pages 3-4 describes Koch's adaptation of the Wilcoxon-Mann-Whitney rank sum test that tests treatment effects in the presence of period effects. The test is based on the period differences and assumes that there are no carryover effects. Koch's method calculates the ranks of the period differences for all subjects in the trial and then uses the Mann-Whitney U (or Wilcoxon Rank-Sum) Test to analyze these differences between the two sequence groups. The Mann-Whitney U (or Wilcoxon Rank-Sum) Test is described in detail in the Two-Sample T-Test chapter of the documentation.

Carryover Effect
The 2x2 cross-over design should only be used when there is no carryover effect from one period to the next. The presence of a carryover effect can be studied by testing whether $C_T - C_R = 0$ using a t test. This test is calculated as follows

$$T_c = \frac{\hat C}{\hat\sigma_u \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

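where

$$\hat C = \bar U_{\cdot 2} - \bar U_{\cdot 1}, \qquad \bar U_{\cdot k} = \frac{1}{n_k} \sum_{i=1}^{n_k} U_{ik}, \qquad U_{ik} = Y_{i1k} + Y_{i2k}$$

$$\hat\sigma_u^2 = \frac{1}{n_1 + n_2 - 2} \sum_{k=1}^{2} \sum_{i=1}^{n_k} \left( U_{ik} - \bar U_{\cdot k} \right)^2$$

The null hypothesis of no carryover effect is rejected at the α significance level if

$$|T_c| > t_{\alpha/2,\, n_1 + n_2 - 2}$$

A 100(1 - α)% confidence interval for $C = C_T - C_R$ is given by

$$\hat C \pm t_{\alpha/2,\, n_1 + n_2 - 2}\, \hat\sigma_u \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Period Effect
The presence of a period effect can be studied by testing whether $P_2 - P_1 = 0$ using a t test. This test is calculated as follows

$$T_P = \frac{\hat P}{\hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

where

$$\hat P = \bar O_{\cdot 1} - \bar O_{\cdot 2}, \qquad O_{ik} = \begin{cases} d_{ik} & \text{if } k = 1 \\ -d_{ik} & \text{if } k = 2 \end{cases}, \qquad d_{ik} = \frac{Y_{i2k} - Y_{i1k}}{2}$$

$$\hat\sigma_d^2 = \frac{1}{n_1 + n_2 - 2} \sum_{k=1}^{2} \sum_{i=1}^{n_k} \left( d_{ik} - \bar d_{\cdot k} \right)^2$$

The null hypothesis of no period effect is rejected at the α significance level if

$$|T_P| > t_{\alpha/2,\, n_1 + n_2 - 2}$$

A 100(1 - α)% confidence interval for $P = P_2 - P_1$ is given by

$$\hat P \pm t_{\alpha/2,\, n_1 + n_2 - 2}\, \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$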
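The three t tests above share the same structure: an estimate built from within-subject period differences or totals, a pooled standard deviation, and the factor sqrt(1/n1 + 1/n2). The sketch below illustrates this with the Python standard library. It is an illustration only, not the program's implementation: the function names and the six-subject data set are hypothetical, and the critical value `t_crit` is assumed to be looked up by the user (for example, 2.776 for α/2 = 0.025 with 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean


def pooled_sd(a, b):
    """Pooled standard deviation with len(a) + len(b) - 2 degrees of freedom."""
    mean_a, mean_b = mean(a), mean(b)
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return sqrt(ss / (len(a) + len(b) - 2))


def crossover_effects(seq1, seq2, t_crit):
    """Treatment, period and carryover estimates for a 2x2 cross-over.

    seq1: (period 1, period 2) responses for sequence 1 (R then T).
    seq2: (period 1, period 2) responses for sequence 2 (T then R).
    t_crit: two-sided critical t value with n1 + n2 - 2 degrees of freedom.
    """
    scale = sqrt(1 / len(seq1) + 1 / len(seq2))
    d1 = [(y2 - y1) / 2 for y1, y2 in seq1]  # half period differences
    d2 = [(y2 - y1) / 2 for y1, y2 in seq2]
    u1 = [y1 + y2 for y1, y2 in seq1]        # subject totals
    u2 = [y1 + y2 for y1, y2 in seq2]
    sd_d, sd_u = pooled_sd(d1, d2), pooled_sd(u1, u2)
    raw = {
        "treatment": (mean(d1) - mean(d2), sd_d),  # estimates F_T - F_R
        "period": (mean(d1) + mean(d2), sd_d),     # estimates P_2 - P_1
        "carryover": (mean(u2) - mean(u1), sd_u),  # estimates C_T - C_R
    }
    results = {}
    for name, (estimate, sd) in raw.items():
        se = sd * scale
        results[name] = {
            "estimate": estimate,
            "se": se,
            "t": estimate / se,
            "reject": abs(estimate / se) > t_crit,
            "ci": (estimate - t_crit * se, estimate + t_crit * se),
        }
    return results


# Hypothetical data: three subjects per sequence.
seq1 = [(10.0, 12.0), (11.0, 12.0), (9.0, 13.0)]
seq2 = [(12.0, 11.0), (13.0, 10.0), (11.0, 12.0)]
effects = crossover_effects(seq1, seq2, t_crit=2.776)
for name, row in effects.items():
    print(name, row)
```

The pooled standard deviation of the signed differences O equals that of d, because changing the sign of every value in one sequence does not change the deviations from that sequence's mean. This is why the period test reuses `sd_d`.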

Bioequivalence
The t test of formulations (treatments) may be thought of as a preliminary assessment of bioequivalence. However, this t test investigates whether the two treatments are different. It does not assess whether the two treatments are the same (bioequivalent). That is, failure to reject the hypothesis of equal means does not imply bioequivalence. In order to establish bioequivalence, different statistical tests must be used.

Before discussing these tests, it is important to understand that, unlike most statistical hypothesis tests, when testing bioequivalence, you want to establish that the response to the two treatments is the same. Hence, the null hypothesis is that the mean responses are different and the alternative hypothesis is that the mean responses are equal. This is just the opposite of the usual t test. This is why bioequivalence testing requires the special statistical techniques discussed here.

When using a cross-over design to test for bioequivalence, a washout period between the first and second periods must be used that is long enough to eliminate the residual effects of the first treatment from the response to the second treatment. Because of this washout period, there is no carryover effect. Without a carryover effect, the general linear model reduces to

$$Y_{ijk} = \mu + S_{ik} + P_j + F_{(j,k)} + e_{ijk}$$

There are many types of bioequivalence. The 2x2 cross-over design is used to assess average bioequivalence. Remember that average bioequivalence is a statement about the population average. It does not make reference to the variability in responses to the two treatments. The 1992 FDA guidance uses the ±20 rule, which allows an average response to a test formulation to vary up to 20% from the average response of the reference formulation. This rule requires that the ratio of the two averages $\mu_T / \mu_R$ be between 0.8 and 1.2 (80% to 120%). Another way of stating this is that $\mu_T$ is within 20% of $\mu_R$. The FDA requires that the significance level be 0.10 or less.
Several methods have been proposed to test for bioequivalence. Although the program provides several methods, you should select only the one that is most appropriate for your work.

Confidence Interval Approach
The confidence interval approach, first suggested by Westlake (1981), states that bioequivalence may be concluded if a (1 - 2α) × 100% confidence interval for the difference $\mu_T - \mu_R$ or ratio $\mu_T / \mu_R$ is within acceptance limits (α is usually set to 0.05). If the ±20 rule is used, this means that the confidence interval for the difference must be between -0.2 and 0.2. Likewise, the confidence interval for the ratio must be between 0.8 and 1.2 (or 80% and 120%).

Several methods have been suggested for computing the above confidence interval. The program provides the results for five of these. Perhaps the best of the five is the one based on Fieller's Theorem since it makes the fewest, and most general, assumptions about the distribution of the responses.

Classic (Shortest) Confidence Interval of the Difference

$$L_1 = (\bar Y_T - \bar Y_R) - t_{\alpha,\, n_1 + n_2 - 2}\, \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$U_1 = (\bar Y_T - \bar Y_R) + t_{\alpha,\, n_1 + n_2 - 2}\, \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
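The classic (shortest) interval of the difference is a direct computation once the least squares means, the pooled standard deviation of the half differences, and a one-sided critical t value are available. The following sketch uses hypothetical summary statistics. The function name `shortest_ci_difference` is an assumption for illustration, and the critical value 1.734 (α = 0.05 with 18 degrees of freedom) is assumed to come from a t table.

```python
from math import sqrt


def shortest_ci_difference(mean_t, mean_r, sd_d, n1, n2, t_crit):
    """Classic (1 - 2*alpha) interval for mu_T - mu_R.

    sd_d is the pooled standard deviation of the half period differences,
    t_crit is the upper-alpha critical t value with n1 + n2 - 2 df.
    """
    diff = mean_t - mean_r
    half_width = t_crit * sd_d * sqrt(1 / n1 + 1 / n2)
    return diff - half_width, diff + half_width


# Hypothetical summary statistics, 10 subjects per sequence.
lower, upper = shortest_ci_difference(
    mean_t=98.0, mean_r=100.0, sd_d=6.0, n1=10, n2=10, t_crit=1.734
)

# Acceptance limits of +/-20% of the reference mean, expressed as a difference.
limit = 0.20 * 100.0
print(lower, upper, -limit < lower and upper < limit)
```

Because the interval uses the one-sided critical value at α, its coverage is 1 - 2α, which is why a 90% interval corresponds to α = 0.05.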

Classic (Shortest) Confidence Interval of the Ratio
A confidence interval for the ratio may be calculated from the confidence interval on the difference using the formula

$$L_2 = \left( L_1 / \bar Y_R + 1 \right) 100\%$$

$$U_2 = \left( U_1 / \bar Y_R + 1 \right) 100\%$$

Westlake's Symmetric Confidence Interval of the Difference
First, compute values of $k_1$ and $k_2$ so that

$$\int_{k_1}^{k_2} f_{n_1 + n_2 - 2}(t)\, dt = 1 - 2\alpha$$

where $f_{n_1 + n_2 - 2}$ is the density of the t distribution with $n_1 + n_2 - 2$ degrees of freedom. Next, compute Δ using

$$\Delta = -k_1 \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} - (\bar Y_T - \bar Y_R) = k_2 \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} + (\bar Y_T - \bar Y_R)$$

Finally, conclude bioequivalence if

$$|\Delta| < 0.2\, \mu_R$$

Westlake's Symmetric Confidence Interval of the Ratio
A confidence interval for the ratio may be calculated from the confidence interval on the difference using the formula

$$L_4 = \left( -\Delta / \bar Y_R + 1 \right) 100\%$$

$$U_4 = \left( \Delta / \bar Y_R + 1 \right) 100\%$$

Confidence Interval of the Ratio Based on Fieller's Theorem
Both the classic and Westlake's confidence intervals for the ratio do not take into account the variability of $\bar Y_R$ and the correlation between $\bar Y_R$ and $\bar Y_T - \bar Y_R$. Locke (1984) provides formulas using Fieller's theorem that do take into account the variability of $\bar Y_R$. This confidence interval is popular not only because it takes into account the variability of $\bar Y_R$, but also the intersubject variability. Also, it only assumes that the data are normal, but not that the group variances are equal as do the other two approaches.

The (1 - 2α) × 100% confidence limits for $\delta = \mu_T / \mu_R$ are the roots of the quadratic equation

$$\left( \bar Y_T - \delta \bar Y_R \right)^2 - \left( t_{\alpha,\, n_1 + n_2 - 2} \right)^2 \omega \left( S_{TT} - 2 \delta S_{TR} + \delta^2 S_{RR} \right) = 0$$

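where

$$\omega = \frac{1}{4} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)$$

$$S_{TT} = \frac{1}{n_1 + n_2 - 2} \left[ \sum_{i=1}^{n_1} \left( Y_{i21} - \bar Y_{\cdot 21} \right)^2 + \sum_{i=1}^{n_2} \left( Y_{i12} - \bar Y_{\cdot 12} \right)^2 \right]$$

$$S_{RR} = \frac{1}{n_1 + n_2 - 2} \left[ \sum_{i=1}^{n_1} \left( Y_{i11} - \bar Y_{\cdot 11} \right)^2 + \sum_{i=1}^{n_2} \left( Y_{i22} - \bar Y_{\cdot 22} \right)^2 \right]$$

$$S_{TR} = \frac{1}{n_1 + n_2 - 2} \left[ \sum_{i=1}^{n_1} \left( Y_{i21} - \bar Y_{\cdot 21} \right) \left( Y_{i11} - \bar Y_{\cdot 11} \right) + \sum_{i=1}^{n_2} \left( Y_{i12} - \bar Y_{\cdot 12} \right) \left( Y_{i22} - \bar Y_{\cdot 22} \right) \right]$$

Additionally, in order for the roots of the quadratic equation to be finite positive real numbers, the above values must obey the conditions

$$\frac{\bar Y_R}{\sqrt{\omega S_{RR}}} > t_{\alpha,\, n_1 + n_2 - 2}$$

and

$$\frac{\bar Y_T}{\sqrt{\omega S_{TT}}} > t_{\alpha,\, n_1 + n_2 - 2}$$

Interval Hypotheses Testing Approach
Schuirmann (1981) introduced the idea of using an interval hypothesis to test for average bioequivalence using the following null and alternative hypotheses

$$H_0: \mu_T - \mu_R \le \theta_L \ \text{ or } \ \mu_T - \mu_R \ge \theta_U$$

$$H_a: \theta_L < \mu_T - \mu_R < \theta_U$$

where $\theta_L$ and $\theta_U$ are limits selected to ensure bioequivalence. Often these limits are set at 20% of the reference mean. These hypotheses can be rearranged into two one-sided hypotheses as follows

$$H_{01}: \mu_T - \mu_R \le \theta_L \ \text{ versus } \ H_{a1}: \mu_T - \mu_R > \theta_L$$

$$H_{02}: \mu_T - \mu_R \ge \theta_U \ \text{ versus } \ H_{a2}: \mu_T - \mu_R < \theta_U$$

The first hypothesis tests whether the treatment response is too low and the second tests whether the treatment response is too high. If both null hypotheses are rejected, you conclude that the treatment drug is bioequivalent to the reference drug.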
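The pair of one-sided hypotheses can be illustrated with a short sketch. It assumes the usual standardized statistics (the estimated difference minus each limit, divided by the standard error of the difference) and compares them with a one-sided critical t value supplied by the user. The function name `two_one_sided_tests` and all numbers are hypothetical.

```python
def two_one_sided_tests(diff, se, theta_l, theta_u, t_crit):
    """Two one-sided t tests for theta_l < mu_T - mu_R < theta_u.

    diff: estimated difference of the treatment and reference means.
    se: standard error of diff.
    t_crit: upper-alpha critical t value with n1 + n2 - 2 df.
    Returns both statistics and whether both null hypotheses are rejected.
    """
    t_lower = (diff - theta_l) / se  # large positive values reject "too low"
    t_upper = (diff - theta_u) / se  # large negative values reject "too high"
    equivalent = t_lower > t_crit and t_upper < -t_crit
    return t_lower, t_upper, equivalent


# Hypothetical estimate: difference -2.0 with standard error 2.0, 18 df.
wide = two_one_sided_tests(diff=-2.0, se=2.0, theta_l=-20.0, theta_u=20.0, t_crit=1.734)
narrow = two_one_sided_tests(diff=-2.0, se=2.0, theta_l=-5.0, theta_u=5.0, t_crit=1.734)
print(wide)    # both one-sided nulls rejected
print(narrow)  # the lower test fails, so equivalence is not concluded
```

Equivalence is concluded only when both tests reject, which gives the same decision as checking that the (1 - 2α) classic confidence interval lies inside the limits.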

Schuirmann's Two One-Sided Tests Procedure
Schuirmann's procedure is to conduct two one-sided tests, each at a significance level of α. If both tests are rejected, the conclusion of bioequivalence is made at the α significance level. That is, you conclude that $\mu_T$ and $\mu_R$ are average equivalent at the α significance level if

$$T_L = \frac{(\bar Y_T - \bar Y_R) - \theta_L}{\hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} > t_{\alpha,\, n_1 + n_2 - 2}$$

and

$$T_U = \frac{(\bar Y_T - \bar Y_R) - \theta_U}{\hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} < -t_{\alpha,\, n_1 + n_2 - 2}$$

Wilcoxon-Mann-Whitney Two One-Sided Tests Procedure
When the normality assumption is suspect, you can use the nonparametric version of Schuirmann's procedure, known as the Wilcoxon-Mann-Whitney two one-sided tests procedure. This rather complicated procedure is described on pages 0-5 of Chow and Liu (1999) and we will not repeat their presentation here.

Anderson and Hauck's Test
Unlike Schuirmann's test, Anderson and Hauck (1983) proposed a single procedure that evaluates the null hypothesis of inequivalence versus the alternative hypothesis of equivalence. The significance level of the Anderson and Hauck test is given by

$$\alpha = \Pr\left( |t_{AH}| - \delta \right) - \Pr\left( -|t_{AH}| - \delta \right)$$

where

$$\Pr(x) = \int_{-\infty}^{x} f_{n_1 + n_2 - 2}(t)\, dt$$

$$\delta = \frac{\theta_U - \theta_L}{2 \hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

$$t_{AH} = \frac{(\bar Y_T - \bar Y_R) - (\theta_U + \theta_L)/2}{\hat\sigma_d \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

and $f_{n_1 + n_2 - 2}$ is the density of the t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Data Structure
The data for a cross-over design is entered into three variables. The first variable contains the sequence number, the second variable contains the response in the first period, and the third variable contains the response in the second period. Note that each row of data represents the complete response for a single subject.

Chow and Liu (1999) give the following data on page 73. These data are contained in the dataset called ChowLiu73.

ChowLiu73 dataset

Sequence   Period1    Period2
1           74.675     73.675
1           96.400     93.250
1          101.950    102.125
1           79.050     69.450
1           79.050     69.025
1           85.950     68.700
1           69.725     59.425
1           86.275     76.125
1          112.675    114.875
1           99.525    116.250
1           89.425     64.175
1           55.175     74.575
2           74.825     37.350
2           86.875     51.925
2           81.675     72.175
2           92.700     77.500
2           50.450     71.875
2           66.125     94.025
2          122.450    124.975
2           99.075     85.225
2           86.350     95.925
2           49.925     67.100
2           42.700     59.425
2           91.725    114.050

Procedure Options
This section describes the options available in this procedure. To find out more about using a procedure, turn to the Procedures chapter. Following is a list of the procedure's options.

Variables Tab
The options on this panel specify which variables to use.

Sequence Variable

Sequence Group Variable
Specify the variable containing the sequence number. The values in this column should be either 1 (for the first sequence) or 2 (for the second sequence).

Period Variables

Period 1 Variable
Specify the variable containing the responses for the first period of the cross-over trial, one subject per row.

Period 2 Variable
Specify the variable containing the responses for the second period of the cross-over trial, one subject per row.

Treatment Labels

Label 1 (Reference)
This is the one-letter label given to the reference treatment. This identifies the treatment that occurs first in sequence 1. Common choices are R, C, or A.

Label 2 (Treatment)
This is the one-letter label given to the new treatment of interest. This identifies the treatment that occurs second in sequence 1. Common choices are T or B.

Equivalence Test Options

Upper Equivalence Limit
Specify the upper bound on the range of equivalence. Differences (μ2 - μ1) greater than this amount are not bioequivalent. Differences less than this may be bioequivalent if they are also greater than the lower limit. If the % box is checked, this value is assumed to be a percentage of the reference mean. If the % box is not checked, this value is assumed to be the value of the difference.

Lower Equivalence Limit
Specify the lower bound on the range of equivalence. Differences (μ2 - μ1) less than this amount are not bioequivalent. Differences greater than this may be bioequivalent if they are also less than the upper limit. Note that this should be a negative number. If the % box is checked, this value is assumed to be a percentage of the reference mean. If the % box is not checked, this value is assumed to be the value of the difference. If you want symmetric limits, enter -UPPER LIMIT here and the negative of the Upper Equivalence Limit will be used.

Reports Tab
The options on this panel control the reports and plots.

Descriptive Statistics and Confidence Intervals

Confidence Level
This is the confidence level that is used for the confidence intervals in the Cross-Over Effects and Means Summary Report. Typical confidence levels are 90%, 95%, and 99%, with 95% being the most common.

Cross-Over Effects and Means Summary Report
Check this option to show the report containing parameter estimates and confidence intervals for Treatment, Period, and Carryover effects.
Cross-Over Analysis Detail Report
Check this option to show a report containing the estimated effects, means, standard deviations, and standard errors of various subgroups of the data.

Tests Alpha
This is the significance level of the equivalence tests. A value of 0.05 is recommended. Typical values range from 0.001 to 0.200.

Tests - Parametric

Equivalence Tests using 100(1 - 2α)% Confidence Intervals of the Difference
Check this option to show the results of Equivalence Tests conducted by constructing 100(1 - 2α)% confidence intervals of the difference.

Equivalence Tests using 100(1 - 2α)% Confidence Intervals of the Ratio
Check this option to show the results of Equivalence Tests conducted by constructing 100(1 - 2α)% confidence intervals of the ratio.

Schuirmann's Equivalence Test using TOST (Two One-Sided Tests)
Check this option to show the results of Schuirmann's Equivalence Test based on two one-sided tests (TOST).

Anderson and Hauck's Equivalence Test
Check this option to show the results of Anderson and Hauck's Equivalence Test.

T-Tests for Period and Carryover Effects (Two-Sided)
Check this option to display the results of the two-sided T-Tests for Period and Carryover Effects.

Tests - Nonparametric

Schuirmann's Wilcoxon-Mann-Whitney Equivalence Test using TOST (Two One-Sided Tests)
This test is a nonparametric alternative to Schuirmann's Equivalence Test using TOST for use when the assumption of normality is not valid. This test uses the ranks of the period differences rather than the period differences themselves. There are 3 different tests that can be conducted:

Exact Test
The exact test can be calculated if there are no ties and the sample size is ≤ 20 in both groups. This test is recommended when these conditions are met.

Normal Approximation Test
The normal approximation method may be used to approximate the distribution of the sum of ranks when the sample size is reasonably large.

Normal Approximation Test with Continuity Correction
The normal approximation with continuity correction may be used to approximate the distribution of the sum of ranks when the sample size is reasonably large.

Assumptions

Tests of Assumptions
This report gives a series of four tests of the normality assumptions. These tests are:
Shapiro-Wilk Normality
Skewness Normality
Kurtosis Normality
Omnibus Normality

Assumptions Alpha
Assumptions Alpha is the significance level used in all the assumptions tests. A value of 0.05 is typically used for hypothesis tests in general, but values other than 0.05 are often used for the case of testing assumptions. Typical values range from 0.001 to 0.20.

Additional Information

Show Written Explanations
Check this option to display written details about each report and plot.

Report Options Tab

Report Options

Variable Names
This option lets you select whether to display only variable names, variable labels, or both.

Value Labels
This option applies to the Group Variable(s). It lets you select whether to display data values, value labels, or both. Use this option if you want the output to automatically attach labels to the values (like 1=Yes, 2=No, etc.). See the section on specifying Value Labels elsewhere in this manual.

Report Options - Decimal Places

Precision
Specify the precision of numbers in the report. A single-precision number will show seven-place accuracy, while a double-precision number will show thirteen-place accuracy. Note that the reports were formatted for single precision. If you select double precision, some numbers may run into others. Also note that all calculations are performed in double precision regardless of which option you select here. This is for reporting purposes only.

Means ... Test Statistics Decimals
Specify the number of digits after the decimal point to display on the output of values of this type. Note that this option in no way influences the accuracy with which the calculations are done.

Plots Tab
The options on this panel control the appearance of various plots.

Select Plots

Means Plot
Probability Plots
Each of these options indicates whether to display the indicated plots. Click the plot format button to change the plot settings.

Example 1 - 2x2 Cross-Over Analysis for Equivalence (Validation using Chow and Liu (1999))
This section presents an example of how to perform an equivalence test in an analysis of data from a 2x2 cross-over design. Chow and Liu (1999) page 73 provide an example of data from a 2x2 cross-over design. These data were shown in the Data Structure section earlier in this chapter. In this example, we'll assume that the test formulation will be deemed equivalent if its mean is no more than 20% above or below the standard or reference formulation.

Chow and Liu (1999) indicate on pages 85-87 that the lower and upper 90% confidence limits of the difference for the Shortest C.I. Equivalence test method are -8.698 and 4.123, respectively. The lower and upper 90% confidence limits of the ratio for the Shortest C.I. Equivalence test method are 89.46% and 104.99%, respectively. Westlake's Symmetric 90% C.I. lower and upper limits for the difference are -7.413 and 7.413, respectively. Westlake's Symmetric 90% C.I. lower and upper limits for the ratio are 91.02% and 108.98%, respectively. In all cases, equivalence is concluded.

Chow and Liu (1999) indicate on page 98 that the T-values for the lower- and upper-tailed tests for Schuirmann's Equivalence Test using TOST are 3.810 and -5.036, respectively. The example on page 0 states that the p-value for Anderson and Hauck's Equivalence test is 0.000454.

We will use the data found in the CHOWLIU73 database. You may follow along here by making the appropriate entries or load the completed template Example 1 from the Template tab of the Analysis of 2x2 Cross-Over Designs using T-Tests for Equivalence window.

1 Open the ChowLiu73 dataset.
    From the File menu of the NCSS Data window, select Open Example Data.
    Click on the file ChowLiu73.NCSS.
    Click Open.

2 Open the procedure window.
    Using the Analysis menu or the Procedure Navigator, find and select the Analysis of 2x2 Cross-Over Designs using T-Tests for Equivalence procedure.
    On the menus, select File, then New Template. This will fill the procedure with the default template.
3 Specify the variables and equivalence options.
    On the procedure window, select the Variables tab.
    For Sequence Group Variable enter Sequence.
    For Period 1 Variable enter Period1.
    For Period 2 Variable enter Period2.
    For Upper Equivalence Limit enter 20 and check %.
    For Lower Equivalence Limit enter -Upper Limit.

4 Specify the reports.
    On the procedure window, select the Reports tab.
    Check Show Written Explanations.

5 Run the procedure.
    From the Run menu, select Run Procedure. Alternatively, just click the green Run button.

The following reports and charts will be displayed in the Output window.

Cross-Over Effects and Means Summary

                             Parameter   Standard   Standard             Lower      Upper
Parameter           Count    Estimate    Deviation  Error      T*        95.0% CL   95.0% CL
Treatment Effect    24        -2.288      9.145      3.733     2.0739    -10.030      5.455
μ1 (or μR)          24        82.559                 4.285
μ2 (or μT)          24        80.272                 4.395
Period Effect       24        -1.731      9.145      3.733     2.0739     -9.474      6.011
μ1 (Period 1)       24        82.281                 4.043
μ2 (Period 2)       24        80.550                 4.618
Carryover Effect    24        -9.592     38.390     15.673     2.0739    -42.095     22.912
(R+T) (Seq 1)       12       167.627     33.234      9.594
(R+T) (Seq 2)       12       158.035     42.930     12.393

Interpretation of the Above Report
This report shows the estimated effects, means, standard deviations, standard errors, and confidence limits of various parameters and subgroups of the data. The least squares mean of treatment R is 82.559 and of treatment T is 80.272. The treatment effect [(μ2 - μ1) or (μT - μR)] is estimated to be -2.288. The period effect [(μ2 (Period 2)) - (μ1 (Period 1))] is estimated to be -1.731. The carryover effect [((R+T) (Seq 2)) - ((R+T) (Seq 1))] is estimated to be -9.592. Note that least squares means are created by taking the simple average of their component means, not by taking the average of the raw data. For example, if the mean of the 20 subjects in period 1 sequence 1 is 50.0 and the mean of the 10 subjects in period 2 sequence 2 is 40.0, the least squares mean is (50.0 + 40.0)/2 = 45.0. That is, no adjustment is made for the unequal sample sizes. Also note that the standard deviation of some of the subgroups is not calculated.

This report summarizes the estimated means and the treatment, period, and carryover effects.

Parameter
These are the items displayed on the corresponding lines. Note that the Treatment line is the main focus of the analysis. The Period and Carryover information is used for preliminary tests of assumptions.

Count
The count gives the number of non-missing values. This value is often referred to as the group sample size or n.

Parameter Estimate
These are the estimated values of the corresponding parameters. Formulas for the three effects were given in the Technical Details section earlier in this chapter.
Standard Deviation
The sample standard deviation is the square root of the sample variance. It is a measure of spread.

Standard Error
These are the standard errors of each of the effects. They provide an estimate of the precision of the effect estimate. The formulas were given earlier in the Technical Details section of this chapter.

T*
This is the t-value used to construct the confidence interval. If you were constructing the interval manually, you would obtain this value from a table of the Student's t distribution with n - 2 degrees of freedom.

Lower and Upper Confidence Limits
These values provide a confidence interval for the estimated effect.

Interpretation of the Above Report
This section provides a written interpretation of the above report.

Cross-Over Analysis Detail

                                           Least Squares   Standard    Standard
Seq.   Period       Treatment    Count     Mean            Deviation   Error
1      1            R            12         85.823         15.692       4.530
2      2            R            12         79.296         25.198       7.274
1      2            T            12         81.804         19.712       5.690
2      1            T            12         78.740         23.207       6.699
1      Difference   (T-R)/2      12         -2.009          6.423       1.854
2      Difference   (T-R)/2      12          0.278         11.225       3.240
1      Total        R+T          12        167.627         33.234       9.594
2      Total        R+T          12        158.035         42.930      12.393
.      .            R            24         82.559                      4.285
.      .            T            24         80.272                      4.395
1      .            .            24         83.814
2      .            .            24         79.018
.      1            .            24         82.281                      4.043
.      2            .            24         80.550                      4.618

Interpretation of the Above Report
This report shows the estimated effects, means, standard deviations, and standard errors of various subgroups of the data. The least squares mean of treatment R is 82.559 and of treatment T is 80.272. Note that least squares means are created by taking the simple average of their component means, not by taking the average of the raw data. For example, if the mean of the 20 subjects in period 1 sequence 1 is 50.0 and the mean of the 10 subjects in period 2 sequence 2 is 40.0, the least squares mean is (50.0 + 40.0)/2 = 45.0. That is, no adjustment is made for the unequal sample sizes. Also note that the standard deviation and standard error of some of the subgroups are not calculated.

This report provides the least squares means of various subgroups of the data.

Seq.
This is the sequence number of the mean shown on the line. When the dot (period) appears in this line, the results displayed are created by taking the simple average of the appropriate means of the two sequences.

Period
This is the period number of the mean shown on the line. When the dot (period) appears in this line, the results displayed are created by taking the simple average of the appropriate means of the two periods.

Treatment
This is the treatment (or formulation) of the mean shown on the line. When the dot (period) appears in this line, the results displayed are created by taking the simple average of the appropriate means of the two treatments. When the entry is (T-R)/2, the mean is computed on the quantities created by dividing the difference in each subject's two scores by 2. When the entry is R+T, the mean is computed on the sums of the subjects' two scores.
Count
The count is the number of subjects in the mean.

Least Squares Mean
Least squares means are created by taking the simple average of their component means, not by taking a weighted average based on the sample size in each component. For example, if the mean of the 20 subjects in period 1 sequence 1 is 50.0 and the mean of the 10 subjects in period 2 sequence 2 is 40.0, the least squares mean is (50.0 + 40.0)/2 = 45.0. That is, no adjustment is made for the unequal sample sizes. Since least squares means are used in all subsequent calculations, these are the means that are reported.
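The difference between a least squares mean and a sample-size-weighted mean can be seen in a few lines. This is an illustrative sketch only. The helper names are hypothetical, and the unequal group sizes of 20 and 10 are assumed for the illustration.

```python
def least_squares_mean(component_means):
    """Simple (unweighted) average of the component means."""
    return sum(component_means) / len(component_means)


def weighted_mean(component_means, counts):
    """Average of the component means weighted by group size (raw-data mean)."""
    total = sum(m * n for m, n in zip(component_means, counts))
    return total / sum(counts)


cell_means = [50.0, 40.0]
cell_counts = [20, 10]  # assumed unequal group sizes

ls_mean = least_squares_mean(cell_means)      # (50.0 + 40.0) / 2
raw_mean = weighted_mean(cell_means, cell_counts)
print(ls_mean, raw_mean)
```

The least squares mean treats both cells equally, while the weighted mean is pulled toward the larger group. The two agree only when the group sizes are equal.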

Standard Deviation
This is the estimated standard deviation of the subjects in the mean.

Standard Error
This is the estimated standard error of the least squares mean.

Equivalence Tests using 100(1 - 2α)% Confidence Intervals of the Difference

                   Lower          Lower 90.0%    Upper 90.0%    Upper          Conclude
                   Equivalence    Confidence     Confidence     Equivalence    Equivalence
Test Type          Limit          Limit          Limit          Limit          at α = 0.050?
Shortest C.I.      -16.512        -8.698         4.123          16.512         Yes
Westlake C.I.      -16.512        -7.413         7.413          16.512         Yes

Note: Westlake's k1 = -1.3730 and k2 = 2.5984.

Interpretation of the Above Report
Average bioequivalence of the two treatments has been found at the 0.05 significance level using the shortest confidence interval of the difference approach since both confidence limits, -8.698 and 4.123, are between the acceptance limits of -16.512 and 16.512. This experiment used a 2x2 cross-over design with 12 subjects in sequence 1 and 12 subjects in sequence 2. Average bioequivalence of the two treatments has been found at the 0.05 significance level using Westlake's confidence interval of the difference approach since both confidence limits, -7.413 and 7.413, are between the acceptance limits of -16.512 and 16.512. This experiment used a 2x2 cross-over design with 12 subjects in sequence 1 and 12 subjects in sequence 2.

This report provides the results of two tests for bioequivalence based on confidence limits of the difference between the means of the two formulations. The results match Chow and Liu (1999) exactly.

Test Type
This is the type of test reported on this line. The mathematical details of each test were described earlier in the Technical Details section of this chapter.

Lower and Upper Equivalence Limit
These are the limits on bioequivalence. As long as the difference between the treatment formulation and reference formulation is inside these limits, the treatment formulation is bioequivalent. These values were set by you. They are not calculated from the data.

Lower and Upper Confidence Limits
These are the confidence limits on the difference in response to the two formulations computed from the data.
Note that the confidence coefficient is (1 - 2α) × 100%. If both of these limits are inside the two equivalence limits, the treatment formulation is bioequivalent to the reference formulation. Otherwise, it is not.

Conclude Equivalence at α = 0.050?
This column indicates whether bioequivalence can be concluded.
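The decision rule in this report, together with the handling of percentage limits, can be sketched as follows. This is an illustration under stated assumptions rather than the program's code: the function names are hypothetical, percentage limits are assumed to be converted by multiplying by the reference mean, and "inside" is taken to mean strictly inside.

```python
def difference_limits(upper_limit, reference_mean=None, as_percent=False):
    """Symmetric equivalence limits for the difference of means.

    When as_percent is True, upper_limit is a percentage of the reference mean.
    """
    if as_percent:
        upper_limit = upper_limit / 100.0 * reference_mean
    return -upper_limit, upper_limit


def conclude_equivalence(confidence_limits, equivalence_limits):
    """True when both confidence limits lie strictly inside the equivalence limits."""
    lower_cl, upper_cl = confidence_limits
    lower_eq, upper_eq = equivalence_limits
    return lower_eq < lower_cl and upper_cl < upper_eq


# Hypothetical values: reference mean 100, limits of 20%, and two intervals.
limits = difference_limits(20.0, reference_mean=100.0, as_percent=True)
inside = conclude_equivalence((-6.65, 2.65), limits)
outside = conclude_equivalence((-22.10, 1.30), limits)
print(limits, inside, outside)
```

Note that the equivalence limits depend only on the user's settings and the reference mean, while the confidence limits carry all of the sampling variability.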

Equivalence Tests using 100(1 - 2α)% Confidence Intervals of the Ratio

                   Lower          Lower 90.0%    Upper 90.0%    Upper          Conclude
                   Equivalence    Confidence     Confidence     Equivalence    Equivalence
Test Type          Limit          Limit          Limit          Limit          at α = 0.050?
Shortest C.I.      80.000         89.464         104.994        120.000        Yes
Westlake C.I.      80.000         91.021         108.979        120.000        Yes
Fieller's C.I.     80.000         90.063         104.917        120.000        Yes

Interpretation of the Above Report
Average bioequivalence of the two treatments has been found at the 0.05 significance level using the shortest confidence interval of the ratio approach since both confidence limits, 89.464 and 104.994, are between the acceptance limits of 80.000 and 120.000. This experiment used a 2x2 cross-over design with 12 subjects in sequence 1 and 12 subjects in sequence 2. Average bioequivalence of the two treatments has been found at the 0.05 significance level using Westlake's confidence interval of the ratio approach since both confidence limits, 91.021 and 108.979, are between the acceptance limits of 80.000 and 120.000. This experiment used a 2x2 cross-over design with 12 subjects in sequence 1 and 12 subjects in sequence 2. Average bioequivalence of the two treatments has been found at the 0.05 significance level using Fieller's confidence interval of the ratio approach since both confidence limits, 90.063 and 104.917, are between the acceptance limits of 80.000 and 120.000. This experiment used a 2x2 cross-over design with 12 subjects in sequence 1 and 12 subjects in sequence 2.

This report provides the results of three tests for bioequivalence based on confidence limits of the ratio of the mean responses to the two formulations. The results match Chow and Liu (1999) exactly.

Test Type
This is the type of test reported on this line. The mathematical details of each test were described earlier in the Technical Details section of this chapter.

Lower and Upper Equivalence Limit
These are the limits on bioequivalence in percentage form. As long as the treatment mean, expressed as a percentage of the reference mean, is between these limits, the treatment formulation is bioequivalent.
hese values were set by you. hey are not calculate from the ata. Lower an Upper Confience Limits hese are the confience limits on the ratio of mean responses to the two formulations compute from the ata. Note that the confience coefficient is ( α ) 00%. If both of these limits are insie the two equivalence limits, the treatment formulation is bioequivalent to the reference formulation. Otherwise, it is not. Conclue Equivalence at α 0.050? his column inicates whether bioequivalence can be conclue. 37-7
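The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the default limits of 80 and 120 percent are assumptions chosen for the example, and "inside" is taken here to mean strictly inside.

```python
def within_limits(ci_lower, ci_upper, eq_lower=80.0, eq_upper=120.0):
    """Return True when the whole confidence interval lies inside the equivalence limits.

    All four values are percentages of the reference mean. The comparison is
    strict, which is an assumption of this sketch.
    """
    return eq_lower < ci_lower and ci_upper < eq_upper


# Illustrative confidence limits (percent of the reference mean).
print(within_limits(89.464, 104.994))  # interval inside the limits
print(within_limits(78.2, 104.994))    # lower limit falls below 80
```

Note that a 100(1 - 2α)% interval is used, so testing at α = 0.05 corresponds to a 90% confidence interval.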

Schuirmann's Equivalence Test using TOST (Two One-Sided Tests)

                       Treatment           Lower     Upper                    Conclude
Alternative            Mean                Test      Test             Prob    Equivalence
Hypothesis             Diff      SE        T-Value   T-Value    DF    Level   at α = 0.050?
-6.5 < μT-μR < 6.5     -.88      3.733     3.80      -5.0356          0.00048 Yes

Interpretation of the Above Report

Average bioequivalence of the two treatments was found at the 0.05 significance level using Schuirmann's two one-sided t-tests procedure. The probability level of the t-test of whether the treatment mean is not too much lower than the reference mean is 0.00048. The probability level of the t-test of whether the treatment mean is not too much higher than the reference mean is 0.0000. Since both of these values are less than 0.05, the null hypothesis of average bioinequivalence was rejected in favor of the alternative hypothesis of average bioequivalence. This experiment used a 2x2 cross-over design with subjects in sequence 1 and subjects in sequence 2.

This report provides the results of Schuirmann's two one-sided hypothesis tests procedure. The results match Chow and Liu (1999) exactly.

Alternative Hypothesis
This is the alternative hypothesis of equivalence that is being tested.

Lower and Upper Test Value
These are the values of TL and TU, the two one-sided test statistics.

DF
This is the value of the degrees of freedom. In this case, the value of the degrees of freedom is n1 + n2 - 2.

Prob Level
This is the probability level (p-value) of the test. If this value is less than the chosen significance level, then the corresponding effect is said to be significant. For example, if you are testing at a significance level of 0.05, then probabilities that are less than 0.05 are statistically significant. You should choose a value appropriate for your study.

Conclude Equivalence at α = 0.050?
This column indicates whether bioequivalence is concluded.

Anderson and Hauck's Equivalence Test

                       Treatment                                         Conclude
Alternative            Mean                                      Prob    Equivalence
Hypothesis             Diff      SE        Pr(-L)     Pr(U)      Level   at α = 0.050?
-6.5 < μT-μR < 6.5     -.88      3.733     0.00048    0.0000     0.00045 Yes

Interpretation of the Above Report

Average bioequivalence of the two treatments was found at the 0.05 significance level using Anderson and Hauck's test procedure. The actual probability level of the test was 0.00045. This experiment used a 2x2 cross-over design with subjects in sequence 1 and subjects in sequence 2.

This report provides the results of Anderson and Hauck's hypothesis test procedure. The results match Chow and Liu (1999) exactly.

Alternative Hypothesis
This is the alternative hypothesis of equivalence that is being tested.
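The two one-sided tests logic can be sketched as follows, assuming the treatment mean difference, its standard error, and the degrees of freedom are already available. This is a minimal sketch and not the software's implementation: the function names are hypothetical, the input values are made up for illustration, and the t-distribution tail area is obtained by simple numerical integration rather than a library routine.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom, via Simpson's rule."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    tail = 0.5 - area
    return tail if t >= 0 else 1.0 - tail


def tost(diff, se, lower, upper, df, alpha=0.05):
    """Two one-sided t-tests of H1: lower < true difference < upper."""
    t_lower = (diff - lower) / se  # large positive values reject "diff <= lower"
    t_upper = (diff - upper) / se  # large negative values reject "diff >= upper"
    p_lower = t_upper_tail(t_lower, df)
    p_upper = 1.0 - t_upper_tail(t_upper, df)  # lower-tail area
    equivalent = max(p_lower, p_upper) < alpha
    return t_lower, t_upper, p_lower, p_upper, equivalent


# Hypothetical inputs, not the values of the example report.
t_l, t_u, p_l, p_u, ok = tost(diff=-2.0, se=3.5, lower=-16.0, upper=16.0, df=22)
print(t_l, t_u, p_l, p_u, ok)
```

Equivalence is concluded only when both one-sided p-values fall below α, which is why the larger of the two is compared with α.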

Treatment Mean Difference
This is the difference between the treatment means, μT - μR. This is known as the treatment effect.

SE
This is the standard error of the treatment effect. It provides an estimate of the precision of the treatment effect estimate. The formula was given earlier in the Technical Details section of this chapter.

Pr(-L) and Pr(U)
These values are subtracted to obtain the significance level of the test.

Prob Level
This is the significance level of the test. Bioequivalence is indicated when this value is less than a given level of α.

Conclude Equivalence at α = 0.050?
This column indicates whether bioequivalence is concluded.

Schuirmann's Wilcoxon-Mann-Whitney Equivalence Test using TOST (Two One-Sided Tests)

                                                Lower   Lower     Upper   Upper     Conclude
                          Alternative           Sum     Prob      Sum     Prob      Equivalence
Test Type                 Hypothesis            Ranks   Level     Ranks   Level     at α = 0.050?
Exact*                    -6.5 < Diff < 6.5     07      0.0005    9       0.0004    Yes
Normal Approximation      -6.5 < Diff < 6.5     07      0.00050   9       0.00033   Yes
Normal Approx. with C.C.  -6.5 < Diff < 6.5     07      0.00055   9       0.00037   Yes

"Diff" refers to the location difference between the period differences for groups 1 and 2.
* The Exact Test is provided only when there are no ties and the sample size is ≤ 20 in both groups.

Interpretation of the Above Report

Average bioequivalence of the two treatments was found at the 0.05 significance level using the nonparametric version of Schuirmann's two one-sided tests procedure, which is based on the Exact Wilcoxon-Mann-Whitney test. The probability level of the test of whether the treatment mean is not too much lower than the reference mean is 0.0005. The probability level of the test of whether the treatment mean is not too much higher than the reference mean is 0.0004. Since both of these values are less than 0.05, the null hypothesis of average bioinequivalence was rejected in favor of the alternative hypothesis of average bioequivalence. This experiment used a 2x2 cross-over design with subjects in sequence 1 and subjects in sequence 2.
This report provides the results of the nonparametric version of Schuirmann's two one-sided hypothesis tests procedure. The results match Chow and Liu (1999) page 3 which states that rank sums are 07 and 9.

Test Type
This is the type of test reported on this line.

Alternative Hypothesis
This is the alternative hypothesis of equivalence that is being tested.

Lower and Upper Sum Ranks
These are sums of the ranks for the lower and upper Mann-Whitney tests.

Lower and Upper Prob Level
These are the upper and lower significance levels of the two one-sided Wilcoxon-Mann-Whitney tests. Bioequivalence is indicated when both of these values are less than a given level of α.
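As a rough illustration of the "Normal Approximation" and "Normal Approx. with C.C." lines, the sketch below computes a rank sum and a one-sided normal-approximation probability level, with and without a continuity correction. It is a generic rank-sum calculation under stated assumptions: the function names and data are hypothetical, no tie correction is applied to the variance, and the step in which the period differences are shifted by the equivalence limits before ranking is not shown.

```python
import math
from statistics import NormalDist


def rank_sum(x, y):
    """Sum of the ranks of x in the pooled sample, using midranks for ties."""
    pooled = sorted(list(x) + list(y))

    def midrank(v):
        first = pooled.index(v) + 1      # 1-based position of the first copy of v
        count = pooled.count(v)          # number of tied copies
        return first + (count - 1) / 2   # average of the positions they occupy

    return sum(midrank(v) for v in x)


def rank_sum_lower_p(w, n1, n2, continuity=False):
    """Normal-approximation P(W <= w) when both groups share one distribution."""
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w + (0.5 if continuity else 0.0) - mean) / sd
    return NormalDist().cdf(z)


# Hypothetical period differences for two small groups.
group1 = [1.1, 2.3, 3.0]
group2 = [4.2, 5.5, 6.1]
w = rank_sum(group1, group2)
print(w, rank_sum_lower_p(w, 3, 3), rank_sum_lower_p(w, 3, 3, continuity=True))
```

The continuity correction moves the rank sum half a unit toward its mean, so the corrected probability level is slightly larger, which is the same pattern seen between the second and third lines of the report.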

Conclude Equivalence at α = 0.050?
This column indicates whether bioequivalence is concluded.

T-Tests for Period and Carryover Effects (Two-sided)

                               Standard                      Prob      Reject H0
Parameter           Estimate   Error      T-Value    DF     Level     at α = 0.050?
Period Effect       -.73       3.733      -0.4637           0.64739   No
Carryover Effect    -9.59      5.673      -0.60             0.5468    No

Interpretation of the Above Report

A preliminary test failed to reject the assumption of equal period effects at the 0.05 significance level (the actual probability level was 0.64739).

A preliminary test failed to reject the assumption of equal carryover effects at the 0.05 significance level (the actual probability level was 0.5468).

This report presents the T-tests for the period and carryover effects. In this case, neither is significantly different from zero.

Parameter
These are the items being tested. The Period and Carryover lines are preliminary tests of assumptions.

Estimate
These are the estimated values of the corresponding effects. Formulas for these effects were given in the Technical Details section earlier in this chapter.

Standard Error
These are the standard errors of each of the effects. They provide an estimate of the precision of the effect estimate. The formulas were given earlier in the Technical Details section of this chapter.

T-Value
These are the test statistics calculated from the data that are used to test whether the effect is different from zero.

DF
The DF is the value of the degrees of freedom. This is two less than the total number of subjects in the study.

Prob Level
This is the probability level (p-value) of the test. If this value is less than the chosen significance level, then the corresponding effect is said to be significant. Some authors recommend that the tests of assumptions (Period and Carryover) should be done at the 0.10 level of significance.

Tests of Assumptions Section

This section presents the results of tests for checking the normality assumption.

Tests of the Normality Assumption for the Period Differences in Sequence 1

                                                           Reject H0 of Normality
Normality Test                   Test Statistic  Prob Level  at α = 0.050?
Shapiro-Wilk                     0.948           0.570       No
Skewness                         -0.7849         0.435       No
Kurtosis                         0.366           0.7767      No
Omnibus (Skewness or Kurtosis)   0.7468          0.68839     No

Tests of the Normality Assumption for the Period Differences in Sequence 2

                                                           Reject H0 of Normality
Normality Test                   Test Statistic  Prob Level  at α = 0.050?
Shapiro-Wilk                     0.909           0.0784      No
Skewness                         0.97            0.3638      No
Kurtosis                         -0.8364         0.4093      No
Omnibus (Skewness or Kurtosis)   .537            0.4647      No

Shapiro-Wilk Normality
This test for normality has been found to be the most powerful test in most situations. It is the ratio of two estimates of the variance of a normal distribution based on a random sample of n observations. The numerator is proportional to the square of the best linear estimator of the standard deviation. The denominator is the sum of squares of the observations about the sample mean. The test statistic W may be written as the square of the Pearson correlation coefficient between the ordered observations and a set of weights which are used to calculate the numerator. Since these weights are asymptotically proportional to the corresponding expected normal order statistics, W is roughly a measure of the straightness of the normal quantile-quantile plot. Hence, the closer W is to one, the more normal the sample is.

The probability values for W are valid for sample sizes greater than 3. The test was developed by Shapiro and Wilk (1965) for sample sizes up to 20. NCSS uses the approximations suggested by Royston (1992) and Royston (1995) which allow unlimited sample sizes. Note that Royston only checked the results for sample sizes up to 5000, but indicated that he saw no reason larger sample sizes should not work. W may not be as powerful as other tests when ties occur in your data.
Skewness Normality
This is a skewness test reported by D'Agostino (1990). Skewness implies a lack of symmetry. One characteristic of the normal distribution is that it has no skewness. Hence, one type of non-normality is skewness. The Value is the test statistic for skewness, while Prob Level is the p-value for a two-tailed test for a null hypothesis of normality. If this p-value is less than a chosen level of significance, there is evidence of non-normality. Under Decision (α = 0.050), the conclusion about skewness normality is given.

Kurtosis Normality
Kurtosis measures the heaviness of the tails of the distribution. D'Agostino (1990) reported a second normality test that examines kurtosis. The Value column gives the test statistic for kurtosis, and Prob Level is the p-value for a two-tailed test for a null hypothesis of normality. If this p-value is less than a chosen level of significance, there is evidence of kurtosis non-normality. Under Decision (α = 0.050), the conclusion about normality is given.

Omnibus Normality
This third normality test, also developed by D'Agostino (1990), combines the skewness and kurtosis tests into a single measure. Here, as well, the null hypothesis is that the underlying distribution is normally distributed. The definitions for Value, Prob Level, and Decision are the same as for the previous two normality tests.
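To make the quantities behind the skewness and kurtosis tests concrete, the sketch below computes the moment-based sample skewness and excess kurtosis of a set of period differences. It is only an illustration: the function name and data are hypothetical, and the transformations that D'Agostino's tests apply to turn these raw measures into test statistics and probability levels are not implemented here.

```python
def shape_measures(values):
    """Return (skewness, excess kurtosis) computed from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0           # zero for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical, perfectly symmetric differences: the skewness is zero.
print(shape_measures([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A skewness far from zero points to asymmetry, while an excess kurtosis far from zero points to tails that are heavier or lighter than those of the normal distribution.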

Plot of Sequence-by-Period Means

The sequence-by-period means plot shows the mean responses on the vertical axis and the periods on the horizontal axis. The lines connect like treatments. The distance between these lines represents the magnitude of the treatment effect. If there are no period, carryover, or interaction effects, two horizontal lines will be displayed. The tendency for both lines to slope up or down represents period and carryover effects. The tendency for the lines to cross represents period-by-treatment interaction. This is also a type of carryover effect.

Plot of Subject Profiles

The profile plot displays the raw data for each subject. The response variable is shown along the vertical axis. The two sequences are shown along the horizontal axis. The data for each subject are depicted by two points connected by a line. The subject's response to the reference formulation is shown first, followed by their response to the treatment formulation. Hence, for sequence 2, the results for the first period are shown on the right and for the second period on the left.

This plot is used to develop a feel for your data. You should view it first as a tool to check for outliers (points and subjects that are very different from the majority). Note that outliers should be removed from the analysis only if a reason can be found for their deletion. Of course, the first step in dealing with outliers is to double-check the data values to determine if a typing error might have caused them. Also, look for subjects whose lines exhibit a very different pattern from the rest of the subjects in that sequence. These might be a signal of some type of data-recording or data-entry error.

The profile plot allows you to assess the consistency of the responses to the two treatments across subjects. You may also be able to evaluate the degree to which the variation is equal in the two sequences.

Plot of Sums and Differences

The sums and differences plot shows the sum of each subject's two responses on the horizontal axis and the difference between each subject's two responses on the vertical axis. Dot plots of the sums and differences have been added above and to the right, respectively.

Each point represents the sum and difference of a single subject. Different plotting symbols are used to denote the subject's sequence. A horizontal line has been added at zero to provide an easy reference from which to determine if a difference is positive (favors treatment R) or negative (favors treatment T).

The degree to which the plotting symbols tend to separate along the horizontal axis represents the size of the carryover effect. The degree to which the plotting symbols tend to separate along the vertical axis represents the size of the treatment effect.

Outliers are easily detected on this plot. Outlying subjects should be reviewed for data-entry errors and for special conditions that might have caused their responses to be unusual. Outliers should not be removed from an analysis just because they are different. A compelling reason should be found for their removal, and the removal should be well documented.
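The coordinates of the sums and differences plot can be computed with a short sketch such as the one below. The function name and the data are hypothetical, and the difference is taken here as the period 1 response minus the period 2 response, which is an assumption of this example rather than something stated in the text above.

```python
def sums_and_differences(subjects):
    """Group (sum, difference) plot coordinates by sequence.

    subjects: iterable of (sequence, period1_response, period2_response).
    The difference is period 1 minus period 2 (an assumption of this sketch).
    """
    points = {}
    for sequence, p1, p2 in subjects:
        points.setdefault(sequence, []).append((p1 + p2, p1 - p2))
    return points


# Hypothetical responses for three subjects.
data = [(1, 70.0, 75.0), (1, 80.0, 78.0), (2, 90.0, 85.0)]
for sequence, coords in sums_and_differences(data).items():
    print(sequence, coords)
```

Keeping the points grouped by sequence makes it straightforward to draw each sequence with its own plotting symbol and to compare how the groups separate along each axis.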

Period Plot

The Period Plot displays a subject's period 1 response on the horizontal axis and their period 2 response on the vertical axis. The plotting symbol is the sequence number. The plot is used to find outliers and other anomalies.

Probability Plots

These plots show the period differences on the vertical axis and values on the horizontal axis that would be expected if the differences were normally distributed. The first plot shows the differences for sequence 1 and the second plot shows the differences for sequence 2. If the assumption of normality holds, the points should fall along a straight line. The degree to which the points are off the line represents the degree to which the normality assumption does not hold. Since the normality of these differences is assumed by the t-test used to test for a difference between the treatments, these plots are useful in assessing whether that assumption is valid. If the plots show a pronounced pattern of non-normality, you might try taking the square roots or the logs of the responses before beginning the analysis.
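The points of a normal probability plot can be sketched as follows. This is an illustration only: the function name and data are hypothetical, and the expected normal values are computed from Blom's plotting positions, (i - 0.375) / (n + 0.25), which is an assumption of this example and may differ from the positions the software uses.

```python
from statistics import NormalDist


def normal_plot_points(differences):
    """Pair each ordered difference with its expected standard normal quantile."""
    ordered = sorted(differences)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.375) / (n + 0.25)  # Blom's plotting position (assumed)
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical period differences for one sequence.
for expected, observed in normal_plot_points([3.0, -1.0, 0.5, 2.0, -2.5]):
    print(round(expected, 3), observed)
```

If the differences are close to normally distributed, plotting the second element of each pair against the first should give points that lie near a straight line.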