
Sociedad de Estadística e Investigación Operativa
Test (2003) Vol. 12, No. 1, pp. 141-152

Power and Sample Size Calculation for 2x2 Tables under Multinomial Sampling with Random Loss

Kung-Jong Lui
Department of Mathematics and Statistics, San Diego State University

William G. Cumberland
Department of Biostatistics, University of California, Los Angeles

Abstract

Multinomial sampling, in which the total number of sampled subjects is fixed, is probably one of the most commonly used sampling schemes in categorical data analysis. When we apply multinomial sampling to collect subjects who are subject to a random exclusion from our data analysis, the number of subjects falling into each comparison group is random and can be small with a positive probability. Thus, the application of the traditional statistics derived from large sample theory for testing equality between two independent proportions can sometimes be theoretically invalid. On the other hand, using Fisher's exact test always assures that the true type I error is less than or equal to a nominal α-level. We therefore discuss power and sample size calculation based on this exact test. For a desired power at a given α-level, we develop an exact sample size calculation procedure, which accounts for a random loss of sampled subjects, for testing equality between two independent proportions under multinomial sampling. Because the exact sample size calculation procedure requires intensive computations when the underlying required sample size is large, we also present an approximate sample size formula using large sample theory. On the basis of Monte Carlo simulation, we note that the power obtained with this approximate sample size formula generally agrees well with the desired power based on the exact test. Finally, we propose a trial-and-error procedure that uses the approximate sample size as an initial estimate and Monte Carlo simulation to expedite the search for the minimum required sample size.

Key Words: Sample size determination, Fisher's exact test, multinomial sampling, power.

AMS subject classification: 62F03, 62A05.

Correspondence to: K.-J. Lui, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA. E-mail: kjl@rohan.sdsu.edu

Received: November 2001; Accepted: May 2002

1 Introduction

Multinomial sampling, in which the total number of studied subjects is fixed, is probably one of the most commonly considered sampling designs in categorical data analysis (Bishop et al., 1975). Consider an epidemiological prevalence study, in which we take a random sample from a general population and want to compare the prevalence of disease between the subpopulations exposed and not exposed to a risk factor of interest. Or consider a clinical trial, in which we randomly assign each patient to receive treatment A or B with fixed probabilities and wish to compare the response rate between the two treatments. In either case, the number of subjects falling into the comparison groups is random. Furthermore, it is common that we may need to exclude some sampled subjects from our data because of missing information. As noted elsewhere (Skalski, 1992; Lui, 1994), sample size determination that fails to take into account the potential loss of subjects can result in studies with inadequate power. Because the number of sampled subjects falling into the two comparison groups can be small with a positive probability under multinomial sampling with a random loss of sampled subjects, traditional statistics using large sample theory for testing equality between two independent proportions (Fleiss, 1981) can sometimes be theoretically invalid. However, using Fisher's exact test (Fisher, 1935; Irwin, 1935; Yates, 1934; Fleiss, 1981) always assures that the true type I error is less than or equal to a nominal α-level regardless of the number of subjects from the subpopulations. This leads us to concentrate our discussion on power and sample size calculation on the basis of the exact test. Note that numerous publications on the calculation of power and sample size based on the exact test under product binomial sampling appear elsewhere (Bennett and Hsu, 1960; Haseman, 1978; Gail and Gart, 1973; Casagrande et al., 1978a; Gordon, 1994). An excellent and systematic review of sample size determination for testing differences in proportions under the two-sample design also appears in Sahai and Khurshid (1996). However, none of these papers focuses on sample size calculation under multinomial sampling with a random exclusion of sampled subjects from data analysis, as is done here.

The purpose of this paper is to extend the sample size calculation procedure proposed elsewhere (Bennett and Hsu, 1960) to accommodate multinomial sampling with a random loss of sampled subjects.

To provide readers with insight into the effects of the different parameters, this paper calculates the power based on the exact multinomial distribution in a variety of situations. Because the exact sample size calculation procedure involves intensive computations when the required sample size is large, this paper also presents an approximate sample size formula using large sample theory. Using Monte Carlo simulation, this paper finds that the power obtained with an approximate sample size formula derived from a method analogous to that proposed elsewhere (Casagrande et al., 1978b; Fleiss et al., 1980; Fleiss, 1981) can actually be quite accurate. Finally, this paper suggests a trial-and-error procedure that uses the approximate sample size as an initial estimate and Monte Carlo simulation to expedite the search for the minimum required sample size for a desired power at a nominal α-level.

2 Notations, Power, and Sample Size Determination

Suppose that we take a random sample of n subjects, each having probability p_e of falling into one comparison group and probability 1 − p_e of falling into the other comparison group. For example, the probability p_e may denote the population proportion of exposure in epidemiological prevalence studies or the probability of assigning subjects to an experimental treatment in clinical trials. Because the information regarding the exposure status or the outcome can sometimes be missing in prevalence studies, or the studied subjects can be lost to follow-up in clinical trials, we assume that each sampled subject has a positive probability p_m of being excluded from our data analysis. For simplicity, we focus our discussion on the situation where the exclusion of a sampled subject is independent of both the exposure (or the treatment assignment) and the outcome status. Because the following discussion applies generally to testing equality between the proportions of two comparison groups, we use the numbers 1 and 2 to designate these groups. Let N_1, N_2, and N_3 denote the random frequencies corresponding to groups 1 and 2 and to the group of subjects who will be excluded from our comparison. The random vector N = (N_1, N_2, N_3) then follows a trinomial distribution:
$$f_{\mathbf{N}}(\mathbf{n} \mid p_e, p_m) = \frac{n!}{n_1!\, n_2!\, n_3!}\, \pi_1^{n_1} \pi_2^{n_2} \pi_3^{n_3}, \qquad (2.1)$$
where n = (n_1, n_2, n_3), π_1 = (1 − p_m) p_e, π_2 = (1 − p_m)(1 − p_e), π_3 = p_m, 0 ≤ n_i ≤ n, and Σ_i n_i = n.
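The trinomial probability (2.1) is simple to evaluate directly. The short Python sketch below is not part of the paper and the function name is our own; it computes the probability of a particular split (n_1, n_2, n_3) of the n sampled subjects.

```python
from math import factorial

def trinomial_pmf(n1, n2, n3, p_e, p_m):
    """Probability (2.1) that n1, n2, n3 subjects fall into group 1, group 2,
    and the excluded group, given the total n = n1 + n2 + n3."""
    n = n1 + n2 + n3
    pi1 = (1.0 - p_m) * p_e          # cell probability for group 1
    pi2 = (1.0 - p_m) * (1.0 - p_e)  # cell probability for group 2
    pi3 = p_m                        # cell probability for exclusion
    coef = factorial(n) // (factorial(n1) * factorial(n2) * factorial(n3))
    return coef * pi1**n1 * pi2**n2 * pi3**n3

# Example: with n = 20, p_e = 0.1, p_m = 0.1, the chance that exactly
# 2 subjects land in group 1, 16 in group 2, and 2 are excluded:
print(trinomial_pmf(2, 16, 2, p_e=0.1, p_m=0.1))
```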

Let p_1 and p_2 denote the probabilities that a randomly selected subject from groups 1 and 2, respectively, has the outcome of interest. We consider first the situation of a one-sided test. Suppose that we want to test the null hypothesis H_0: p_1 = p_2 versus the alternative hypothesis H_a: p_1 > p_2. Let X_i denote the number of subjects with the outcome of interest among the N_i subjects from group i (i = 1, 2). Furthermore, let T = X_1 + X_2 denote the total number of subjects with the outcome of interest in the sample. Then, given N_1 = n_1, N_2 = n_2, and T = t, the conditional distribution of X_1 is well known to be
$$P(X_1 = x_1 \mid t, n_1, n_2, p_1, p_2) = \frac{\binom{n_1}{x_1}\binom{n_2}{t - x_1}\,\phi^{x_1}}{\sum_{x=a}^{b} \binom{n_1}{x}\binom{n_2}{t - x}\,\phi^{x}}, \qquad (2.2)$$
where a ≤ x_1 ≤ b, a = max(0, t − n_2), b = min(t, n_1), and φ = p_1(1 − p_2)/[(1 − p_1)p_2] is the odds ratio of possessing the outcome of interest between groups 1 and 2. When the null hypothesis H_0: p_1 = p_2 (i.e., φ = 1) is true, the conditional distribution (2.2) of X_1 reduces to the hypergeometric distribution:
$$P(X_1 = x_1 \mid t, n_1, n_2, p_1 = p_2) = \frac{\binom{n_1}{x_1}\binom{n_2}{t - x_1}}{\binom{n_1 + n_2}{t}}. \qquad (2.3)$$
Under the alternative hypothesis H_a: p_1 > p_2, we expect the value of X_1 to be large. Thus, the critical region C(α) of a nominal α-level (one-sided test) consists of {X_1: X_1 ≥ x_1*}, where x_1* is the smallest integer such that
$$\sum_{x_1 \ge x_1^{*}} P(X_1 = x_1 \mid t, n_1, n_2, p_1 = p_2) \le \alpha.$$
The conditional power, given n_1, n_2, and t, is then
$$q(\alpha, p_1, p_2 \mid n_1, n_2, t) = \sum_{x_1 \in C(\alpha)} P(X_1 = x_1 \mid t, n_1, n_2, p_1, p_2), \qquad (2.4)$$
where P(X_1 = x_1 | t, n_1, n_2, p_1, p_2) is given by (2.2). Thus, the conditional power, given n_1 and n_2, is
$$q(\alpha, p_1, p_2 \mid n_1, n_2) = \sum_{t=0}^{n_1 + n_2} q(\alpha, p_1, p_2 \mid n_1, n_2, t)\, f_T(t \mid n_1, n_2), \qquad (2.5)$$
where
$$f_T(t \mid n_1, n_2) = \sum_{x=a}^{b} \binom{n_1}{x}\binom{n_2}{t - x}\, p_1^{x}(1 - p_1)^{n_1 - x}\, p_2^{t - x}(1 - p_2)^{n_2 - (t - x)}$$
and a = max(0, t − n_2), b = min(t, n_1).
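For readers who wish to experiment with these quantities, the following Python sketch evaluates the conditional distributions (2.2)-(2.3) and the conditional power (2.4)-(2.5) for given group sizes n_1 and n_2. It is a minimal illustration under our own naming; the paper itself supplies no code.

```python
import numpy as np
from math import comb

def cond_dist_x1(t, n1, n2, phi):
    """Conditional distribution (2.2) of X1 given T = t, N1 = n1, N2 = n2,
    under odds ratio phi; phi = 1 gives the hypergeometric null (2.3)."""
    a, b = max(0, t - n2), min(t, n1)
    xs = np.arange(a, b + 1)
    w = np.array([comb(n1, x) * comb(n2, t - x) * phi**x for x in xs], dtype=float)
    return xs, w / w.sum()

def conditional_power(alpha, p1, p2, n1, n2):
    """Conditional power (2.5): average of (2.4) over the distribution of T."""
    phi = p1 * (1 - p2) / ((1 - p1) * p2)
    power = 0.0
    for t in range(n1 + n2 + 1):
        xs, p_null = cond_dist_x1(t, n1, n2, 1.0)     # null distribution (2.3)
        tail = np.cumsum(p_null[::-1])[::-1]          # P(X1 >= x) under the null
        crit = xs[tail <= alpha]                      # candidate critical values
        if crit.size == 0:
            continue                                  # no rejection possible for this t
        x_star = crit[0]                              # smallest x1* with tail prob <= alpha
        _, p_alt = cond_dist_x1(t, n1, n2, phi)       # alternative distribution (2.2)
        q_t = p_alt[xs >= x_star].sum()               # conditional power (2.4)
        # f_T(t | n1, n2): probability of t outcomes in total
        f_t = sum(comb(n1, x) * comb(n2, t - x)
                  * p1**x * (1 - p1)**(n1 - x)
                  * p2**(t - x) * (1 - p2)**(n2 - (t - x))
                  for x in range(max(0, t - n2), min(t, n1) + 1))
        power += q_t * f_t
    return power
```

Summing this conditional power over splits (n_1, n_2, n_3) drawn from (2.1) gives the expected power (2.6) introduced below.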

Bennett and Hsu (1960) base their sample size calculation on (2.5) for studies in which the number of studied subjects from each comparison group is fixed. We cannot directly apply (2.5) to calculate power for the situation in which n_1 and n_2 are random, nor when some n_3 subjects are randomly excluded from our data. Instead, we consider the expected power for a total sample size n with the given probabilities p_e and p_m. This expected power is
$$q(\alpha, p_1, p_2, n, p_e, p_m) = \sum_{\mathbf{n}} q(\alpha, p_1, p_2 \mid n_1, n_2)\, f_{\mathbf{N}}(\mathbf{n} \mid p_e, p_m), \qquad (2.6)$$
where the summation is over all possible vector values of n = (n_1, n_2, n_3), and f_N(n | p_e, p_m) is given in (2.1). For a desired power 1 − β, we may use a trial-and-error procedure to find the minimum required sample size n such that the expected power q(α, p_1, p_2, n, p_e, p_m) is ≥ 1 − β. However, calculation of this expected power (2.6) can be very computationally intensive when the minimum required sample size n is large. Hence we need an approximate sample size formula for n. If n were very large, we would expect each n_i (≈ nπ_i) to be large as well, and the ratio n_2/n_1 between groups 2 and 1 to be approximately equal to r = π_2/π_1. Therefore, an approximation to the expected required sample size E(n_1) from group 1 for a desired power 1 − β of rejecting the null hypothesis H_0: p_1 = p_2 at α-level (one-sided test) when the alternative hypothesis H_a: p_1 > p_2 is true is given by (Fleiss et al., 1980; Fleiss, 1981, p. 45)
$$n_{1a} = \mathrm{ceiling}\left\{ \frac{\left[ Z_{\alpha}\sqrt{\bar{p}(1 - \bar{p})(r + 1)} + Z_{\beta}\sqrt{r\, p_1 (1 - p_1) + p_2 (1 - p_2)} \right]^{2}}{r\, (p_1 - p_2)^{2}} \right\}, \qquad (2.7)$$
where Z_α is the upper 100(α)th percentile of the standard normal distribution, p̄ = (p_1 + r p_2)/(1 + r), and ceiling{x} denotes the least integer greater than or equal to x. Note that, when deriving sample size formula (2.7), we do not account for the continuity correction, and hence using (2.7) tends to underestimate the expected required sample size from group 1 on the basis of the exact test (Casagrande et al., 1978b; Gordon, 1994). To alleviate this underestimation, we may want to apply the following adjustment formula, which incorporates the continuity correction into the sample size determination (Fleiss et al., 1980; Fleiss, 1981; Casagrande et al., 1978b):
$$n_{1a} = \mathrm{ceiling}\left\{ \frac{n_{1a}}{4}\left( 1 + \sqrt{1 + \frac{2(r + 1)}{n_{1a}\, r\, |p_1 - p_2|}} \right)^{2} \right\}. \qquad (2.8)$$
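The large-sample formula (2.7) and the continuity-corrected adjustment (2.8) can be coded in a few lines. The sketch below uses our own function name, and it keeps the intermediate value from (2.7) unrounded before applying (2.8); rounding it up first, as the ceiling in (2.7) suggests, can shift the result by a unit or two.

```python
from math import ceil, sqrt
from statistics import NormalDist

def approx_group1_size(alpha, beta, p1, p2, r, one_sided=True):
    """Approximate required size from group 1: formula (2.7) followed by the
    continuity-corrected adjustment (2.8). r = pi2/pi1 is the expected ratio
    of group-2 to group-1 subjects."""
    z = NormalDist().inv_cdf
    z_a = z(1 - alpha) if one_sided else z(1 - alpha / 2)  # Z_alpha or Z_{alpha/2}
    z_b = z(1 - beta)                                      # Z_beta
    p_bar = (p1 + r * p2) / (1 + r)
    # (2.7): large-sample size without continuity correction (kept unrounded here)
    n1a = (z_a * sqrt(p_bar * (1 - p_bar) * (r + 1))
           + z_b * sqrt(r * p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (r * (p1 - p2) ** 2)
    # (2.8): continuity-corrected adjustment
    n1a_adj = n1a / 4 * (1 + sqrt(1 + 2 * (r + 1) / (n1a * r * abs(p1 - p2)))) ** 2
    return ceil(n1a_adj)
```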

These results suggest that an approximate minimum required total sample size n with the continuity correction is given by
$$n_a = \mathrm{ceiling}\left\{ \left[\, n_{1a} + \mathrm{ceiling}\{ n_{1a}\, r \} \,\right] / (1 - p_m) \right\}. \qquad (2.9)$$
Note that the above discussion can easily be extended to accommodate a two-sided test. Consider testing the null hypothesis H_0: p_1 = p_2 versus the alternative hypothesis H_a: p_1 ≠ p_2. We reject the null hypothesis H_0 when X_1 is either too large or too small. Thus, a critical region C(α) of a nominal α-level (two-sided test) consists of {X_1: X_1 ≥ x_1^U or X_1 ≤ x_1^L}, where x_1^U is the smallest integer such that Σ_{x_1 ≥ x_1^U} P(X_1 = x_1 | t, n_1, n_2, p_1 = p_2) ≤ α/2 and x_1^L is the largest integer such that Σ_{x_1 ≤ x_1^L} P(X_1 = x_1 | t, n_1, n_2, p_1 = p_2) ≤ α/2. With this critical region C(α), we can calculate the expected power q(α, p_1, p_2, n, p_e, p_m) (2.6) through use of (2.4) and (2.5). We can further find the minimum required sample size n for a desired power 1 − β at a nominal α-level for a two-sided test using (2.6). Similarly, we can substitute Z_{α/2} for Z_α in (2.7) and apply (2.8) for the continuity correction to obtain the approximate sample size n_a (2.9) for a two-sided test.

3 Power and Sample Size Calculation

To illustrate the use of formula (2.6), we first calculate the expected power for the situations in which p_e = 0.10, 0.30; p_m = 0.10, 0.20; p_1 = 0.40; p_2 = 0.10, 0.20, 0.30; and n = 20 to 100 by 10 and 120 to 200 by 20, at the 0.05 level for one-sided and two-sided tests using the exact multinomial distribution (2.1). For example, when p_e = 0.10, p_m = 0.10, p_1 = 0.40, p_2 = 0.10, and n = 180, the corresponding powers are 0.805 and 0.724 for the one-sided and two-sided tests, respectively (Table 1). As we expect, the power increases as either the total sample size n or the difference between p_1 and p_2 increases, but decreases as the probability p_m of excluding a randomly selected subject increases.

When the minimum required sample size n with expected power q(α, p_1, p_2, n, p_e, p_m) (2.6) greater than or equal to a desired power 1 − β at a nominal α-level is large, the number of combinations of (n_1, n_2, n_3) under the multinomial distribution (2.1) can be very large, and searching for the minimum required number n of subjects using a trial-and-error procedure will be extremely time consuming.
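Formula (2.9) then converts the continuity-corrected group-1 size into a total sample size inflated for the expected loss. Below is a minimal sketch with our own function name, illustrated on the Table 2 entry for p_e = 0.10, p_m = 0.10, p_1 = 0.40, p_2 = 0.10; under our reading of the intermediate rounding, (2.7)-(2.8) give a group-1 size of 15 for this case.

```python
from math import ceil

def approx_total_size(n1a, r, p_m):
    """Formula (2.9): approximate total sample size n_a, adding the group-2
    requirement n1a * r and inflating by 1/(1 - p_m) for the expected loss."""
    return ceil((n1a + ceil(n1a * r)) / (1.0 - p_m))

# Illustration for p_e = 0.10, p_m = 0.10 (so r = pi_2/pi_1 = 9),
# p_1 = 0.40, p_2 = 0.10, with a continuity-corrected group-1 size of 15:
print(approx_total_size(15, r=9, p_m=0.1))  # 167, the n_a shown in Table 2
```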

                        p_e = 0.1                               p_e = 0.3
                 p_m = 0.1        p_m = 0.2            p_m = 0.1        p_m = 0.2
    n   p_2 =   .10  .20  .30    .10  .20  .30        .10  .20  .30    .10  .20  .30

    One-Sided Test
   20          .078 .035 .013   .063 .028 .011       .204 .096 .041   .168 .080 .035
   30          .162 .068 .025   .135 .057 .022       .350 .155 .061   .303 .136 .055
   40          .229 .098 .037   .194 .084 .032       .481 .215 .078   .425 .188 .070
   50          .288 .125 .048   .258 .110 .042       .591 .273 .096   .532 .242 .087
   60          .348 .152 .057   .313 .134 .051       .681 .327 .112   .622 .291 .101
   70          .407 .178 .065   .362 .158 .059       .753 .378 .126   .697 .338 .115
   80          .458 .203 .073   .413 .181 .066       .811 .428 .141   .760 .383 .128
   90          .507 .228 .081   .458 .203 .073       .856 .475 .156   .811 .427 .141
  100          .553 .253 .088   .502 .225 .080       .892 .519 .171   .851 .469 .154
  120          .634 .300 .102   .581 .268 .093       .940 .597 .198   .910 .546 .180
  140          .701 .345 .116   .649 .310 .105       .967 .666 .226   .947 .614 .204
  160          .758 .389 .129   .708 .350 .117       .982 .725 .253   .969 .673 .229
  180          .805 .431 .142   .758 .389 .129       .991 .775 .279   .982 .725 .253
  200          .844 .470 .155   .800 .426 .141       .995 .816 .305   .990 .769 .276

    Two-Sided Test
   20          .053 .019 .006   .041 .015 .005       .129 .054 .022   .100 .043 .018
   30          .104 .036 .011   .082 .029 .009       .245 .094 .033   .204 .079 .029
   40          .152 .054 .017   .129 .045 .015       .362 .137 .044   .310 .117 .039
   50          .204 .073 .024   .178 .062 .020       .470 .180 .054   .411 .156 .049
   60          .258 .092 .029   .221 .080 .026       .566 .224 .064   .503 .194 .058
   70          .305 .111 .035   .267 .096 .030       .649 .269 .075   .585 .233 .066
   80          .356 .130 .040   .311 .113 .035       .720 .312 .084   .657 .273 .076
   90          .400 .149 .044   .355 .130 .040       .778 .355 .094   .720 .312 .084
  100          .446 .168 .049   .395 .147 .044       .826 .397 .104   .772 .350 .093
  120          .530 .206 .059   .475 .181 .052       .895 .476 .124   .852 .424 .111
  140          .603 .245 .068   .547 .215 .061       .938 .549 .144   .906 .493 .129
  160          .668 .282 .076   .611 .249 .069       .965 .615 .165   .942 .556 .147
  180          .724 .320 .085   .668 .282 .076       .980 .673 .185   .964 .614 .165
  200          .772 .356 .094   .718 .316 .084       .989 .724 .206   .978 .667 .183

Table 1: The exact power for testing the null hypothesis H_0: p_1 = p_2 versus H_a: p_1 > p_2 (one-sided test) or H_a: p_1 ≠ p_2 (two-sided test) at the 5% level, where p_1 = 0.40 and p_2 = 0.10, 0.20, 0.30; the probability of subjects falling into group 1 is p_e = 0.10, 0.30; the probability of excluding a randomly selected subject is p_m = 0.10, 0.20; and the total sample size is n = 20 to 100 by 10 and 120 to 200 by 20.

To alleviate this problem, we propose using Monte Carlo simulation to generate 1,000 repeated samples from the multinomial distribution (2.1). We then use the resulting empirical density of the random vector n, rather than (2.1) itself, when calculating the expected power (2.6). To further expedite this search procedure, we use the approximate sample size n_a (2.9) as an initial estimate.
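The Monte Carlo device just described, together with the step-wise search explained in the following paragraph, might be organized as in the sketch below. This is only an illustration under our own names: cond_power and expected_power stand for any routines returning the conditional power (2.5) and the expected power (2.6), such as those sketched earlier.

```python
import random

def empirical_expected_power(n, p_e, p_m, cond_power, n_sims=1000, seed=12345):
    """Monte Carlo version of the expected power (2.6): draw repeated samples of
    (N1, N2, N3) from the trinomial distribution (2.1) and average the conditional
    power over the simulated splits. cond_power(n1, n2) is any routine returning
    the conditional power (2.5) given the two group sizes."""
    rng = random.Random(seed)
    pi1 = (1 - p_m) * p_e
    pi2 = (1 - p_m) * (1 - p_e)
    total = 0.0
    for _ in range(n_sims):
        n1 = n2 = 0
        for _ in range(n):                       # classify each of the n subjects
            u = rng.random()
            if u < pi1:
                n1 += 1
            elif u < pi1 + pi2:
                n2 += 1                          # otherwise the subject is excluded
        total += cond_power(n1, n2)
    return total / n_sims

def search_min_sample_size(n_a, target_power, expected_power):
    """Trial-and-error search: start from the approximate size n_a (2.9) and move
    in steps of max(int(n_a / 100), 1) until the smallest n with expected power
    >= target_power is located (at the resolution of the step size)."""
    step = max(n_a // 100, 1)
    n = n_a
    if expected_power(n) < target_power:
        while expected_power(n) < target_power:   # increase until power is reached
            n += step
        return n
    while n - step > 0 and expected_power(n - step) >= target_power:
        n -= step                                 # decrease while power stays adequate
    return n
```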

If the power corresponding to n_a (2.9) were less than the desired power, we would calculate powers using increasing sample sizes n{k} = n_a + k · max(int{n_a/100}, 1), where max(v_1, v_2) denotes the maximum of v_1 and v_2 and int{x} denotes the greatest integer ≤ x, for k = 1, 2, ..., until we first observe a power greater than or equal to 1 − β. We denote this sample size by n{k*}. Similarly, if the power corresponding to n_a (2.9) were larger than the desired power, we would calculate powers using decreasing sample sizes n{k} = n_a − k · max(int{n_a/100}, 1) for k = 1, 2, ..., until we obtain the first k* such that the observed power q(α, p_1, p_2, n{k* + 1}, p_e, p_m) < 1 − β. Then, the minimum required sample size is again set equal to n{k*}.

Tables 2 and 3 summarize the approximate required sample size n_a (2.9), its corresponding power, and the final minimum required sample size estimate n{k*} for one-sided and two-sided tests, respectively, for a desired power of 80% to reject the null hypothesis H_0: p_1 = p_2 at the 0.05 level in the situations in which p_1 = 0.20, 0.30, 0.40, 0.50; p_2 ranges from 0.10 to p_1 − 0.10; p_e = 0.10, 0.30, 0.50, 0.70; and p_m = 0.10, 0.20. As shown in Tables 2 and 3, the power obtained with the approximate sample size formula n_a (2.9) agrees reasonably well with the desired power 0.80 in almost all the situations considered here. For example, consider one of the worst cases for the one-sided test: p_e = 0.10, p_m = 0.10, p_1 = 0.40, p_2 = 0.10 in Table 2. Here the power corresponding to the approximate sample size n_a = 167 subjects (2.9) at the 0.05 level (one-sided test) is 77.6%, which is less than the desired power of 80% by only about 2.5%. In this case, the final estimate of the minimum required sample size n{k*} is 178 (Table 2). Similarly, for the same case with a two-sided test (Table 3), the approximate sample size is n_a = 200, while the final estimate n{k*} is 214.

4 Discussion

For comparing disease rates between subpopulations in sample surveys or response rates between treatments in clinical trials, this paper has developed a sample size calculation procedure based on the exact test under multinomial sampling with a random loss of subjects. If the required sample size is not large (less than 200), we can calculate the exact power (2.6), as presented in Table 1, without any practical difficulty.

                       p_e = 0.1                              p_e = 0.3
                p_m = 0.1          p_m = 0.2            p_m = 0.1          p_m = 0.2
  p_1  p_2   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}
  .50  .10    112 (.787)  115     125 (.787)  129      53 (.824)   51      59 (.818)   57
       .20    212 (.782)  224     238 (.784)  248      97 (.804)   96     109 (.804)  107
       .30    500 (.794)  510     563 (.793)  573     219 (.801)  219     247 (.803)  247
       .40   1989 (.798) 2027    2238 (.797) 2282     860 (.800)  860     968 (.801)  968
  .40  .10    167 (.776)  178     188 (.773)  201      78 (.800)   79      88 (.799)   89
       .20    434 (.791)  446     488 (.795)  504     194 (.805)  191     218 (.806)  216
       .30   1823 (.798) 1841    2050 (.797) 2070     789 (.799)  796     888 (.799)  896
  .30  .10    323 (.783)  338     363 (.783)  378     149 (.808)  147     168 (.808)  166
       .20   1489 (.792) 1531    1675 (.792) 1723     653 (.797)  659     734 (.797)  741
  .20  .10   1000 (.783) 1050    1125 (.786) 1169     453 (.801)  453     509 (.799)  514

                       p_e = 0.5                              p_e = 0.7
                p_m = 0.1          p_m = 0.2            p_m = 0.1          p_m = 0.2
  p_1  p_2   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}
  .50  .10     45 (.822)   43      50 (.815)   49      55 (.825)   53      62 (.822)   60
       .20     83 (.810)   81      93 (.807)   92      99 (.808)   98     112 (.809)  111
       .30    185 (.802)  185     208 (.801)  208     222 (.802)  222     249 (.801)  249
       .40    723 (.801)  723     813 (.801)  813     862 (.801)  862     969 (.801)  969
  .40  .10     72 (.830)   67      80 (.824)   76      85 (.821)   83      95 (.816)   92
       .20    165 (.807)  163     185 (.805)  183     199 (.807)  196     224 (.810)  220
       .30    667 (.800)  667     750 (.800)  750     800 (.802)  800     900 (.802)  900
  .30  .10    129 (.813)  126     145 (.812)  142     159 (.820)  152     179 (.820)  171
       .20    558 (.802)  558     628 (.803)  628     674 (.806)  668     758 (.807)  751
  .20  .10    394 (.810)  385     443 (.809)  435     480 (.813)  468     540 (.814)  525

Table 2: The approximate sample size n_a (2.9), its corresponding power (in parentheses), and the final estimate of the minimum required sample size n{k*}, for a desired power of 0.80 of rejecting the null hypothesis H_0: p_1 = p_2 at the 5% level (one-sided test).

These results not only provide us with insight into the effects of the different parameters on the power, but also allow us to find the minimum required sample size for a desired power of 80% at a nominal 0.05 level. For example, for the case of p_e = 0.10, p_m = 0.10, p_1 = 0.40, and p_2 = 0.10, Table 1 shows that the required sample size for a desired power of 0.80 for the one-sided test, by linear interpolation, is approximately 178 (= 180 − 20 × (0.05/0.47)). This is actually identical to the minimum required sample size estimate n{k*} found using Monte Carlo simulation in Table 2. We also see that the estimated required sample size, using either n_a (2.9) or n{k*} (Table 2), tends to reach its minimum at p_e = 0.50. This is consistent with the well-known fact that, for a fixed total sample size, equal sample allocation is generally optimal for maximizing the power in comparison studies.
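For reference, the linear interpolation quoted above uses the Table 1 powers 0.758 (at n = 160) and 0.805 (at n = 180) for this configuration:
$$n \approx 180 - 20 \times \frac{0.805 - 0.80}{0.805 - 0.758} = 180 - 20 \times \frac{0.005}{0.047} \approx 178.$$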

                       p_e = 0.1                              p_e = 0.3
                p_m = 0.1          p_m = 0.2            p_m = 0.1          p_m = 0.2
  p_1  p_2   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}
  .50  .10    123 (.750)  137     138 (.750)  153      64 (.827)   62      72 (.826)   69
       .20    256 (.782)  270     288 (.775)  306     116 (.796)  117     130 (.794)  131
       .30    612 (.788)  636     688 (.789)  712     272 (.803)  270     305 (.802)  305
       .40   2489 (.796) 2513    2800 (.797) 2828    1075 (.799) 1085    1209 (.799) 1221
  .40  .10    200 (.771)  214     225 (.775)  241      97 (.812)   95     109 (.813)  106
       .20    523 (.783)  548     588 (.782)  613     234 (.795)  236     263 (.796)  267
       .30   2267 (.794) 2311    2550 (.793) 2600     986 (.799)  995    1109 (.798) 1120
  .30  .10    378 (.768)  408     425 (.770)  461     178 (.798)  180     200 (.797)  202
       .20   1845 (.788) 1917    2075 (.789) 2135     812 (.797)  820     913 (.796)  922
  .20  .10   1223 (.781) 1295    1375 (.781) 1453     556 (.797)  561     625 (.797)  631

                       p_e = 0.5                              p_e = 0.7
                p_m = 0.1          p_m = 0.2            p_m = 0.1          p_m = 0.2
  p_1  p_2   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}   n_a         n{k*}
  .50  .10     56 (.837)   53      63 (.835)   59      67 (.831)   64      75 (.826)   72
       .20    100 (.805)   99     113 (.805)  112     123 (.815)  120     138 (.812)  135
       .30    229 (.801)  229     258 (.802)  258     276 (.805)  274     310 (.805)  310
       .40    907 (.801)  907    1020 (.801) 1020    1080 (.801) 1080    1215 (.800) 1215
  .40  .10     85 (.821)   82      95 (.817)   92     106 (.832)  100     119 (.828)  113
       .20    203 (.805)  201     228 (.805)  226     247 (.811)  243     278 (.810)  272
       .30    836 (.801)  836     940 (.801)  940    1004 (.803) 1004    1129 (.804) 1129
  .30  .10    160 (.818)  154     180 (.819)  174     198 (.828)  188     223 (.827)  211
       .20    696 (.802)  696     783 (.803)  783     844 (.808)  836     949 (.807)  940
  .20  .10    487 (.809)  479     548 (.808)  538     600 (.817)  582     675 (.816)  651

Table 3: The approximate sample size n_a (2.9), its corresponding power (in parentheses), and the final estimate of the minimum required sample size n{k*}, for a desired power of 0.80 of rejecting the null hypothesis H_0: p_1 = p_2 at the 5% level (two-sided test).

Tables 2 and 3 demonstrate that the approximate sample size formula n_a (2.9) agrees well with the minimum required sample size estimate n{k*} needed for a desired power in most situations. Thus, we can expedite the search for the minimum required sample size by using this approximate sample size n_a as an initial estimate and applying the trial-and-error procedure.

In summary, this paper has developed a sample size calculation procedure for a desired power 1 − β at a given α-level for comparing two independent proportions under multinomial sampling in the presence of random loss.

This paper has presented an approximate sample size formula and found that this approximation can be quite accurate in almost all the situations considered here. The results and discussion presented here should be of use to biostatisticians, epidemiologists, and clinicians who wish to employ multinomial sampling to collect subjects, each of whom is subject to a random exclusion from the study.

Acknowledgements

The authors wish to thank the anonymous referee for many valuable comments that improved the clarity and scope of this paper, especially for the suggestion of the approximate sample size formula considered here, and Ms. Ying Ying Ma for computational assistance in the estimation of the required sample size.

References

Bennett, B. and Hsu, P. (1960). On the power function of the exact test for the 2x2 contingency table. Biometrika, 47:393-398.

Bishop, Y., Fienberg, S., and Holland, P. (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge.

Casagrande, J., Pike, M., and Smith, P. (1978a). The power function of the exact test for comparing two binomial distributions. Applied Statistics, 27:176-180.

Casagrande, J., Pike, M., and Smith, P. (1978b). An improved approximate formula for comparing two binomial distributions. Biometrics, 34:483-486.

Fisher, R. (1935). The logic of inductive inference. Journal of the Royal Statistical Society, Series A, 98:39-54.

Fleiss, J. (1981). Statistical Methods for Rates and Proportions, 2nd edn. Wiley and Sons, New York.

Fleiss, J. L., Tytun, A., and Ury, H. K. (1980). A simple approximation for calculating sample sizes for comparing independent proportions. Biometrics, 36:343-346.

Gail, M. and Gart, J. (1973). The determination of sample sizes for use with the exact conditional test in 2x2 comparative trials. Biometrics, 29:441-448.

Gordon, I. (1994). Sample size for two independent proportions: a review. Australian Journal of Statistics, 36:199-209.

Haseman, J. (1978). Exact sample sizes for use with the Fisher-Irwin test for 2x2 tables. Biometrics, 34:106-109.

Irwin, J. D. (1935). Test of significance for differences between percentages based on small numbers. Metron, 12:83-94.

Lui, K.-J. (1994). The effect of retaining probability variation on sample size calculations for normal variates. Biometrics, 50:297-300.

Sahai, H. and Khurshid, A. (1996). Formulas and tables for determination of sample sizes and power in clinical trials for testing differences in proportions for the two-sample design: a review. Statistics in Medicine, 15:1-21.

Skalski, J. (1992). Sample size calculations for normal variates under binomial censoring. Biometrics, 48:877-882.

Yates, F. (1934). Contingency tables involving small numbers and the χ² test. Journal of the Royal Statistical Society, Supplement 1, pp. 217-235.