B. Maddah INDE 504 Discrete-Event Simulation

Output Analysis (3)

Variance Reduction

Variance reduction techniques (VRTs) are methods that reduce the variance (i.e., increase the precision) of simulation output without doing more runs. They work by setting up the simulation so that it exploits correlations between random variables to reduce variability.

VRTs are not free, though. E.g., some additional programming effort may be needed.

The basic relation used in a VRT is

$$\mathrm{var}(aX + bY) = a^2\,\mathrm{var}(X) + b^2\,\mathrm{var}(Y) + 2ab\,\mathrm{cov}(X, Y).$$

If the last term is negative, then the variance of $aX + bY$ is smaller than it would be if $X$ and $Y$ were uncorrelated.

Recall that the sample covariance based on $n$ observations of $X$ and $Y$ is

$$\widehat{\mathrm{cov}}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}.$$
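As a quick numerical check of the basic relation and the sample covariance formula, the following sketch may help. The data, the seed, and the weights a = b = 0.5 are illustrative assumptions, not taken from the notes; Y is deliberately constructed to move opposite to X so that the covariance term is negative.

```python
import random

def sample_cov(xs, ys):
    """Sample covariance of paired observations, using the n - 1 divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_var(xs):
    """Sample variance, i.e. the sample covariance of a sequence with itself."""
    return sample_cov(xs, xs)

rng = random.Random(12345)  # arbitrary seed, for reproducibility only
n = 1000
xs = [rng.random() for _ in range(n)]
# Illustrative assumption: ys moves opposite to xs, so cov(X, Y) < 0.
ys = [1.0 - x + 0.1 * rng.random() for x in xs]

a, b = 0.5, 0.5
combo = [a * x + b * y for x, y in zip(xs, ys)]

lhs = sample_var(combo)  # var(aX + bY) computed directly
no_cov = a**2 * sample_var(xs) + b**2 * sample_var(ys)  # first two terms only
rhs = no_cov + 2 * a * b * sample_cov(xs, ys)  # full right-hand side

print(f"var(aX + bY) directly      : {lhs:.6f}")
print(f"right-hand side of relation: {rhs:.6f}")
print(f"without the covariance term: {no_cov:.6f}")
```

The first two printed values should agree up to rounding, and the third should be visibly larger, because the negative covariance term is what pulls the variance down.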
Variance reduction using common random numbers (CRN)

This technique is used when comparing the simulation outputs of two systems. The idea is to use the same stream of random numbers for both systems. Intuitively, this amounts to comparing the two systems under similar conditions.

Suppose that $n$ iid observations are available from the output of each system, $X_{11}, X_{12}, \ldots, X_{1n}$ and $X_{21}, X_{22}, \ldots, X_{2n}$. Consider the paired differences $Z_j = X_{1j} - X_{2j}$, $j = 1, \ldots, n$. Then,

$$\mathrm{var}(Z_j) = \mathrm{var}(X_{1j}) + \mathrm{var}(X_{2j}) - 2\,\mathrm{cov}(X_{1j}, X_{2j}).$$

This implies that $\mathrm{var}(Z_j)$ is reduced if $X_{1j}$ and $X_{2j}$ are positively correlated (i.e., $\mathrm{cov}(X_{1j}, X_{2j}) > 0$). This can be achieved by using the same stream of random numbers to generate $X_{1j}$ and $X_{2j}$.

One critical point is synchronization, that is, using each random number for the same purpose in both simulations. One way to achieve this is to dedicate a separate stream to each source of randomness (e.g., one for arrival times and one for service times).
In addition, synchronization is easier with a random variate generation method that uses exactly one U(0,1) random number to produce one variate $X$, and whose transformation $U \to X$ is monotone. The inverse transform method is therefore highly desirable here.

CRN works well if the two systems under comparison react in a similar way to changes in the underlying random number streams. If not, we could get $\mathrm{cov}(X_{1j}, X_{2j}) < 0$ and the method backfires.

Example 5

CRN was applied to compare the two ATM configurations in Example 1 based on 100 replications. CRN was applied in different ways: by synchronizing arrival times only (A), service times only (S), and both arrival and service times. The results were as follows. (I means no synchronization, e.g., using one random number stream for all purposes.)
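Separately from Example 5, the CRN recipe described in this section (dedicated streams, inverse transform, paired differences) can be sketched on a toy model. Everything specific here is an assumption made for illustration and is not the ATM model of Example 1: a single-server FIFO queue with exponential interarrival times (mean 1.0), two hypothetical configurations with mean service times 0.9 and 0.8, the average delay of the first 200 customers as output, and the seeding scheme.

```python
import math
import random

def exp_variate(u, mean):
    """Inverse transform: one U(0,1) number in, one exponential variate out (monotone in u)."""
    return -mean * math.log(1.0 - u)

def avg_delay(arrival_seed, service_seed, mean_service, m=200):
    """Average delay in queue of the first m customers (single server, FIFO)."""
    arrivals = random.Random(arrival_seed)  # dedicated stream: interarrival times
    services = random.Random(service_seed)  # dedicated stream: service times
    wait, total = 0.0, 0.0
    for _ in range(m):
        total += wait
        s = exp_variate(services.random(), mean_service)
        a = exp_variate(arrivals.random(), 1.0)  # mean interarrival time 1.0
        wait = max(0.0, wait + s - a)  # Lindley recursion: delay of the next customer
    return total / m

def sample_var(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def paired_differences(n, crn):
    """Z_j = X_1j - X_2j for n replications, with or without common random numbers."""
    z = []
    for j in range(n):
        seeds_1 = (2 * j, 2 * j + 1)
        # CRN: system 2 reuses system 1's seeds; otherwise it gets unrelated seeds.
        seeds_2 = seeds_1 if crn else (100_000 + 2 * j, 100_000 + 2 * j + 1)
        x1 = avg_delay(*seeds_1, mean_service=0.9)  # hypothetical system 1: slower server
        x2 = avg_delay(*seeds_2, mean_service=0.8)  # hypothetical system 2: faster server
        z.append(x1 - x2)
    return z

n = 100
z_indep = paired_differences(n, crn=False)
z_crn = paired_differences(n, crn=True)

print(f"independent streams: mean Z = {sum(z_indep) / n:.3f}, var Z = {sample_var(z_indep):.3f}")
print(f"common random nos. : mean Z = {sum(z_crn) / n:.3f}, var Z = {sample_var(z_crn):.3f}")
```

Because each source of randomness has its own stream and each variate consumes exactly one random number, the j-th customer sees the same underlying U in both systems, which is what keeps the two runs synchronized.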
Variance reduction using antithetic variates (AV)

This method is used for variance reduction of a single system (in order to get a more precise output). Like CRN, AV works by recycling random numbers.

The idea is to build replication $j$ from a pair of runs. One run uses random numbers $U_{j1}, U_{j2}, \ldots, U_{jm}$ to estimate a measure of performance $X_j^{(1)}$, and the other run uses random numbers $1 - U_{j1}, 1 - U_{j2}, \ldots, 1 - U_{jm}$ to estimate the same measure, $X_j^{(2)}$. Then, use $X_j = (X_j^{(1)} + X_j^{(2)})/2$ as the output of replication $j$.

Using $U$ and $1 - U$ is intended to induce negative correlation between $X_j^{(1)}$ and $X_j^{(2)}$. The variance of $X_j$ is then less than the variance of either $X_j^{(1)}$ or $X_j^{(2)}$ alone, and also less than the variance of the average of two independent runs, since

$$\mathrm{var}(X_j) = \frac{\mathrm{var}(X_j^{(1)}) + \mathrm{var}(X_j^{(2)}) + 2\,\mathrm{cov}(X_j^{(1)}, X_j^{(2)})}{4}$$

and the covariance term is negative.

The intuition behind AV is that counterbalancing a large observation with a small one leads somewhere close to the mean.

Note that AV requires synchronization, like CRN, so that $U$ and $1 - U$ are used for the same purpose.
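The pairing of U with 1 - U can be sketched on a toy model. The model is an assumption chosen for illustration, not something specified in these notes: a single-server FIFO queue with exponential interarrival times (mean 1.0) and service times (mean 0.8), with the average delay of the first 100 customers as the performance measure. The seeds and the helper names are likewise hypothetical.

```python
import math
import random

def open_uniform(rng):
    """U(0,1) number strictly inside (0, 1), so both U and 1 - U are safe for log()."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def exp_variate(u, mean):
    """Inverse transform: one U(0,1) number in, one exponential variate out."""
    return -mean * math.log(1.0 - u)

def avg_delay(seed, antithetic, m=100, mean_service=0.8):
    """Average delay of the first m customers; antithetic=True replaces every U by 1 - U."""
    arrivals = random.Random(seed)            # dedicated stream: interarrival times
    services = random.Random(seed + 500_000)  # dedicated stream: service times
    wait, total = 0.0, 0.0
    for _ in range(m):
        total += wait
        ua, us = open_uniform(arrivals), open_uniform(services)
        if antithetic:
            ua, us = 1.0 - ua, 1.0 - us  # same purpose, complementary random number
        wait = max(0.0, wait + exp_variate(us, mean_service) - exp_variate(ua, 1.0))
    return total / m

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

n = 400
x1 = [avg_delay(j, antithetic=False) for j in range(n)]  # runs driven by U
x2 = [avg_delay(j, antithetic=True) for j in range(n)]   # runs driven by 1 - U
xj = [(p + q) / 2 for p, q in zip(x1, x2)]                # output of replication j

v1, v2, c12 = sample_cov(x1, x1), sample_cov(x2, x2), sample_cov(x1, x2)
var_av = sample_cov(xj, xj)
var_if_uncorrelated = (v1 + v2) / 4  # what the pair average would give with zero covariance

print(f"cov(X1, X2)             = {c12:.4f}")
print(f"var(X_j) with AV        = {var_av:.4f}")
print(f"(var X1 + var X2) / 4   = {var_if_uncorrelated:.4f}")
```

Reusing the same seed for both runs of a pair is what provides the synchronization: the k-th interarrival time is generated from U in one run and from 1 - U in the other.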
Variance reduction using control variates (CV)

This method works by relating the random variable of interest, $X$ (for which we wish to estimate the mean), to another random variable, $Y$, with known mean. E.g., in a queueing simulation, $X$ could be the delay time and $Y$ the service time.

The idea is to use our knowledge of $E[Y] = v$ to control $X$ when $X$ and $Y$ are correlated. Specifically, define

$$X_C = X - a(Y - v).$$

Then, $E[X_C] = E[X]$ for any constant $a$. The choice of $a$ can be made in a way that minimizes $\mathrm{var}[X_C]$. Note that

$$\mathrm{var}[X_C] = \mathrm{var}[X] + a^2\,\mathrm{var}[Y] - 2a\,\mathrm{cov}(X, Y).$$

This is a second-degree polynomial in $a$ whose minimum is attained at

$$a^* = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}[Y]}.$$

The issue now is how to estimate $\mathrm{cov}(X, Y)$, or even $\mathrm{var}[Y]$. Often this can only be done from sample data, which tends to bias the estimation of $E[X]$ through $E[X_C]$.
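A minimal sketch of the control variate adjustment follows, using the delay and service time pairing mentioned above. The model details are assumptions for illustration only: a single-server FIFO queue with exponential interarrival times (mean 1.0) and service times (mean v = 0.8), X the average delay of the first 100 customers, and Y the average of their service times. The coefficient a* is estimated from the same sample, which is exactly the situation in which the bias noted above can arise.

```python
import math
import random

MEAN_SERVICE = 0.8  # known mean v = E[Y] of the control variate (assumed model parameter)

def exp_variate(u, mean):
    """Inverse transform: one U(0,1) number in, one exponential variate out."""
    return -mean * math.log(1.0 - u)

def one_replication(seed, m=100):
    """Return (X, Y): average delay and average service time of the first m customers."""
    arrivals = random.Random(seed)            # dedicated stream: interarrival times
    services = random.Random(seed + 500_000)  # dedicated stream: service times
    wait, total_wait, total_service = 0.0, 0.0, 0.0
    for _ in range(m):
        total_wait += wait
        s = exp_variate(services.random(), MEAN_SERVICE)
        a = exp_variate(arrivals.random(), 1.0)
        total_service += s
        wait = max(0.0, wait + s - a)  # Lindley recursion: delay of the next customer
    return total_wait / m, total_service / m

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

n = 200
pairs = [one_replication(j) for j in range(n)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]

var_x, var_y, cov_xy = sample_cov(xs, xs), sample_cov(ys, ys), sample_cov(xs, ys)
a_hat = cov_xy / var_y  # sample estimate of a* = cov(X, Y) / var[Y]

# Controlled observations X_C = X - a (Y - v)
xc = [x - a_hat * (y - MEAN_SERVICE) for x, y in zip(xs, ys)]
var_xc = sample_cov(xc, xc)

print(f"estimated a*      = {a_hat:.3f}")
print(f"mean X   = {sum(xs) / n:.4f}, var X   = {var_x:.4f}")
print(f"mean X_C = {sum(xc) / n:.4f}, var X_C = {var_xc:.4f}")
```

With a* estimated from the same data, the sample variance of X_C equals var X minus cov(X, Y)^2 / var Y by construction, so it cannot exceed the sample variance of X. One common way to sidestep the bias is to estimate a* from separate pilot runs, at the cost of extra replications.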