OPERATIONS RESEARCH, doi /opre ec, pp. ec1-ec12
e-companion: ONLY AVAILABLE IN ELECTRONIC FORM
2009 INFORMS

Electronic Companion: "Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations" by K. D. Glazebrook, C. Kirkbride, and J. Ouenniche, Operations Research, doi /opre

Contents of an on-line appendix to "Index policies for admission control and routing of impatient customers to heterogeneous service stations"

K.D. Glazebrook Ψ, C. Kirkbride Ψ and J. Ouenniche

Department of Mathematics and Statistics, Lancaster University, LA1 4YF, U.K.
Ψ Department of Management Science, Lancaster University, LA1 4YX, U.K.
School of Management, University of Edinburgh, Edinburgh, EH8 9JY, U.K.

A1 Indices for service stations

The reader is referred to the opening paragraphs of Section 4. We begin with a demonstration of the existence of a monotone policy which is optimal for $P(W)$. We shall use $R^u(W)$ for the average net reward rate achieved by stationary policy $u$ when the compensating reward for rejection is $W - D + C$. Plainly, the system state process $\{N(t), t \in \mathbb{R}^+\}$ evolving under stationary policy $u$ is Markov. Recall the definition of $B(u)$ in (13).

Lemma A1 For stationary policies $u_1, u_2$,
\[
B(u_1) = B(u_2) \implies R^{u_1}(W) = R^{u_2}(W), \quad W \in \mathbb{R}.
\]

Proof The proof is in two stages. First, we show that under any stationary policy $u$, any state $n > B(u)$ is transient and all states $n \le B(u)$ are positive recurrent. Second, we use this information to calculate $R^u(W)$.

Consider any $n > B(u)$. Define $\underline{B}(u, n)$ and $\overline{B}(u, n)$ as
\[
\underline{B}(u, n) = \max\{n';\ n' < n \text{ and } u(n') = b\}, \qquad \overline{B}(u, n) = \inf\{n';\ n' \ge n \text{ and } u(n') = b\}.
\]
Note that we must have $\underline{B}(u, n) \ge B(u)$, though $\overline{B}(u, n)$ may be infinite. Suppose that $N(0) = n$ and define a modified state process under policy $u$ in which state $\underline{B}(u, n)$ acts as a reflecting barrier with some (finite) occupancy rate. This modified process is restricted to $\{\underline{B}(u, n), \underline{B}(u, n) + 1, \ldots, \overline{B}(u, n)\}$ and from C1 is evidently ergodic. Hence entry for the modified process to state $\underline{B}(u, n)$ is certain in finite time. The same must be true for the

original (unmodified) process under policy $u$. However $u\{\underline{B}(u, n)\} = b$ and so plainly once the process has entered $\underline{B}(u, n)$, any subsequent return to $n$ under policy $u$ is impossible. Hence $n$ is transient.

Suppose now that $N(0) = n \le B(u)$. By the definition of $B(u)$ the state process evolving under $u$ is restricted to $\{0, 1, \ldots, B(u)\}$. Standard theory indicates that the process is ergodic with stationary distribution given by
\[
\Pi^u_x = \lambda^x \{M(x)\}^{-1} \Pi^u_0, \quad 0 \le x \le B(u), \tag{A1}
\]
where
\[
M(x) = \prod_{y=1}^{x} (\mu_y + \theta_y)
\]
and
\[
\Pi^u_0 = \left[ \sum_{x=0}^{B(u)} \lambda^x \{M(x)\}^{-1} \right]^{-1}. \tag{A2}
\]
Note that condition C1 guarantees that this stationary distribution exists even when $B(u)$ is infinite. It follows that state $n$ is positive recurrent. Note now that the stationary distribution $\{\Pi^u_x, 0 \le x \le B(u)\}$ in (A1), (A2) only depends on the policy $u$ through the value of $B(u)$. It then follows that for stationary policies $u_1, u_2$
\[
B(u_1) = B(u_2) = N \implies R^{u_1}(W) = R^{u_2}(W) = (R + C) \sum_{x=1}^{N} \mu_x \Pi^N_x + (W - D + C)\lambda \Pi^N_N
\]
and the result follows.

The existence of an optimal policy for $P(W)$ in the monotone class is a trivial consequence of Lemma A1.

We now proceed to give a series of results which demonstrate that the algorithm for index computation given in the paper following the proof of Theorem 1 is indeed correct. Before proceeding, we extend the definitions of the quantities $W_k$ and $N_k$, computed in the algorithm, as follows:

Step 1 If $N_1 = \infty$, we define $W_n = W_1$, $n \ge 1$ and $N_n = \infty$, $n \ge 1$.

Step k If $N_k = \infty$, we define $W_n = W_k$, $n \ge k$ and $N_n = \infty$, $n \ge k$.

We also introduce the constant $\underline{W}$ defined as
\[
\underline{W} = D - C + (R + C)\lambda^{-1} \inf_{n \ge 0} \left\{ (\mu^{\infty} - \mu^{n}) (\Pi^n_n)^{-1} \right\}, \tag{A3}
\]
and we set $N_0 = 0$.
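The stationary distribution (A1), (A2) and the reward rate of a monotone policy $N$ are easy to evaluate numerically. The following is a minimal sketch, not taken from the paper: the function names and the convention that `mu[y]` and `theta[y]` hold the rates at head count `y` (with index 0 unused) are illustrative assumptions.

```python
def stationary_distribution(lam, mu, theta, N):
    """Return [Pi^N_0, ..., Pi^N_N] from (A1)-(A2) for monotone policy N.

    mu[y], theta[y] are the service and loss rates at head count y
    (index 0 is unused); this array layout is an assumption of the sketch.
    """
    weights = [1.0]  # lambda^x / M(x), built up recursively
    for x in range(1, N + 1):
        weights.append(weights[-1] * lam / (mu[x] + theta[x]))
    total = sum(weights)
    return [w / total for w in weights]


def reward_rate(lam, mu, theta, N, W, R, C, D):
    """R^N(W) = (R + C) * sum_x mu_x Pi^N_x + (W - D + C) * lam * Pi^N_N."""
    pi = stationary_distribution(lam, mu, theta, N)
    throughput = sum(mu[x] * pi[x] for x in range(1, N + 1))
    return (R + C) * throughput + (W - D + C) * lam * pi[N]
```

For $N = 0$ the station is always closed, the distribution is concentrated on the empty state, and the reward rate reduces to $(W - D + C)\lambda$, which is the left-hand side of the comparison used for Lemma A3(i).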

Lemma A2 (i) $\{W_n, n \in \mathbb{Z}^+\}$ is a decreasing sequence of reals, bounded below by $D - C$ and for which
\[
N_k < \infty \implies W_k > W_{k+1}; \tag{A4}
\]
(ii) $\{N_n, n \in \mathbb{Z}^+\}$ is an increasing sequence of members of $\mathbb{N} \cup \{\infty\}$.

Proof For (i), suppose that $N_k < \infty$, $W_k \le W_{k+1}$ and obtain a contradiction. If $N_k < \infty$ then
\[
(W_k - D + C)\lambda(R + C)^{-1} = \left( \mu^{N_k} - \mu^{N_{k-1}} \right) \left( \Pi^{N_{k-1}}_{N_{k-1}} - \Pi^{N_k}_{N_k} \right)^{-1} = \max_{n \ge N_{k-1}+1} \left\{ \left( \mu^{n} - \mu^{N_{k-1}} \right) \left( \Pi^{N_{k-1}}_{N_{k-1}} - \Pi^{n}_{n} \right)^{-1} \right\} \tag{A5}
\]
where we take $N_0 = 0$ in (A5) for the case $k = 1$. However, $W_{k+1} \ge W_k$ implies that
\[
\left( \mu^{N_{k+1}} - \mu^{N_k} \right) \left( \Pi^{N_k}_{N_k} - \Pi^{N_{k+1}}_{N_{k+1}} \right)^{-1} \ge \left( \mu^{N_k} - \mu^{N_{k-1}} \right) \left( \Pi^{N_{k-1}}_{N_{k-1}} - \Pi^{N_k}_{N_k} \right)^{-1}
\]
and hence, by straightforward algebra, that
\[
\left( \mu^{N_{k+1}} - \mu^{N_{k-1}} \right) \left( \Pi^{N_{k-1}}_{N_{k-1}} - \Pi^{N_{k+1}}_{N_{k+1}} \right)^{-1} \ge \left( \mu^{N_k} - \mu^{N_{k-1}} \right) \left( \Pi^{N_{k-1}}_{N_{k-1}} - \Pi^{N_k}_{N_k} \right)^{-1}, \tag{A6}
\]
where if $N_{k+1} = \infty$, the l.h.s. of the two preceding inequalities are to be understood as limits. However, inequality (A6) contradicts the status of $N_k$ as the largest maximiser on the r.h.s. of (A5). This establishes (A4). That $W_k \ge D - C$, $k \in \mathbb{Z}^+$, is easily seen from the construction of the sequence and the fact that $\{\mu^N, N \in \mathbb{N}\}$ is increasing. Part (i) is proved. Part (ii) follows trivially from the construction in the algorithm. This concludes the proof.

We now describe the optimal monotone policies for $P(W)$, $W \in \mathbb{R}$.

Lemma A3 (i) Monotone policy 0 is strictly optimal for $W > W_1$;
(ii) If $N_k < \infty$ then monotone policy $N_k$ is strictly optimal for $W_k > W > W_{k+1}$, $k \ge 1$;
(iii) The monotone policy $\infty$ which accepts all incoming customers is strictly optimal for $\underline{W} > W > -\infty$.

Proof For (i), simply observe that from Step 1 of the index algorithm it follows that for $W > W_1$,
\[
(W - D + C)\lambda = (W - D + C)\lambda \Pi^0_0 > (R + C)\mu^n + (W - D + C)\lambda \Pi^n_n = R^n(W), \quad n \ge 1, \tag{A7}
\]
and hence monotone policy 0 is strictly optimal for this range of $W$.

We establish (ii) by induction. Suppose that $N_1 < \infty$ and observe from Step 1 that
\[
(W_1 - D + C)\lambda = (R + C)\mu^{N_1} \left( 1 - \Pi^{N_1}_{N_1} \right)^{-1} \ge (R + C)\mu^{n} (1 - \Pi^n_n)^{-1}, \quad n \ge 1.
\]
This in turn implies that
\[
(W_1 - D + C)\lambda = R^{N_1}(W_1) \ge R^n(W_1), \quad n \ge 0. \tag{A8}
\]
However, $\Pi^n_n > \Pi^{N_1}_{N_1}$, $n < N_1$, and this together with (A8) trivially yields that
\[
R^{N_1}(W) > R^n(W), \quad n < N_1, \quad W_1 > W > W_2. \tag{A9}
\]
Note also that from the characterisation of $W_2$ in the algorithm we have that
\[
R^{N_1}(W_2) \ge R^n(W_2), \quad n \ge N_1 + 1. \tag{A10}
\]
However, $\Pi^n_n < \Pi^{N_1}_{N_1}$, $n \ge N_1 + 1$, and this together with (A10) trivially yields that
\[
R^{N_1}(W) > R^n(W), \quad n \ge N_1 + 1, \quad W_1 > W > W_2. \tag{A11}
\]
Combining (A9) and (A11) we infer the strict optimality of policy $N_1$ in the range $W_1 > W > W_2$. This establishes the inductive hypothesis for the case $k = 1$.

We suppose now that the inductive hypothesis holds for $j \le k - 1$ where $k \ge 2$ and consider a set-up in which $N_k < \infty$. Strict optimality of policy $N_{k-1}$ through the range $W_{k-1} > W > W_k$ together with the fact that $N_k$ achieves the maximum in the characterisation of $W_k$ yield
\[
R^{N_{k-1}}(W_k) = R^{N_k}(W_k) \ge R^n(W_k), \quad n \ge 0. \tag{A12}
\]
However, $\Pi^n_n > \Pi^{N_k}_{N_k}$, $n < N_k$, which when combined with (A12) implies that
\[
R^{N_k}(W) > R^n(W), \quad n < N_k, \quad W_k > W > W_{k+1}. \tag{A13}
\]
Further, from the characterisation of $W_{k+1}$ we have that
\[
R^{N_k}(W) > R^n(W), \quad n \ge N_k + 1, \quad W_k > W > W_{k+1}. \tag{A14}
\]
Combining (A13) and (A14) we infer that monotone policy $N_k$ is strictly optimal for the range $W_k > W > W_{k+1}$. Hence the inductive hypothesis goes through and part (ii) is proved.

For part (iii) we note that, when $W > D - C$,
\[
\underline{W} > W > D - C \implies (R + C)(\mu^{\infty} - \mu^{n})(\Pi^n_n)^{-1} > \lambda(W - D + C) \implies R^{\infty}(W) > R^n(W), \quad n \in \mathbb{N},
\]
and hence that policy $\infty$ is optimal in this $W$-range. That $\infty$ is also optimal when $D - C \ge W$ is trivial. This proves (iii) and completes the proof of the result.

Lemma A4 $\lim_{k \to \infty} W_k = \underline{W}$.

Proof For the proof we shall suppose that $N_k < \infty$, $k \in \mathbb{Z}^+$. Cases for which $N_K = \infty$ for some $K < \infty$ are dealt with similarly. Since, from part (i) of Lemma A2, the sequence $\{W_n, n \in \mathbb{Z}^+\}$ is decreasing with all members no less than $D - C$, it must tend to a finite limit $\widetilde{W}$, say. We suppose $\widetilde{W} \ne \underline{W}$ and obtain a contradiction. We consider two cases in turn.

CASE 1: $\widetilde{W} > \underline{W}$. In this event there exists $W$ such that
\[
(\widetilde{W} - D + C)\lambda > (W - D + C)\lambda > (R + C) \inf_{n \ge 0} \left\{ (\mu^{\infty} - \mu^{n}) (\Pi^n_n)^{-1} \right\}. \tag{A15}
\]
It follows from (A15) that for any such $W$ there exists $N < \infty$ for which $R^N(W) > (R + C)\mu^{\infty}$ and hence that there exists $\tilde{N} < \infty$ which maximises $R^N(W)$ over $N$. However, since $N_k \ge k$ (and hence $N_k$ diverges as $k \to \infty$) it follows from Lemma A3 that there must exist $\hat{W} > \widetilde{W} > W$ such that the monotone policy $\hat{N}$ is strictly optimal at $\hat{W}$ and $\hat{N} > \tilde{N}$. However, the resulting inequalities yield the conclusion
\[
R^{\hat{N}}(\hat{W}) > R^{\tilde{N}}(\hat{W}) \text{ and } R^{\hat{N}}(W) \le R^{\tilde{N}}(W) \implies \Pi^{\hat{N}}_{\hat{N}} > \Pi^{\tilde{N}}_{\tilde{N}}, \quad \hat{N} > \tilde{N},
\]
which is inconsistent with the decreasing nature of $\{\Pi^N_N, N \in \mathbb{N}\}$. This is the required contradiction.

CASE 2: $\widetilde{W} < \underline{W}$. For this case, it follows from Lemma A3 and the discussion thereafter that when $W = \underline{W}$ there exists $N < \infty$ such that the monotone policy $N$ is optimal. It follows that
\[
R^N(\underline{W}) = (R + C)\mu^N + (\underline{W} - D + C)\lambda \Pi^N_N > (R + C)\mu^{\infty}
\]
from which we conclude that
\[
\underline{W} - D + C = (R + C)\lambda^{-1} \inf_{n \ge 0} \left\{ (\mu^{\infty} - \mu^{n}) (\Pi^n_n)^{-1} \right\} > (R + C)\lambda^{-1} \left\{ (\mu^{\infty} - \mu^{N}) \left( \Pi^N_N \right)^{-1} \right\}. \tag{A16}
\]
Inequality (A16) yields the necessary contradiction in this case. This completes the proof.

Comment Lemma A3 describes solutions to $P(W)$ when $W$ lies in one of the open intervals
\[
(-\infty, \underline{W}) \cup \bigcup_{k \ge 1} \{(W_{k+1}, W_k)\} \cup (W_1, \infty).
\]
Suppose now that $W = W_k$ and that $N_k < \infty$. From Lemma A3(ii) it is plain that both monotone policies $N_{k-1}$ and $N_k$ are optimal for $W_k$. Now use $N_k^{\max}$ to denote the full set of values achieving the maximisation in the definition of $W_k$. It is straightforward to show that when $W = W_k$, monotone policies $n$ are optimal for all

$n \in N_k^{\max}$, as is monotone policy $N_{k-1}$. No other monotone policies are optimal for $W_k$. This statement, together with Lemma A3, gives a comprehensive account of the position regarding optimal monotone policies for all $W$, with the single exception of $W = \underline{W}$. When $W = \underline{W}$, there are two cases to consider. In the first, there is some $K < \infty$ for which $N_{K-1} < \infty$ and $N_K = \infty$, $W_K = \underline{W}$. As above, we use $N_K^{\max}$ for the full set of (finite) values achieving the maximum which yields $W_K$. It is straightforward to show that when $W = W_K = \underline{W}$, monotone policies $n$ are optimal for all $n \in N_K^{\max}$ as are the monotone policies $N_{K-1}$ and $\infty$. No other monotone policies are optimal. In the case in which $N_k < \infty$, $k \in \mathbb{Z}^+$, only monotone policy $\infty$ is optimal when $W = \underline{W}$.

We are now in a position to state that the Whittle indices are as asserted in (22) and (24).

Corollary A5 The Whittle index $W(\cdot) : \mathbb{N} \to \mathbb{R}$ is given as follows:
(i) If $N_k < \infty$, $k \in \mathbb{Z}^+$, then
\[
W(n) = W_k, \quad N_{k-1} \le n < N_k, \quad k \ge 1;
\]
(ii) If there exists $K < \infty$ such that $N_{K-1} < \infty$, $N_K = \infty$ then
\[
W(n) = \begin{cases} W_k, & N_{k-1} \le n < N_k, \quad K - 1 \ge k \ge 1, \\ W_K, & N_{K-1} \le n < \infty. \end{cases}
\]

Proof Consider first the case in which $N_k < \infty$, $k \in \mathbb{Z}^+$. From Lemma A3 and the above discussion we conclude that
\[
B(W) = \begin{cases} 0, & W \ge W_1, \\ N_k, & W_k > W \ge W_{k+1}, \quad k \ge 1, \\ \infty, & \underline{W} \ge W > -\infty. \end{cases} \tag{A17}
\]
When there exists $K < \infty$ for which $N_{K-1} < \infty$ and $N_K = \infty$ we have (supposing for the sake of presentation that $K \ge 2$) that
\[
B(W) = \begin{cases} 0, & W \ge W_1, \\ N_k, & W_k > W \ge W_{k+1}, \quad K - 1 > k \ge 1, \\ N_{K-1}, & W_{K-1} > W \ge \underline{W}, \\ \infty, & \underline{W} > W > -\infty. \end{cases} \tag{A18}
\]
In both cases $B(W)$ is decreasing in $W$ as is required for indexability. That the Whittle indices are as claimed is established by simple reference to Definition 2 in the paper. This concludes the proof.
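The characterisation (A5) and Corollary A5 together suggest a direct numerical procedure for the indices. The sketch below is an illustration under stated assumptions, not the paper's algorithm as printed: the buffer is truncated at a finite `n_max` (so the case $N_k = \infty$ cannot arise), ties in the maximisation are detected with a hypothetical tolerance `tol`, the rates $\mu_y + \theta_y$ are assumed nondecreasing in $y$ so that $\Pi^n_n$ is strictly decreasing, and all function names are invented for the example.

```python
def monotone_stats(lam, mu, theta, n_max):
    """Return (thr, blk) with thr[n] = mu^n = sum_x mu_x Pi^n_x and
    blk[n] = Pi^n_n for monotone policies n = 0, ..., n_max."""
    thr, blk = [0.0], [1.0]      # policy 0: no throughput, always blocking
    weights = [1.0]              # lambda^x / M(x) as in (A1)
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * lam / (mu[n] + theta[n]))
        total = sum(weights)
        thr.append(sum(mu[x] * weights[x] for x in range(1, n + 1)) / total)
        blk.append(weights[n] / total)
    return thr, blk


def whittle_indices(lam, mu, theta, R, C, D, n_max, tol=1e-12):
    """Return [W(0), ..., W(n_max - 1)] using (A5) on a truncated buffer."""
    thr, blk = monotone_stats(lam, mu, theta, n_max)
    index = [None] * n_max
    prev = 0                     # N_{k-1}, starting from N_0 = 0
    while prev < n_max:
        # Ratios on the r.h.s. of (A5) for n >= N_{k-1} + 1.
        ratios = {n: (thr[n] - thr[prev]) / (blk[prev] - blk[n])
                  for n in range(prev + 1, n_max + 1)}
        best = max(ratios.values())
        # N_k is the largest maximiser (up to the tolerance).
        n_k = max(n for n, v in ratios.items() if v >= best - tol)
        w_k = D - C + (R + C) * best / lam
        for n in range(prev, n_k):   # Corollary A5: W(n) = W_k on [N_{k-1}, N_k)
            index[n] = w_k
        prev = n_k
    return index
```

With a single server of rate 2, per-customer loss rate 0.3 and $\lambda = 3$ (illustrative values), the first index evaluates to $D - C + (R + C)\mu_1(\mu_1 + \theta_1)^{-1}$, the net return of a customer who finds the station empty.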

A2 Proof of Lemma 3

Noting the form of the index $W(n)$ given in (25), we write
\[
\lambda^{-1} \left( \mu^{n+1} - \mu^{n} \right) \left( \Pi^n_n - \Pi^{n+1}_{n+1} \right)^{-1} = \gamma(n) \{\gamma(n) + \delta(n)\}^{-1},
\]
where
\[
\gamma(n) = \sum_{x=0}^{n} (\mu_{x+1} - \mu_x) (\Pi^x_0)^{-1} \tag{A19}
\]
and
\[
\delta(n) = \sum_{x=0}^{n} (\theta_{x+1} - \theta_x) (\Pi^x_0)^{-1}. \tag{A20}
\]
We write $X = \inf\{x;\ \mu_x = \mu_{x+1}\}$, and consider two cases in turn. If $n \ge X$ then from C2 we have $\mu_{n+1} = \mu_n$ and from (A19) and (A20) we conclude that $\gamma(n + 1) = \gamma(n)$ and $\delta(n + 1) \ge \delta(n)$ and hence that
\[
\gamma(n + 1) \{\gamma(n + 1) + \delta(n + 1)\}^{-1} \le \gamma(n) \{\gamma(n) + \delta(n)\}^{-1}. \tag{A21}
\]
If $n \le X - 1$ then it follows trivially from (A19), (A20) and C3 that
\[
\delta(n)\{\gamma(n)\}^{-1} \le (\theta_{n+1} - \theta_n)(\mu_{n+1} - \mu_n)^{-1}.
\]
It is then either true that $n < X - 1$ and hence $\mu_{n+2} > \mu_{n+1}$ and
\[
(\theta_{n+2} - \theta_{n+1})(\mu_{n+2} - \mu_{n+1})^{-1} \ge (\theta_{n+1} - \theta_n)(\mu_{n+1} - \mu_n)^{-1} \ge \delta(n)\{\gamma(n)\}^{-1},
\]
or that $n = X - 1$ and hence $\mu_{n+2} = \mu_{n+1}$. Either way, straightforward algebra yields from (A19) and (A20) that
\[
\delta(n + 1)\{\gamma(n + 1)\}^{-1} \ge \delta(n)\{\gamma(n)\}^{-1} \implies \gamma(n + 1)\{\gamma(n + 1) + \delta(n + 1)\}^{-1} \le \gamma(n)\{\gamma(n) + \delta(n)\}^{-1}. \tag{A22}
\]
The result follows from (A21) and (A22).

A3 Notes on the performance bounds in Theorem 4

The performance guarantees given in Theorem 4 in the paper are derived from equation (34). This expression in turn is a consequence of the fact that our indices may be computed by a so-called adaptive greedy algorithm of the kind described by Bertsimas and Niño-Mora (1996) in the case of Gittins indices for branching bandit processes. What is needed here is an adaptation of the account of Niño-Mora (2001) for discounted reward restless bandits to our average reward models. First note that in the following account we shall assume without loss of generality that all stations are operated under FCFS. We shall write $r_m(n)$ for the

net expected return achieved by a customer admitted to station $m$ (operating under FCFS) when in state (head count) $n$. We deploy elements of the polyhedral approach of Niño-Mora (2001) to develop an expression for $r_m(n)$ in terms of the sequence of Whittle indices for station $m$.

Niño-Mora (2001) utilised a matrix $A$ in his account of a finite state discounted reward restless bandit model to develop Whittle indices via an adaptive greedy algorithm. We use the device of setting $\theta_{m,n} = \infty$, $n \ge Q$, $1 \le m \le M$, to develop a finite state approximation of our routing problem with state space $\{0, 1, \ldots, Q - 1\}^M$. In order to deploy Niño-Mora's approach, we first apply a uniformisation to this finite state process and multiply transition rates by factor $\Theta$, where
\[
\Theta \left( \lambda + \sum_{m=1}^{M} \mu_{m,Q-1} + \sum_{m=1}^{M} \theta_{m,Q-1} \right) = 1.
\]
Under such a rate adjustment, station $m$ in state $n > 0$ under the active action ($a$: station open) evolves to state $n - 1$ at rate $\Theta(\mu_{m,n} + \theta_{m,n})$ and to state $n + 1$ at rate $\Theta\lambda$. Under the passive action ($b$: station closed) evolution is to state $n - 1$ at rate $\Theta(\mu_{m,n} + \theta_{m,n})$.

Our concern is with the average reward criterion. We develop a matrix $\bar{A}$ from Niño-Mora's $A$ by applying conventional limiting arguments (as discount rate approaches one) to our uniformised finite state process and by exploiting the particular structure of our routing problem. We proceed as follows: Fix some station (whose identifier we now omit) and set $S(x) = \{x, x + 1, \ldots, Q - 1\}$ for some $0 \le x \le Q$, with $S(Q)$ being the empty set. Consider a corresponding admission control policy for the station in which the passive action (station closed) is taken in $S(x)$ and the active action taken in $\{S(x)\}^c \equiv \{0, 1, \ldots, x - 1\}$. We consider a cost for the station under such a policy which is simply the time for which the station is open. Write
\[
g\{S(x)\} \equiv g(x) = 1 - \Pi^x_x, \quad 0 \le x \le Q - 1, \tag{A23}
\]
for the average cost rate under policy $S(x)$, namely the long-run proportion of time for which the station is open.

Further, write $\nu\{S(x), n\} \equiv \nu(x, n)$, $0 \le n, x \le Q - 1$, for the relative cost (or bias) associated with state $n$ under policy $S(x)$. A simple calculation which exploits the fact that the state process regenerates upon each entry into the empty state (head count 0) yields the identity
\[
\nu(x, n + 1) - \nu(x, n) = -g(x)\{\Theta(\mu_{n+1} + \theta_{n+1})\}^{-1}, \quad 0 \le x \le n \le Q - 2. \tag{A24}
\]
Following Niño-Mora (2001), the appropriate choice of matrix entries in $\bar{A}$ is that $\bar{A}(x, n) - 1$ should be the difference in expected bias at the next random event in the uniformised process between choosing active and passive actions now in current state $n$, all biases being computed with respect to policy $S(x)$. It follows that
\[
\begin{aligned}
\bar{A}(x, n) = 1 &+ \Theta(\mu_n + \theta_n) I(n \ge 1)\, \nu(x, n - 1) + \Theta\lambda\, \nu(x, n + 1) + \{1 - \Theta(\mu_n + \theta_n) I(n \ge 1) - \Theta\lambda\}\, \nu(x, n) \\
&- \left[ \Theta(\mu_n + \theta_n) I(n \ge 1)\, \nu(x, n - 1) + \{1 - \Theta(\mu_n + \theta_n) I(n \ge 1)\}\, \nu(x, n) \right]
\end{aligned}
\]

\[
= 1 + \Theta\lambda\{\nu(x, n + 1) - \nu(x, n)\} = 1 - \lambda(1 - \Pi^x_x)(\mu_{n+1} + \theta_{n+1})^{-1}, \quad 0 \le x \le n \le Q - 2. \tag{A25}
\]
Matrix entries $\bar{A}(x, n)$ are now deployed together with conditional expected returns $r(n)$ in an adaptive greedy algorithm to give an index representation (alternative to that in Theorem 2) for states $\{0, 1, \ldots, Q - 2\}$. To achieve this, we develop the sequence $\{y_n, 0 \le n \le Q - 2\}$ as follows:
\[
y_0 = r(0)\{\bar{A}(0, 0)\}^{-1}; \tag{A26}
\]
\[
y_n = \left\{ r(n) - \sum_{x=0}^{n-1} \bar{A}(x, n)\, y_x \right\} \{\bar{A}(n, n)\}^{-1}, \quad 1 \le n \le Q - 2. \tag{A27}
\]
The Whittle index for head count $n$ is then recovered as
\[
W(n) = D + \sum_{x=0}^{n} y_x, \quad 0 \le n \le Q - 2. \tag{A28}
\]
From (A26)-(A28) it follows that
\[
r(n) = W(0) - D - \sum_{x=1}^{n} \bar{A}(x, n)\{W(x - 1) - W(x)\}, \quad 0 \le n \le Q - 2. \tag{A29}
\]
We now use the notation established in the introduction to Section 4 to extend equation (A29) in a way which yields an expression for the conditional expected return for all states of all stations. We write $R(n')$ for the conditional expected return upon entry for the state in position $n'$ of the full index ordering (i.e. with index $W(n')$). From (A29) we readily infer that
\[
R(n') = W(1) - D - \sum_{n=2}^{n'} A(n, n')\{W(n - 1) - W(n)\}, \quad 1 \le n' \le (Q - 1)M + 1, \tag{A30}
\]
where matrix $A$ is as given in Section 4. However, all quantities in (A29) and (A30) are free of any dependence on the proposed loss rate modification $\theta_{m,n} = \infty$, $n \ge Q$, $1 \le m \le M$. Hence (A30) extends to the range $1 \le n' < \infty$ in the obvious way.

Consider now the system operating under some stationary policy $u$, say, for admission control and routing with FCFS operating within each station. Write $\pi(n', u)$ for the long run proportion of time for which the $n'$th highest index state is active under policy $u$. From (A30) the average reward rate earned under policy $u$ is given by
\[
\begin{aligned}
R^u &= \lambda \sum_{n'=1}^{\infty} R(n')\, \pi(n', u) \\
&= \lambda \left[ W(1) - D - \sum_{n=2}^{\infty} \{W(n - 1) - W(n)\} \sum_{n'=n}^{\infty} A(n, n')\, \pi(n', u) \right] \\
&= \lambda \left[ W(1) - D - \sum_{n=2}^{\infty} \{W(n - 1) - W(n)\}\, \alpha(n, u) \right],
\end{aligned} \tag{A31}
\]

which coincides with (34) in the paper. However, none of the terms in (A31) depend upon the FCFS assumption and we infer that the formula gives a valid expression for $R^u$ for any discipline operating within the service stations. We now make some comments on the performance guarantees for the index policy given in Theorem 4.

Comments on the performance bounds (36)-(38)

1. The bound in (37) is similar in structure to the index-based performance guarantees which may be found, for example, in Glazebrook and Niño-Mora (2001) and Glazebrook, Niño-Mora and Ansell (2002). There they were used to explore the closeness-to-optimality of index policies for structured classes of decision processes related to Gittins multi-armed bandits. Indeed, Bertsimas and Niño-Mora (1996) argued the optimality of the Gittins index policy for branching bandits by elucidating measures of low index usage, $\beta(n, u)$ say, analogous to those in (33) for which it is true that
\[
\beta(n, \text{index}) = \inf_v \beta(n, v), \quad n \ge 2. \tag{A32}
\]
Thus for branching bandits, index policies always make minimal usage of low index states in the sense of (A32) and hence are optimal. The bound in (37) is an appropriate quantification of the extent to which that fails to happen here.

2. The bound in (37) involves the values of a collection of admission control/routing problems, namely
\[
\inf_v \alpha(n, v), \quad n \ge 2. \tag{A33}
\]
An analysis similar to that in Sections 3 and 4 can be applied to these cost minimising problems. It can be shown for these problems that stations are guaranteed indexable. For the minimisation in (A33) the states $\{1, 2, \ldots, n - 1\}$ (the first $n - 1$ in the index ordering for problem (7)) all have index 1 while the remainder have index 0. It is thus true that the index policy for the original problem in (7) can also be regarded as an index policy for all the optimization problems in (A33).

3. We note in the paper that $\alpha(n, u)$ is decreasing in $n$ for all $u \in U$. Our computational experience is that, for good policies and in many problem contexts, $\alpha(n, u)$ decreases to zero quickly. The values in Table A1, which apply to the $\lambda = 3$, $\theta = 0.3$ case in the paper's Table 1, are fairly typical.

Table A1: Measures of usage of low index states deployed in the suboptimality bounds (36)-(38). Details of problem given in text. Columns: $n$; $\alpha(n, \text{index})$; $\alpha(n, \text{opt})$; $\inf_v \alpha(n, v)$.
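The adaptive greedy recursion (A26)-(A28) earlier in this section, with matrix entries from (A25), can be exercised numerically. The sketch below is an illustration only: since the conditional returns $r(n)$ are not given explicitly here, it takes a hypothetical index sequence, derives $r(n)$ from (A29), and then checks that the recursion recovers the same indices. Function names and the array `d`, with `d[y]` standing for $\mu_y + \theta_y$, are assumptions of the example.

```python
def blocking(lam, d, x):
    """Pi^x_x when the station is closed from head count x onwards;
    d[y] = mu_y + theta_y (index 0 unused)."""
    weights = [1.0]
    for y in range(1, x + 1):
        weights.append(weights[-1] * lam / d[y])
    return weights[-1] / sum(weights)


def a_bar(lam, d, x, n):
    """Matrix entry from (A25), valid for 0 <= x <= n."""
    return 1.0 - lam * (1.0 - blocking(lam, d, x)) / d[n + 1]


def returns_from_indices(lam, d, W, D):
    """Conditional returns r(n) implied by indices W via (A29)."""
    return [W[0] - D - sum(a_bar(lam, d, x, n) * (W[x - 1] - W[x])
                           for x in range(1, n + 1))
            for n in range(len(W))]


def indices_from_returns(lam, d, r, D):
    """Adaptive greedy recursion (A26)-(A28): returns r(n) -> indices W(n)."""
    y = []
    for n in range(len(r)):
        # For n = 0 the sum is empty, which reproduces (A26).
        partial = sum(a_bar(lam, d, x, n) * y[x] for x in range(n))
        y.append((r[n] - partial) / a_bar(lam, d, n, n))
    W, running = [], D
    for y_n in y:
        running += y_n           # (A28): W(n) = D + y_0 + ... + y_n
        W.append(running)
    return W
```

Note that $\bar{A}(0, n) = 1$ because $\Pi^0_0 = 1$, which is why the leading term in (A29) appears without a matrix coefficient. The recursion divides by $\bar{A}(n, n)$, so the sketch assumes this entry is nonzero, which holds for instance when `d` is increasing.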

A4 Notes on Example 11

For definiteness, suppose that we have a two station instance of Example 2. Extend the index notation to $W_m(n, \theta)$, $m = 1, 2$, to express dependence upon the common loss rate $\theta$. Modify other notation similarly. Assume that $W_1(0, \theta) > W_2(0, \theta)$ and fix $\epsilon > 0$. We define $n_1(\theta)$, $n_2(\theta, \epsilon)$ and $m_1(\theta, \epsilon)$ as follows:
\[
n_1(\theta) = \max\{n;\ W_1(n, \theta) > W_2(0, \theta)\}; \tag{A34}
\]
\[
n_2(\theta, \epsilon) = \max\{n;\ W_2(n, \theta) > W_2(0, \theta) - \epsilon\}; \tag{A35}
\]
and
\[
m_1(\theta, \epsilon) = \max[n;\ W_1(n, \theta) > W_2\{n_2(\theta, \epsilon), \theta\}]. \tag{A36}
\]
Consider the admission control/routing problem
\[
\inf_v \alpha(n, \theta, v) \tag{A37}
\]
when $n \le n_1(\theta) + 1$. The nature of the matrix $A$ together with the characterisation of $n_1(\theta)$ in (A34) implies that (A37) considers a cost minimising problem in which the choice is between routing to station 1 and discarding at a cost of 1 unit. It follows easily that
\[
\alpha(n, \theta, \text{index}) = \inf_v \alpha(n, \theta, v), \quad n \le n_1(\theta) + 1. \tag{A38}
\]
We further have from (A35) and (A36) that
\[
W\{n_1(\theta) + 2, \theta\} - W\{n_1(\theta) + m_1(\theta, \epsilon) + n_2(\theta, \epsilon) + 2, \theta\} < \epsilon. \tag{A39}
\]
Utilising the fact that $\alpha(n, \theta, u)$ is non-negative and decreasing in $n$ for all $u$ we infer that
\[
\sum_{n = n_1(\theta) + m_1(\theta, \epsilon) + n_2(\theta, \epsilon) + 3}^{\infty} \{W(n - 1, \theta) - W(n, \theta)\}\{\alpha(n, \theta, \text{index}) - \inf_v \alpha(n, \theta, v)\} \le W_2(0, \theta)\, P_{\text{index}}\{N_1 \ge m_1(\theta, \epsilon),\ N_2 \ge n_2(\theta, \epsilon)\}, \tag{A40}
\]
where $P_{\text{index}}$ denotes a steady state probability under implementation of the index policy. We also infer, from the form of the indices for Example 2 in (28), that $n_1(\theta)$, $m_1(\theta, \epsilon)$ and $n_2(\theta, \epsilon)$ all diverge to infinity as $\theta \to 0$. It is straightforward to infer from the fact that $\lambda < s_1\mu_1 + s_2\mu_2$ that the quantity on the r.h.s. of (A40) goes to zero as $\theta \to 0$. It now follows from Theorem 4 and (A38)-(A40) that
\[
R^{\text{opt}}(\theta) - R^{\text{index}}(\theta) \le \lambda\epsilon + \lambda W_2(0, \theta)\, P_{\text{index}}\{N_1 \ge m_1(\theta, \epsilon),\ N_2 \ge n_2(\theta, \epsilon)\} \to \lambda\epsilon, \quad \theta \to 0, \tag{A41}
\]
for any fixed $\epsilon > 0$. We infer that $R^{\text{opt}}(\theta) - R^{\text{index}}(\theta) \to 0$, $\theta \to 0$, as required.

A5 Supplement to the proof of Theorem 7

Recall from the paragraph following (50) that, if $\lambda \to \infty$, index policy $u_W(\lambda)$ will admit a customer to the system if and only if the head count at some station $m$ is no greater than $X_m$. It will follow that in the limit $\lambda \to \infty$, the proportion of time for which the head count at station $m$ is $X_m + 1$ under $u_W(\lambda)$ is one. We argue as follows: Suppose that $\lambda \to \infty$, fix $m$ and consider the limiting distribution of the head count process at station $m$. We bound this stochastically below by the limiting distribution of a process $\{Y_m(t), t \in \mathbb{R}^+\}$ as follows: The $Y_m$-process takes values in $\{0, 1, \ldots, X_m + 1\}$. If $Y_m(t) = n \in \{1, \ldots, X_m + 1\}$ then a transition to $n - 1$ occurs at Markov rate $\mu_{m,n} + \theta_{m,n}$ exactly as in station $m$'s head count process. However, an upward transition to $n + 1$ in the $Y_m$-process can only occur at the conclusion of a period during which $\sum_{n \ne m}^{M} X_n$ consecutive arrivals (rate $\lambda$) occur uninterrupted by any downward transitions. Should such a downward transition occur, the quest for consecutive arrivals starts afresh. No upward transitions from state $X_m + 1$ are possible. Note that, from any state, the downward transition rates agree between the head count process under $u_W(\lambda)$ and the $Y_m$-process while upward transitions occur more readily in the former.

It is straightforward to obtain the limiting distribution of $\{Y_m(t), t \in \mathbb{R}^+\}$ explicitly. Denote it by $\{\Pi_m(n, \lambda), 0 \le n \le X_m + 1\}$, say. Direct calculations serve to show that
\[
\Pi_m(n + 1, \lambda)\{\Pi_m(n, \lambda)\}^{-1} = (\mu_{m,n} + \theta_{m,n})(\mu_{m,n+1} + \theta_{m,n+1})^{-1} \left[ 1 - \{\lambda(\lambda + \mu_{m,n} + \theta_{m,n})^{-1}\}^{X} \right]^{-1}, \quad 0 \le n \le X_m, \tag{A42}
\]
where $X = \sum_{n \ne m}^{M} X_n$ in (A42). It follows easily from (A42) that
\[
\lim_{\lambda \to \infty} \Pi_m(X_m + 1, \lambda) = 1,
\]
and we conclude that in the limit as $\lambda \to \infty$ the proportion of time for which the head count at station $m$ under $u_W(\lambda)$ is $X_m + 1$ is one.
