Dynamic Admission and Service Rate Control of a Queue


Kranthi Mitra Adusumilli and John J. Hasenbein¹
Graduate Program in Operations Research and Industrial Engineering
Department of Mechanical Engineering
University of Texas at Austin, Austin, Texas
{akranthi@gmail.com, jhas@mail.utexas.edu}

Abstract

This paper investigates a queueing system in which the controller can perform admission and service rate control. In particular, we examine a single-server queueing system with Poisson arrivals and exponentially distributed services with adjustable rates. At each decision epoch the controller may adjust the service rate. Also, the controller can reject incoming customers as they arrive. The objective is to minimize long-run average costs, which include: a holding cost, which is a non-decreasing function of the number of jobs in the system; a service rate cost c(x), representing the cost per unit time for servicing jobs at rate x; and a rejection cost κ for rejecting a single job. From basic principles, we derive a simple, efficient algorithm for computing the optimal policy. Our algorithm also provides an easily computable bound on the optimality gap at every step. Finally, we demonstrate that, in the class of stationary policies, deterministic stationary policies are optimal for this problem.

April 13,

1 Introduction

This paper investigates a joint admission and service rate control problem for a single-server queue with Poisson arrivals and exponential service times. The controller is allowed to choose a service rate in each state and also whether to outsource (or reject) arriving jobs in each state. Thus the model allows service rate control in addition to a form of admission control. There is a cost function associated with the available service rates, in addition to a rejection cost and a general holding cost. The controller's objective is to minimize the long-run average cost.
There is a rich literature associated with this model and similar models, but the primary motivation for this work derives from an elegant analysis of the same problem, without admission control, performed in George and Harrison [4]. That paper specifically left open the problem of extending the model to incorporate admission control. Furthermore, [4] assumes that the optimal control is in the class of stationary deterministic control policies. In this paper, we extend their model to allow admission control (which also removes a technical

¹ Research supported in part by National Science Foundation grant DMI

condition on the service rate cost function) and we prove that the optimal control policy is deterministic, within the class of stationary policies. The problem of controlling a queue with service rate or admission decisions has been addressed by many authors in various forms. Apart from [4], the most closely related paper is one by Stidham and Weber [10], who provided a uniform method for proving monotonicity of optimal rates in a variety of one-station queueing problems. There are two primary differences between their model and ours. First, they did not explicitly consider the case of admission control, in the sense that customers can be rejected. Although they allow arrival rate control, this is not necessarily equivalent to the problem with admission control. Second, they require the action space to be compact, an assumption not imposed in our model. Koole [7] also provided a framework, called event-based dynamic programming, to prove monotonicity results in queueing control problems of the type analyzed here. Koole's framework applies most directly to finite-horizon discounted cost problems, and it is possible that those results could be used to prove monotonicity results for the finite-horizon version of our problem. However, extending the technique to particular infinite-horizon average cost problems requires verifying technical side conditions. Furthermore, Koole's technique seems to rely on a uniformization, which would not apply in our model. The books of Sennott [9] and Puterman [8] also contain discussion and background on admission and rate control problems in queueing models. To our knowledge, the combined problem studied in this paper has not been analyzed previously. In particular, we allow the set of available service rates to be non-compact, which is different from the action space assumption in many standard Markov decision process (MDP) models, such as those in [2, 8, 9].
For example, in [5], Guo and Hernández-Lerma consider the overall action space to be unbounded; however, for each state, the action space is bounded. In our model, as in [4], the action space is allowed to be unbounded in each state. Ata and Shneorson [1], which extends the model of [4] to include capacity constraints, also consider an unbounded action space. There are three main contributions in this paper. First, we construct a simple, efficient iterative algorithm for solving the joint service rate and admission control problem for a single-server queue. Although the system is relatively simple, the model is quite general in that there are minimal assumptions on the service rate and holding cost functions (see the next section for details). Furthermore, each step of the algorithm provides an immediately computable upper bound on the optimality gap. If the optimal policy is to reject customers at a system threshold level n, then the algorithm terminates when the truncation scheme reaches level n. If it is not optimal to apply admission control, then the algorithm does not terminate, but converges to the optimal control policy as the truncation level goes to infinity. Hence, in this case, the bound on the optimality gap is useful when applying a stopping criterion. Although our approach is similar in spirit to the algorithm in [4], the iterative approximation scheme is different. In particular, our algorithm truncates the state space, whereas the algorithm in [4] truncates the holding costs. The second contribution comes as a byproduct of the computational development, in which we also prove that the optimal service rates are monotone in the system state. The third contribution is to prove that deterministic policies

are optimal within the class of stationary policies. It remains to show that one can restrict the search for optimal policies to the class of stationary policies. Stidham and Weber [10] are able to show that this is the case when the action space is compact, and their analysis relies crucially on this assumption. Hernández-Lerma and Lasserre [6] provide more general results on the optimality of stationary policies for MDP models with unbounded action spaces. However, that book deals only with discrete-time Markov decision processes. Because our action space is unbounded, so are the transition rates, and thus uniformization cannot be invoked to convert our problem into an equivalent discrete-time problem. The rest of this paper is organized as follows. In Section 2 we introduce the control model. Section 3 presents the optimality equations and an associated verification theorem. Also, modified optimality equations are provided for a system with a fixed rejection threshold. The computational algorithm is presented in Section 4, and numerical examples are given in Section 5. Finally, in Section 6 we prove that the optimal control policy is stationary.

2 The Control Model

Our model consists of a single server with an adjustable service rate. The service time of a customer being served at rate x > 0 is exponentially distributed with mean 1/x. Arrivals occur according to a Poisson process, and without loss of generality we take the rate of this process to be 1. The system manager can change the service rate at any time. Further, an arrival can be denied admission by the system manager. There are costs associated both with providing service at a particular rate x and with denying admission. These costs, along with a holding cost, comprise the total system cost. The objective of the system manager is to minimize the long-run average cost per unit time. For practical applications, the rejection cost can be viewed as a cost to outsource jobs.
It is assumed that all cost functions are known to the system manager. Let the cost per unit time to serve at rate x be c(x), and let h_n be the cost per unit time to hold n customers in the system. In general, there can be a non-negative holding cost h_0 incurred even when no customers are present in the system. The cost incurred for rejecting a customer is κ. One can also view the rejection of jobs as providing instantaneous service. As will be seen below, this view is useful when considering the technical conditions on the cost of service function c(·). In summary, the system manager's decision policy consists of determining the service rates and admission policy at any instant of time, with the objective of minimizing the long-run average cost of operating the system. Under such an objective function, standard arguments show that the decision time points can be restricted to times when arrivals or departures occur. Thus, the control problem can be embedded in the framework of a continuous-time Markov decision process with a countable state space. In Section 6 we show that one need only consider the set of stationary, state-dependent controls. We now discuss the technical assumptions in more detail. We make the following assumptions on the action space and system costs:

(A1) The action space A is a closed subset of [0, ∞) containing 0 and an element greater

than 1.

(A2) The holding cost h_n is non-decreasing in n.

(A3) The cost for service function c(x) is non-decreasing in x.

(A4) c(·) is continuous on (0, ∞) (and right-continuous at 0) with c(0) = 0.

These assumptions are relatively weak and are essentially the minimal assumptions required to avoid pathological cases. They are also identical to the assumptions that appear in the motivating paper of George and Harrison [4], with some exceptions. First, George and Harrison require only left-continuity of c(·). Our computational approach requires the additional assumption of right-continuity. Second, they also require a geometric growth condition on the holding costs (see equation (2) in [4]). This condition was imposed primarily in order to prove convergence of optimal values under their algorithmic method, which involves truncating holding costs. Since our truncation method is different, we do not require this geometric growth condition. Finally, and most importantly, in [4] the following additional condition is imposed on the cost function when the action space A is unbounded:

    lim_{y→∞} ( inf { c(x)/x : x ∈ A, x ≥ y } ) = ∞.

If the limit above is finite, then the control model must take into account a possible additional mode of control: instantaneous service of a customer, for a cost equal to the limit. George and Harrison wished to exclude that mode of control from their analysis. We allow this mode of control, and thus the relaxed assumption below is imposed:

(A5) lim_{y→∞} ( inf { c(x)/x : x ∈ A, x ≥ y } ) ≥ κ ≥ 0.

Recall that κ is the cost to reject (or outsource) a customer. Assumption A5 implies that the cost for service per unit time is greater than or equal to the rejection cost, as the service rate grows without bound. When A5 is not satisfied, the rejection option can be ignored, since serving at some large service rate is always more beneficial than rejecting the customer. A5 is similar in form to the condition for "near monotone" costs discussed in Borkar and Meyn [3].
When this condition holds, they are able to show existence of optimal policies under the so-called risk-sensitive cost criterion. The five assumptions A1–A5 are the only technical assumptions needed for our analysis. Since all interevent times are exponential in our model, the state space can be restricted to simply the current system size. The action space at each decision epoch involves the admission decision and the service rate control. Note that admission control is only relevant when the decision epoch is at a customer arrival time (as opposed to a departure time). We represent the admission control decision by a, where a ∈ {0, 1}: a = 1 when the decision is to admit the new customer and a = 0 when the decision is to reject. The service rate can be changed when the system state changes due to a departure or an arrival, even if the customer is not admitted. We assume that the admission and service rate decisions are made

simultaneously by the controller. Thus, at an arrival time the joint control decision can be represented as (a, x), where the service rate until the next decision epoch is 0 ≤ x < ∞. When the state changes due to a departure, we represent the action as (1, x). Without loss of generality we assume x = 0 when n = 0. For most of this paper, we restrict our analysis to the set of deterministic stationary controls. Under a given stationary control, it is clear that the system size process is a continuous-time Markov chain (CTMC). If the Markov chain is positive recurrent under a policy, we call the policy ergodic.

2.1 Dynamical Equations and Objective Functions

In this section we assume that the system is operated under a stationary ergodic policy. For simplicity of discussion we also assume that the system starts empty at time 0, although the results hold for any initial state. Since the control is stationary, if incoming customers are rejected when the system size is m, then the system size will never exceed m. Hence the set of controls can be divided into two general types: terminating and non-terminating. Under a given terminating policy we let m be the smallest state for which customers are rejected. For such a policy, we can specify the policy by (μ, m), where μ = (μ_1, …, μ_m) is the vector of service rates. When there are n ≤ m customers in the system the server processes jobs at rate μ_n. When there are m customers in the system, all incoming customers are rejected. We refer to such a policy as m-terminating when the threshold level m needs to be emphasized. If a given stationary policy does not reject arrivals in any state, then the policy is non-terminating and the policy can be specified by an infinite vector μ. In this case, we think of m as taking the value infinity. Under an ergodic policy (μ, m), let p_n(μ, m) be the steady-state probability that the system size is n.
Standard CTMC theory then implies that the long-run average cost per unit time under this policy is:

    z(μ, m) := p_m(μ, m) κ + Σ_{n=0}^{m} p_n(μ, m) {c(μ_n) + h_n}.   (1)

For a non-terminating policy, with m = ∞, we take p_m(μ, m) = 0. Hence, in this case the first term on the right-hand side of (1) is zero, and the sum is an infinite sum. The steady-state probabilities satisfy the local balance equations:

    p_n(μ, m) = p_{n−1}(μ, m) μ_n^{−1},   1 ≤ n ≤ m, μ_n > 0.   (2)

It is possible to not serve customers in some states, i.e., to have μ_n = 0 for some n. In this case, the state space and balance equations are modified in a straightforward manner. Note that in order for a non-terminating policy to be ergodic, the number of states with μ_n = 0 has to be finite. Next, define z*(m) := inf z(μ, m), where the infimum is taken over all m-terminating policies. Note that z(0, 0) = z*(0). If customers are rejected in every state, the resulting policy is ergodic with z(0, 0) = h_0 + κ < ∞.
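To make (1) and (2) concrete, the long-run average cost of an m-terminating policy can be evaluated directly from the stationary distribution. The sketch below is purely illustrative and not part of the paper's development: the cost functions c(x) = x², h_n = n, the value κ = 5, and the rate vector are arbitrary choices, and all rates are assumed strictly positive.

```python
# Evaluate z(mu, m) for an m-terminating policy via (1) and (2).
# Illustrative instance (not from the paper): c(x) = x^2, h_n = n, kappa = 5,
# arrival rate normalized to 1, and all mu_n > 0.

def avg_cost(mu, kappa, c, h):
    """mu[k] is the service rate in state k+1; arrivals are rejected in state m = len(mu)."""
    m = len(mu)
    # Local balance (2): p_n = p_{n-1} / mu_n, since the arrival rate is 1.
    q = [1.0]
    for rate in mu:
        q.append(q[-1] / rate)
    total = sum(q)
    p = [w / total for w in q]
    # Equation (1): rejection cost rate plus service and holding cost rates.
    service = sum(p[n] * c(mu[n - 1]) for n in range(1, m + 1))  # c(mu_0) = c(0) = 0
    holding = sum(p[n] * h(n) for n in range(m + 1))
    return p[m] * kappa + service + holding

z = avg_cost(mu=[1.0, 2.0, 3.0], kappa=5.0, c=lambda x: x * x, h=lambda n: float(n))
# z is approximately 2.9375 for this instance
```

Here the q_n are the unnormalized stationary weights obtained by iterating (2) from q_0 = 1; dividing by their sum gives p_n(μ, m).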

Thus, for every set of parameters there exists at least one ergodic policy with a finite average cost. As a result, the infimum above is well-defined and finite. An m-terminating policy with z(μ, m) = z*(m), if it exists, is called m-optimal. The infimum among all ergodic policies is:

    z* = inf_{m≥0} z*(m).

The control problem is to find an ergodic policy (μ, m) which achieves the infimum, and a policy which does so is said to be globally optimal. We use this term to distinguish such policies from policies which are only optimal among all m-terminating policies, for a given m (we sometimes call such policies "m-optimal"). When holding cost rates are bounded, it is possible that h_n ↑ h_∞ with h_∞ ≤ z* < ∞, i.e., the long-run average cost under the "do-nothing" policy is smaller than the achievable long-run average cost under any ergodic policy. As in [4], in this case the MDP is said to be degenerate. When the problem is non-degenerate, we prove existence of the optimal policy in a constructive manner; in particular, we provide an algorithm to compute the policy.

3 Optimality Equations and Verification

In this section, we provide the optimality equations and an associated verification theorem. The overall procedure is similar to that in [4]. However, several modifications are necessary due to the extra mode of control allowed. Standard arguments from the theory of Markov decision processes yield the following equations for the relative cost functions, which we denote by v_n, n ≥ 0:

    v_n = min( inf_{x∈A} { [c(x) + h_n − z + x v_{n−1} + v_{n+1}] / (1 + x) },
               inf_{x∈A} { [c(x) + h_n − z + x v_{n−1} + v_n + κ] / (1 + x) } ),   n ≥ 1,   (3a)

and

    0 = min( inf_{x∈A} { [c(x) + h_n − z − x(v_n − v_{n−1}) + (v_{n+1} − v_n)] / (1 + x) },
             inf_{x∈A} { [c(x) + h_n − z − x(v_n − v_{n−1}) + κ] / (1 + x) } ),   n ≥ 1,   (3b)

with v_1 = v_0 − h_0 + z. Since for a given z the sequence of v_n's is determined only up to an additive constant, one usually works with the relative cost differences y_n := v_n − v_{n−1}, n ≥ 1.
Following the arguments in [4] and [11], using the relative cost differences reduces the optimality equations to the following form:

    z − h_n = min( inf_{x∈A} { c(x) − y_n x + y_{n+1} },  inf_{x∈A} { c(x) − y_n x + κ } ),   n ≥ 1,   (4a)

and

    y_1 = z − h_0.   (4b)

Defining

    φ(y) := sup_{x∈A} { yx − c(x) },   for y ≥ 0,   (5)

simplifies the optimality equations. Specifically, (4a) becomes

    h_n − z = φ(y_n) − min{ y_{n+1}, κ },   n ≥ 1.   (6)

It is worthwhile to note that the optimality equations remain the same even if the holding cost is not non-decreasing. The smallest value of n for which y_{n+1} ≥ κ is said to be the terminating state for the optimality equations. This corresponds to the threshold state m of the associated terminating policy. If y_{n+1} < κ for all n ≥ 1, then the solution pair is said to be non-terminating. Note that if a particular pair z and (y_1, y_2, …) is a solution of the optimality equations (4b) and (6), then y_{n+1} < κ for any non-terminating state n ≥ 1. When y < κ, under A1, A4 and A5, the function φ(·) has finite values which are attained, and the smallest maximizers in the set A exist. Let ψ(y) be the smallest maximizer in the definition of φ(y). Note that assumption A5 implies that ψ(y) is finite for each y ≥ 0. Further, if φ(κ) < ∞, then ψ(κ) is well defined. In the next subsection we confirm the validity of the optimality equations for bounded solutions (sometimes called a "verification theorem"). Also, a modified version of these optimality equations is introduced and verified.

3.1 Verification Theorem

We first state and prove the main verification result.

Theorem 1. Let z < ∞ and (y_1, y_2, …) be a solution to the optimality equations (4b) and (6), with the y_i's being uniformly bounded. Let m* be the corresponding terminating state (if there is no terminating state, then m* = ∞). Then z ≤ z(μ, m) for every ergodic policy (μ, m); that is, z ≤ z*. If the policy (μ*, m*) defined by

    μ*_n = ψ(y_n) for 1 ≤ n < m*;   μ*_n = ψ(y_{m*}) for n ≥ m* (when m* < ∞)

is ergodic, then (μ*, m*) is an optimal policy.

Proof. First, note that since z < ∞ and the y_i are bounded, φ(y_n) < ∞ for all n ≥ 1. Using (5) and (6), one obtains the following relations:

    x y_n − c(x) ≤ φ(y_n) ≤ y_{n+1} + h_n − z,   x ∈ A, n ≥ 1,   (7)

    x y_n − c(x) ≤ φ(y_n) ≤ κ + h_n − z,   x ∈ A, n ≥ 1.   (8)

Any ergodic policy is either a terminating or a non-terminating policy. First, consider an arbitrary terminating policy (μ, m). Setting x = μ_n in (7), and x = μ_m and n = m in (8), we have

    μ_n y_n − c(μ_n) ≤ y_{n+1} + h_n − z   for n ≥ 1,   (9)

    μ_m y_m − c(μ_m) ≤ κ + h_m − z   for m < ∞.   (10)

Multiplying both sides of (9) by p_n(μ, m), multiplying both sides of (10) by p_m(μ, m), substituting μ_n p_n(μ, m) = p_{n−1}(μ, m) from (2), and rearranging terms yields:

    p_n(μ, m)[h_n + c(μ_n) − z] ≥ p_{n−1}(μ, m) y_n − p_n(μ, m) y_{n+1}   for 1 ≤ n < m,   (11a)

    p_m(μ, m)[h_m + c(μ_m) + κ − z] ≥ p_{m−1}(μ, m) y_m.   (11b)

Summing all the equations in (11a) and (11b), and using relation (1), gives

    z(μ, m) − p_0(μ, m) h_0 − z[1 − p_0(μ, m)] ≥ p_0(μ, m) y_1.   (12)

Applying (4b) we conclude z(μ, m) ≥ z, which establishes the result for any terminating policy. For a non-terminating policy (μ, ∞), the derivation is analogous, where now we sum the equations in (11a) over all n ≥ 1. Since the y_i's are bounded, the sums due to the right-hand side of (11a) are finite and we again obtain (12). This establishes z(μ, ∞) ≥ z for any non-terminating policy. Next, given a solution to the optimality equations with m* < ∞, set x = ψ(y_n) in (7) and x = ψ(y_{m*}) in (8). In that case, (9) and (10) hold with equality, implying that (11a) and (11b) also hold with equality. As a result, (12) holds with equality for this policy, i.e., (μ*, m*) is optimal. An analogous argument holds in the non-terminating case.

In the theorem above, even for a terminating policy, we assigned a service rate for states beyond the threshold state. Of course, these states are transient, so the assigned rates are inconsequential in terms of long-run costs. In the theorem below, we consider the optimality equations for a fixed n-terminating policy. These equations will come into play in later sections. As such, we refer to (4b) and the equations in (13) as the n-optimality equations. Furthermore, we call a policy n-optimal if it is optimal for the control problem in which the state space is truncated to be {0, …, n}. The following theorem is stated without proof, as it follows directly from arguments in the proof above.

Theorem 2.
If there exist an n ≥ 1, a sequence (y_1, …, y_n), and a z(n) < ∞ satisfying (4b) and

    h_k − z(n) = φ(y_k) − y_{k+1}   for 1 ≤ k ≤ n,   with y_{n+1} = κ,   (13)

then z(n) = z*(n) and the policy (μ, n) given by

    μ_k = ψ(y_k)   for 1 ≤ k ≤ n   (14)

is n-optimal.

This theorem can be applied when a system with a fixed threshold is considered and there is no option of rejecting customers unless the buffer is full. One such case is discussed in [1]. Note that both theorems in this section hold even when the holding cost h_n is not non-decreasing in n.
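Theorem 2 suggests a direct numerical route to z(n): for a fixed trial value z, equations (4b) and (13) generate y_1, …, y_{n+1} recursively, and y_{n+1} is increasing in z, so the value z(n) at which y_{n+1} = κ can be found by bisection. The sketch below is illustrative only, not the paper's implementation: the finite action grid A, the costs c(x) = x² and h_n = n, and κ = 5 are arbitrary choices (with a finite grid the supremum in (5) is always attained, so the no-solution case does not arise).

```python
# Solve the n-optimality equations (4b) and (13) by bisection on z(n).
# Illustrative instance (not from the paper): finite action grid, c(x) = x^2,
# h_n = n, kappa = 5, arrival rate 1.

A = [i / 10 for i in range(51)]              # action grid, subset of [0, 5]
c = lambda x: x * x
h = lambda n: float(n)
kappa = 5.0

def phi(y):
    """phi(y) = sup_{x in A} {y*x - c(x)}, equation (5)."""
    return max(y * x - c(x) for x in A)

def y_terminal(z, n):
    """Generate y_1, ..., y_{n+1} from y_1 = z - h_0 and the rearrangement of
    (13), y_{k+1} = phi(y_k) - h_k + z; return y_{n+1}."""
    y = z - h(0)
    for k in range(1, n + 1):
        y = phi(y) - h(k) + z
    return y

def solve_z(n, z_hi=100.0, tol=1e-10):
    """y_{n+1}(z) is increasing in z; bisect for the z with y_{n+1} = kappa."""
    lo, hi = h(0), z_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if y_terminal(mid, n) < kappa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z1 = solve_z(1)   # value of the optimal 1-terminating policy for this instance
```

For n = 1 the equation reduces to φ(z) + z − h_1 = κ; with c(x) = x² (so φ(y) ≈ y²/4 on a fine grid) the root is z = −2 + √28 ≈ 3.29, which the grid version approximates.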

4 Policy Computation

4.1 Overview

In Section 3.1 we developed optimality equations and established their validity. In this section, we suggest a computational methodology to obtain an optimal or near-optimal policy. The optimality equations alone do not suggest an efficient methodology for constructing near-optimal policies. However, the structure of the model does suggest an algorithm, and most of this section is devoted to validating the approach. It should be noted that our algorithm differs from the one suggested in [4]. In their model, approximating problems are formed by truncating the holding costs at some buffer level n, i.e., they set h_n = h_{n+1} = ⋯ for some n ≥ 1. Since our model allows customer rejection, it is natural to consider a sequence of approximating problems which truncate the state space instead. This approach works because it can be shown that if there is a local minimum in these approximating problems, then the solution to the corresponding truncated problem yields the optimal policy for the original problem. If no local minimum exists, then it is not optimal to reject customers in any state. In either case, one can use the approximating problems to generate a bound on the optimality gap at any stage, thus allowing the implementation of a stopping criterion. Because we allow the action space to be unbounded, there may be no formal solution to the optimality equations for a particular truncation level n. Such cases arise when the optimal service rate for a state below the truncation level is infinite. In such cases, as the algorithm below indicates, it is then globally optimal to reject at a lower threshold value. Assumption A5 ensures that such a policy is optimal in these cases. The algorithm is as follows:

Initialization. Set n = 1.

Step 1. Solve the n-optimality equations.

Step 2. If a solution to the optimality equations exists and z*(n) ≥ z*(n − 1), then the optimal (n − 1)-terminating policy is globally optimal.
If no solution exists, then the optimal (n − 1)-terminating policy is globally optimal. If neither case applies, then increase n by 1 and go to Step 1.

Consider the sequence of solutions (z(n), (y^n_1, …, y^n_n)), n ≥ 1, satisfying equations (4b) and (13). If a solution pair exists for some n ≥ 1, then Theorem 2 applies, i.e., z(n) = z*(n). In particular, this solution pair corresponds to an optimal policy for a system with a limit of n customers. Starting from n = 1, solution pairs are computed for incremental values of n. As mentioned above, this computation continues until a local minimum is found or until no solution pair exists for some n. It is shown below that if ñ is the first local minimum of the sequence, then z(ñ) = z*. If there is no such local minimum, then lim_{n→∞} z(n) = z*.
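The full procedure can then be sketched as a loop over n that stops at the first local minimum of z(n). This is again an illustrative sketch, not the paper's code: the finite action grid A, the costs c(x) = x², h_n = n, and κ = 4 are arbitrary choices, and with a finite grid the "no solution exists" branch of Step 2 never occurs, so it is omitted.

```python
# Iterative algorithm of Section 4: solve the n-optimality equations for
# n = 1, 2, ... and stop at the first n with z(n) >= z(n-1).
# Illustrative instance (not from the paper): c(x) = x^2, h_n = n, kappa = 4.

A = [i / 10 for i in range(51)]              # finite action grid in [0, 5]
c = lambda x: x * x
h = lambda n: float(n)
kappa = 4.0

def phi(y):
    return max(y * x - c(x) for x in A)      # equation (5)

def psi(y):
    best = phi(y)
    return min(x for x in A if y * x - c(x) == best)   # smallest maximizer

def y_seq(z, n):
    """y_1, ..., y_{n+1} generated from (4b) and (13) for a trial value z."""
    ys = [z - h(0)]
    for k in range(1, n + 1):
        ys.append(phi(ys[-1]) - h(k) + z)
    return ys

def solve_z(n, tol=1e-10):
    lo, hi = h(0), 100.0                     # y_{n+1}(z) is increasing in z
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if y_seq(mid, n)[-1] < kappa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def optimal_policy(max_n=100):
    """Return (z*, service rates) at the first local minimum of z(n)."""
    z_prev = h(0) + kappa                    # z*(0): reject in every state
    for n in range(1, max_n + 1):
        z_n = solve_z(n)
        if z_n >= z_prev:                    # first local minimum found
            if n == 1:
                return z_prev, []            # reject everything
            ys = y_seq(solve_z(n - 1), n - 1)[:-1]   # y_1, ..., y_{n-1}
            return z_prev, [psi(y) for y in ys]
        z_prev = z_n
    return z_prev, None                      # no local minimum up to max_n

z_star, rates = optimal_policy()
```

For this instance the computed sequence is z(1) ≈ 2.90, z(2) ≈ 2.74, z(3) ≈ 2.76, so the first local minimum occurs at n = 2, the returned policy is 2-terminating, and the returned rates are non-decreasing, as Theorem 3 guarantees.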

The sequence of terminating optimal policies can be seen as corresponding to progressive approximations of an optimal policy, in the same way that [4] provides successive approximations via holding cost truncation. We now summarize the development in the remainder of this section. In Section 4.2 we prove optimality of the n-terminating policy corresponding to the first local minimum of the z(·) values described above. Section 4.3 treats the case when there is no local minimum. In that case, the limiting policy is shown to be optimal. In both cases, it is shown that the optimal service rates are monotone in the number of customers in the system.

4.2 Policy Computation: The Local Minimum Case

The sequence of solutions (z(1), (y¹_1)), (z(2), (y²_1, y²_2)), … might have a local minimum or terminate when no solution pair exists for a particular n + 1. When no solution exists, it is shown that z(n) < z*(n + 1) (note that z*(n + 1) exists even if there is no solution z(n + 1)). Hence, in either case the sequence is said to have a local minimum. Here, we show that when z(n) is the first local minimum of the sequence, the n-optimal policy is an optimal policy. We first establish a preliminary lemma.

Lemma 1. Consider a sequence of optimal terminating solutions (z(1), (y¹_1)), (z(2), (y²_1, y²_2)), …. If z(k) > z(n) for 1 ≤ k < n, then

    y^n_{k+1} < κ.   (15)

Proof. Since z(k) > z(n), we have

    h_k − z(n) > h_k − z(k),   and hence   φ(y^n_k) − y^n_{k+1} > φ(y^k_k) − κ,   (16)

where the latter inequality is due to (13). Next, note that given a value z(n), the sequence of y's is constructed from (4b) and (13). This construction directly implies that if z(k) > z(n) then y^k_k > y^n_k. Since φ(·) is a non-decreasing function, we have φ(y^k_k) ≥ φ(y^n_k). Combining this with (16) yields y^n_{k+1} < κ.

For a decreasing sequence of z's, Lemma 1 establishes that the y's are bounded away from κ, i.e., they satisfy the defining property of non-terminating states. The next theorem is the main result of this subsection.

Theorem 3.
If there exists a sequence of optimal terminating solutions with values satisfying

    z(k) > z(n) for 1 ≤ k < n,   and   z*(n + 1) ≥ z(n),   (17)

then the n-optimal policy corresponding to the solution (z(n), (y^n_1, …, y^n_n)) is globally optimal. Furthermore,

    y^n_1 ≤ y^n_2 ≤ ⋯ ≤ y^n_n,   (18)

which implies that the optimal service rates are monotone in the number of jobs in the system.

Proof. By assumption, n ≥ 1. The case where even the 1-optimality equations cannot be satisfied is discussed at the end of this subsection. We first establish (18). For any l such that 1 < l ≤ n, we can apply (13) and the fact that the holding cost rates are non-decreasing to obtain:

    φ(y^l_l) − κ = h_l − z(l) ≥ h_{l−1} − z(l) ≥ φ(y^l_{l−1}) − κ,

where the last inequality is derived by applying both (13) and Lemma 1. Since φ(·) is non-decreasing, the inequality above implies y^l_l ≥ y^l_{l−1}, and another application of (13) yields

    φ(y^l_{l−1}) − h_{l−1} ≥ φ(y^l_{l−2}) − h_{l−2}.   (19)

Since the holding costs are non-decreasing, (19) implies φ(y^l_{l−1}) ≥ φ(y^l_{l−2}), which gives y^l_{l−1} ≥ y^l_{l−2}. We can now recursively apply the last few observations to obtain (18).

We are now prepared to prove the main part of the theorem. By Lemma 1, y^n_{k+1} < κ for 1 ≤ k < n. Thus, the optimality equations in (13) from the terminating case can be written in the equivalent form of the original optimality equations:

    h_k − z(n) = φ(y^n_k) − min{ y^n_{k+1}, κ }   for 1 ≤ k < n.   (20)

Case 1: Suppose that there exists a solution to the set of (n + 1)-optimality equations. As previously noted, if z(n) ≤ z(n + 1), then by construction y^n_k ≤ y^{n+1}_k for all 1 ≤ k ≤ n. Further, we also have h_n − z(n) ≥ h_n − z(n + 1). Applying the n- and (n + 1)-optimality equations, plus the last two observations, gives

    φ(y^n_n) − κ ≥ φ(y^{n+1}_n) − y^{n+1}_{n+1} ≥ φ(y^n_n) − y^{n+1}_{n+1},   so that   y^{n+1}_{n+1} ≥ κ.   (21)

Define δ := z(n + 1) − z(n) and, for k ≥ n + 1, set g := h_{n+1} − δ. Using these definitions, (13) and (21) yields:

    g − z(n) = φ(y^{n+1}_{n+1}) − min{ y^{n+1}_{n+1}, κ }.   (22)

Let us modify the holding cost vector by replacing h_i, i > n, with g, i.e., the new rates are (h_0, h_1, …, h_{n−1}, h_n, g, g, …). Since δ ≥ 0, the new holding cost rates for states larger than n are less than or equal to the original rates. Then, for this cost rate vector, the n-optimal policy corresponding to the solution (z(n), (y^n_1, …, y^n_n)) is globally optimal, using (4b), (20) and (22) and applying Theorem 1.
Since it is optimal to reject customers in state n with these modified holding

costs, it must also be optimal to reject them when all holding costs beyond state n are larger, as they are in the original holding cost vector. Hence, the n-optimal policy is also globally optimal under the original holding cost vector.

Case 2: Suppose that a solution to the set of (n + 1)-optimality equations does not exist. In state n + 1, replace the holding cost h_{n+1} with the modified cost h̄_{n+1} := z(n) + φ(κ) − κ. We claim that h_{n+1} − h̄_{n+1} ≥ 0. To see this, first note that for the system with the modified cost, (z(n), (y^n_1, …, y^n_n, κ)) satisfies the (n + 1)-optimality equations. Now suppose that h_{n+1} − h̄_{n+1} < 0. Let ẑ(n + 1) be the cost of implementing the policy corresponding to (y^n_1, …, y^n_n, κ) with the original holding cost vector. In that case, we must have ẑ(n + 1) < z(n), since the only thing that has been changed is the holding cost in state n + 1, which by assumption is strictly lower in the original system. This last inequality implies z*(n + 1) < z(n), which contradicts the assumption in the theorem statement that z*(n + 1) ≥ z(n). Thus we have established that h_{n+1} − h̄_{n+1} ≥ 0. The remainder of the argument is similar to that of Case 1. In particular, we again modify the holding cost vector to be (h_0, h_1, …, h_{n−1}, h_n, h̄_{n+1}, h̄_{n+1}, …) and argue that the n-optimal policy corresponding to the solution (z(n), (y^n_1, …, y^n_n)) is globally optimal in the modified system. Hence, it must also be globally optimal in the original system.

It is possible that there is not even a solution to the 1-optimality equations, i.e., there is no z and finite y_1 satisfying the equations. In this case, it is straightforward to argue that the policy of rejecting customers in all states (including state 0) is optimal.
4.3 Policy Computation: The Decreasing Sequence Case

If there exists a sequence of terminating optimal solutions satisfying

    z(n) > z(n + 1)   for all n ≥ 0,   (23)

then z(∞) := lim_{n→∞} z(n) is the cost incurred by using the limiting control policy, i.e., the policy (ζ, ∞) where ζ_m = lim_{n→∞} ψ(y^n_m) for all m ≥ 1. From the discussion preceding Section 3.1 it is clear that the limiting service rate ζ_m is finite for all m ≥ 1. Thus, to prove optimality of the limiting control policy we need to show that a limiting policy exists and that z(∞) = z*. The theory in this section is closely related to the truncated holding cost case considered in Section 6 of [4]. So, consider a modified problem with holding costs (h_0, h_1, …, h_{n−1}, h_n, h_n, …). The next lemma shows that there exists an n-terminating solution for this truncated holding cost problem. This result is used in establishing optimality of the limiting control policy.

Lemma 2. Suppose there exists a sequence of $n$-terminating solutions $\{(z(n), (y_1^n, \dots, y_n^n)),\ n \ge 1\}$ with decreasing values, i.e., (23) is satisfied. Then for each $n \ge 1$ there exists a solution vector $(\hat z(n), (\hat y_1^n, \dots, \hat y_n^n))$ which solves the truncated holding cost problem. In particular, $(\hat z(n), (\hat y_1^n, \dots, \hat y_n^n))$ satisfies:

(i) (4b) and
$$\hat y_k^n = \varphi(\hat y_{k-1}^n) - h_{k-1} + \hat z(n) \quad \text{for } 2 \le k \le n, \qquad (24)$$
$$\hat y_n^n = \varphi(\hat y_n^n) - h_n + \hat z(n). \qquad (25)$$

(ii) Furthermore, for each $n \ge 1$ the $\hat y^n$'s are non-decreasing:
$$\hat y_1^n \le \hat y_2^n \le \cdots \le \hat y_n^n. \qquad (26)$$

Proof. Consider an $n$-terminating solution $(z, (y_1, \dots, y_n))$. Such a solution satisfies (4b) and (24), but not necessarily (25). We wish to show that we can find a $\hat z(n)$ which will generate a new sequence of $\hat y^n$'s which also satisfy (25). Note that for any fixed $z$, if a sequence of $y_n$'s exists it is unique. Hence, below we can think of $y_n$ as a function of $z$, which we denote by $y_n(z)$. Define
$$\Delta_n(z) := \varphi(y_n(z)) - y_n(z) + z - h_n \qquad (27)$$
and let $S(n)$ represent the statement that there exists a $\hat z(n)$ such that $\Delta_n(\hat z(n)) = 0$.

If for some $n \ge 1$ the original $n$-terminating solution is such that $\Delta_n(z(n)) \le 0$, then this solution satisfies the global optimality equations (4b)–(6), implying that the corresponding policy is globally optimal. This contradicts our assumption that the sequence of $n$-terminating optimal policies has decreasing values. Therefore, $\Delta_n(z(n)) > 0$ for all $n \ge 1$.

For $n = 1$, plugging $z = h_0$ into (27) and using the fact that the holding cost is non-decreasing, we have $\Delta_1(h_0) = \varphi(0) + h_0 - h_1 \le 0$. By construction, it can be seen that $\Delta_1(\cdot)$ is continuous on $(0, \infty)$. Hence, from the observations in the last two displays we conclude that there exists a $\hat z(1)$, $h_0 \le \hat z(1) < z(1)$, with $\Delta_1(\hat z(1)) = 0$, i.e., $S(1)$ holds.

Now assume $S(m)$ holds for some $m \ge 1$. Let $\hat z(m)$ be the function argument for which $S(m)$ holds, i.e., suppose $\Delta_m(\hat z(m)) = 0$. Next, by definition,
$$\Delta_{m+1}(\hat z(m)) = \varphi(y_{m+1}(\hat z(m))) - y_{m+1}(\hat z(m)) + \hat z(m) - h_{m+1}. \qquad (28)$$
Using (28), the identity $y_{m+1}(\hat z(m)) = y_m(\hat z(m))$, and the fact that $S(m)$ holds, we have
$$\Delta_{m+1}(\hat z(m)) = h_m - h_{m+1},$$

which implies that $\Delta_{m+1}(\hat z(m))$ is non-positive. From the observations above, recall that $\Delta_{m+1}(z(m+1)) > 0$. Again using the continuity of $\Delta_{m+1}(\cdot)$, we conclude that there exists a $\hat z(m+1)$,
$$\hat z(m) \le \hat z(m+1) < z(m+1), \qquad (29)$$
such that $\Delta_{m+1}(\hat z(m+1)) = 0$. So $S(m+1)$ holds and, by induction, $S(n)$ holds for all $n \ge 1$, thus establishing (i). Result (ii) follows from an argument analogous to that in the proof of Theorem 3.

For completeness, we now state a lemma similar to Proposition 4 in [4]. The lemma establishes that the solutions of Lemma 2 correspond to optimal policies, at least when the control problem is nondegenerate.

Lemma 3. For a fixed $n \ge 1$, consider the control problem with modified holding cost vector $(h_0, \dots, h_{n-1}, h_n, h_n, \dots)$. Let $\hat z(n)$ be the optimal objective value for the modified problem, and let $(\hat\mu_1^n, \hat\mu_2^n, \dots)$ be the corresponding optimal service rates. The following hold:

(i) If $\hat z(n) \ge h_n$, then the modified control problem is degenerate.

(ii) If $\hat z(n) < h_n$, then $(\eta(n), \infty)$, where $\eta(n) = \{\hat\mu_i^n\}$, is an optimal ergodic policy.

Proof. For clarity, note that the elements of $\eta(n)$ are dictated by $\hat z(n)$, and $\hat\mu_k^n = \hat\mu_n^n$ for $k \ge n$. The result follows from Proposition 4 in [4] and observing that $(\hat z(n), (\hat y_1^n, \dots, \hat y_n^n))$ satisfies (4b), (24) and (25).

If the optimal value for a terminating policy is strictly smaller than the previous value, i.e., $z(n) > z(n+1)$, then Lemma 3 guarantees the existence of a non-terminating optimal policy for a system with holding cost truncated at $h_n$. This result holds even if $z(n+1) \le z(n+2)$, i.e., the sequence of $z(\cdot)$'s need only be decreasing up until stage $n+1$ for the existence of an optimal policy under truncated holding costs.

Recall that the original holding costs are non-decreasing in the number of jobs in the system. Hence, as we increase the truncation level, the sequence of $\hat z(n)$'s must also be non-decreasing. Furthermore, we have $\hat z(n) \le z^*$ for all $n \ge 1$. Hence,
$$\hat z(\infty) := \lim_{n\to\infty} \hat z(n) \le z^*. \qquad (30)$$
Next, note that the $\hat y_i^n(\cdot)$ are bounded and increasing functions of $\hat z$. From these properties, (26), and the construction of non-terminating policies in Lemma 2, we have for $i \ge 1$
$$\hat y_i^\infty := \lim_{n\to\infty} \hat y_i^n = \hat y_i(\hat z(\infty)).$$
Similarly, the pre-limit property (26) of the $\hat y_i^n$ yields
$$\hat y_1^\infty \le \hat y_2^\infty \le \cdots.$$
Since $\psi(\cdot)$ is continuous and non-decreasing, (30) implies the existence of a limiting control rate for each $i \ge 1$:
$$\hat\mu_i^\infty := \lim_{n\to\infty} \psi(\hat y_i^n) = \psi(\hat y_i^\infty). \qquad (31)$$
With these observations in hand, we first consider the degenerate case.

Theorem 4. If $h_n \to h$ as $n \to \infty$ and $\hat z(\infty) \ge h$, then the original control problem is degenerate.

Proof. The result follows from (30) and the definition of degeneracy.

In the appendix we prove the following lemma, from which the main result will follow immediately.

Lemma 4. For every non-terminating ergodic policy $(\eta(n), \infty)$ construct a terminating policy $(\bar\mu^n, n)$, defined by
$$\bar\mu_k^n = \begin{cases} \hat\mu_k^n & \text{if } 1 \le k < n, \\ \hat\mu_{n-1}^n & \text{if } k = n. \end{cases}$$
Let $\bar z(n)$ be the values of these so-constructed policies. Then
$$\lim_{n\to\infty} \bar z(n) = z^*.$$

We are now prepared to present the main result of this section, which provides the justification for the computational method when the optimal objective values are decreasing.

Theorem 5. If the sequence of $n$-terminating optimal policies has decreasing values, i.e., (23) is satisfied, then the corresponding limiting control policy $(\mu(\infty), \infty)$ is an optimal policy. Furthermore, the optimal service rates are non-decreasing with respect to the number of customers in the system.

Proof. First, we wish to establish that the sequence of values of optimal terminating policies converges to $z^*$. In Lemma 4 we establish this for the terminating policies derived from the truncated holding cost problems. Recall that these values are denoted by $\bar z(n)$, $n \ge 1$. It is easy to see that $z(n) \le \bar z(n)$ for all $n$, since $z(n)$ is the optimal value among all terminating policies at truncation level $n$, while $\bar z(n)$ corresponds to the value of some (perhaps suboptimal) terminating policy at the same truncation level. Lemma 4 shows that $\lim_{n\to\infty} \bar z(n) = z^*$, which together with the observation above immediately implies that
$$\lim_{n\to\infty} z(n) = z^*.$$
From here on, our argument is analogous to that in the proof of Proposition 7 in [4]. In particular, $y(\cdot)$ and $\psi(\cdot)$ are continuous functions, implying that the optimal truncated service rates converge to a limiting service rate vector corresponding to $z^*$:
$$\lim_{n\to\infty} \psi(y_k(z(n))) = \mu_k(\infty) = \psi(y_k(z^*)) \quad \text{for all } k \ge 1.$$
These same continuity properties similarly imply that the optimal limiting service rates are monotone in the number of customers in the system.
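The existence argument for $\hat z(n)$ in the proof of Lemma 2 is constructive: $\Delta_n(\cdot)$ is continuous and changes sign, so its root can be located by bisection. The sketch below is a hypothetical instantiation, assuming the quadratic cost $c(x) = x^2$ on $A = [0, \infty)$ (so $\varphi(y) = \max(y,0)^2/4$, the supremum of $xy - x^2$ over $x \ge 0$) and the seed $y_1(z) = \varphi(0) - h_0 + z$ for the recursion (24); both are assumptions about details not fixed in this excerpt.

```python
# Sketch of the root-finding step behind Lemma 2, under assumed instances:
# quadratic service cost c(x) = x**2, so phi(y) = max(y, 0)**2 / 4, and the
# seed y_1(z) = phi(0) - h_0 + z for the recursion (24).

def phi(y):
    """sup over x >= 0 of (x*y - x**2), for the quadratic cost c(x) = x**2."""
    return max(y, 0.0) ** 2 / 4.0

def y_seq(z, h, n):
    """y_1(z), ..., y_n(z) via y_k = phi(y_{k-1}) - h_{k-1} + z."""
    ys = [phi(0.0) - h[0] + z]            # assumed seed from (4b)
    for k in range(2, n + 1):
        ys.append(phi(ys[-1]) - h[k - 1] + z)
    return ys

def delta(z, h, n):
    """Delta_n(z) = phi(y_n(z)) - y_n(z) + z - h_n, as in (27)."""
    yn = y_seq(z, h, n)[-1]
    return phi(yn) - yn + z - h[n]

def solve_zhat(h, n, z_lo, z_hi, tol=1e-10):
    """Bisect for the root, assuming Delta_n(z_lo) <= 0 < Delta_n(z_hi)."""
    assert delta(z_lo, h, n) <= 0.0 < delta(z_hi, h, n)
    while z_hi - z_lo > tol:
        mid = 0.5 * (z_lo + z_hi)
        if delta(mid, h, n) <= 0.0:
            z_lo = mid
        else:
            z_hi = mid
    return 0.5 * (z_lo + z_hi)
```

For example, with $h_0 = 1$, $h_1 = 2$, one has $\Delta_1(z) = (z-1)^2/4 - 1$ for $z \ge 1$, so the bracket $[h_0, 4]$ contains the root $\hat z(1) = 3$.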

A minor difference between our proof and the arguments appearing in [4] should be noted. George and Harrison require left-continuity of $c(\cdot)$ because they truncate holding costs, which means that their sequence of truncated optimal values converges to $z^*$ from the left. Since we truncate the buffer level instead, our sequence of optimal values converges to $z^*$ from the right, requiring right-continuity of $c(\cdot)$ and the related function $\psi(\cdot)$ for the argument above (left-continuity is used in earlier sections).

Though $y_n < \kappa$ for $n \ge 1$, the optimal service rates associated with $z(\infty)$ can grow without bound as the number of customers in the system increases, i.e., the optimal service rates in this case could be unbounded.

5 Numerical Examples

In this section we illustrate the algorithm developed in the last section with a few numerical examples. First, we present a result which provides an upper bound on the optimality gap between the control policy obtained after an iteration of the algorithm and the optimal control policy. George and Harrison [4] also provided a bound on the optimality gap for their algorithm. However, their algorithm evolves in a different manner, since they truncate holding costs rather than the state space.

Theorem 6. If $z(n) > z(n+1)$ for some $n \ge 1$, and $z(k) > z(n)$ for all $k < n$, then
$$z(n) - z^* \le \kappa - y_n^n. \qquad (32)$$

Proof. Since $z(n) > z(n+1)$, the $n$-terminating policy is not globally optimal. It follows that
$$y_n^n < \kappa. \qquad (33)$$
If the above inequality did not hold, then setting $y_m = y_n^n$ for $m > n$ would give a solution pair to the optimality equations when the holding cost vector is modified to be $(h_0, h_1, \dots, h_{n-1}, h_n, h_n, \dots)$, by Theorem 1 and Lemma 1. This would imply $z^* = z(n)$, which contradicts our assumption.

Set $\tilde h_m = h_m - \theta_n$ for $m < n$, where $\theta_n = \kappa - y_n^n$. From (33), $\theta_n$ is positive. For a system with modified holding costs $(\tilde h_0, \tilde h_1, \dots, \tilde h_{n-1}, h_n, h_n, \dots)$, the pair $(z(n) - \theta_n, (y_1^n, y_2^n, \dots, y_{n-1}^n, y_n^n, y_n^n, \dots))$ satisfies the optimality equations. Since the modified holding costs are less than or equal to the original costs in every state, we conclude
$$z(n) - \theta_n \le z^*,$$
which immediately implies the result.

Thus, after each iteration of the algorithm, one has an easily computable bound on the optimality gap, which allows for the implementation of a stopping criterion, whether or not the optimal policy is a terminating policy. When the limiting policy is the optimal policy, $\theta_n \to 0$ as $n \to \infty$. Note that Theorem 6 holds irrespective of the holding cost structure.
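Theorem 6 translates directly into a stopping rule: after solving each truncation level, compute $\theta_n = \kappa - y_n^n$ and stop once it falls below a tolerance, or once $z(n)$ stops decreasing. A sketch, with `solve_level` as a hypothetical stand-in for the Step 1 solver of the algorithm:

```python
# Sketch of the stopping rule suggested by Theorem 6.  `solve_level(n)` is a
# hypothetical stand-in for Step 1 of the algorithm: it must return the
# n-terminating optimal value z(n) together with y_n^n.

def run_until_gap(solve_level, kappa, tol, n_max):
    """Increase the truncation level until Theorem 6's bound
    z(n) - z* <= kappa - y_n^n certifies a gap below tol, or until z(n)
    stops decreasing (a terminating policy is then optimal)."""
    z_prev = float("inf")
    for n in range(1, n_max + 1):
        z_n, y_nn = solve_level(n)
        if z_n >= z_prev:                 # z(n) stopped decreasing:
            return n - 1, z_prev, 0.0     # the (n-1)-terminating policy is optimal
        theta = kappa - y_nn              # Theorem 6: z(n) - z* <= theta
        if theta <= tol:
            return n, z_n, theta
        z_prev = z_n
    return n_max, z_n, kappa - y_nn
```

With a stub solver whose values decrease geometrically, the loop stops as soon as the certified gap clears the tolerance; with a solver whose values stall immediately, it returns the terminating policy at once.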

5.1 Example Control Problems

Figure 1: Optimal buffer size: $c(x) = x^2$

In all of the numerical examples in this section we take $A = [0, \infty)$ and set $h_n = h_0 + s(n - M + 1)^+$, where $s$ and $M$ are parameters that are varied across cases. In all cases, $s$ is strictly positive and $M$ is an integer.

In the first example we consider the quadratic service cost structure used in [4], namely $c(x) = x^2$. Further, we set $h_0 = 10$ and $M = 1$. For this example, the optimal policies are terminating policies. Figure 1 shows the variation in the buffer limit under the optimal policy as the rejection cost $\kappa$ and the multiplicative factor $s$ in the holding cost are varied. It is clear that for fixed $\kappa$, the optimal buffer size increases with decreasing $s$, and this effect is more prominent for smaller values of $s$. An interesting result is the apparent lack of monotonicity in the optimal buffer size with respect to $\kappa$ for a fixed value of $s$. One also observes that this apparent lack of monotonicity is prominent only once the optimal buffer size saturates. As expected, the optimal buffer size is largest when the rejection cost is large and $s$ is small. For small values of the rejection cost, irrespective of $s$, all customers are rejected. This is due primarily to the particular values chosen for the holding cost function.

For the same parameter settings, the optimal values are plotted against $s$ and $\kappa$ in Figure 2. As expected, the optimal value increases with either increased rejection or holding cost. However, note that the optimal values saturate relatively quickly in the rejection cost, and this saturation occurs faster for smaller values of $s$. The optimal buffer size and optimal objective values have opposite trends with respect to $s$, i.e., as the optimal cost decreases the optimal buffer size increases.

The next example is selected such that the cost for service per unit time approaches the rejection cost as the service rate approaches infinity.
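The examples share the holding-cost structure $h_n = h_0 + s(n - M + 1)^+$ stated above. A small helper makes the structure concrete; the value $s = 0.5$ below is illustrative, not one of the paper's plotted cases.

```python
# The holding costs used in the examples: h_n = h_0 + s * (n - M + 1)^+.
# The parameters h0 = 10 and M = 1 match the first example; s = 0.5 is an
# illustrative value chosen here, not one taken from the figures.

def holding_cost(n, h0, s, M):
    """h_n = h0 + s * max(n - M + 1, 0)."""
    return h0 + s * max(n - M + 1, 0)

costs = [holding_cost(n, h0=10.0, s=0.5, M=1) for n in range(5)]
# costs = [10.0, 10.5, 11.0, 11.5, 12.0]
```

Raising $M$ delays the point at which holding costs start to grow: with $M = 3$, states $0$ through $2$ all incur the base cost $h_0$.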
In this case A5 holds with equality. The cost for service is $c(x) = x - x^{1/(1+\varepsilon)}$, where $\varepsilon$ is a positive parameter that is varied in

Figure 2: Optimal cost: $c(x) = x^2$

Figure 3: Optimal buffer size: $c(x) = x - x^{1/(1+\varepsilon)}$

Figure 4: Optimal cost: $c(x) = x - x^{1/(1+\varepsilon)}$

the numerical experiments. In Figures 3 and 4, the optimal buffer limit and optimal cost, respectively, are plotted on the vertical axis against $\varepsilon$ and $1/s$. The other parameters are fixed at $\kappa = 1$, $M = 1$ and $h_0 = 1$. As in the first example, there is an apparent lack of monotonicity of the buffer limit with respect to $\varepsilon$ when the value of $s$ is fixed. Note that both the optimal cost and optimal buffer size are essentially insensitive to the value of $s$. Clearly, as the service cost function decreases, i.e., as $\varepsilon$ increases, the optimal value decreases.

The algorithm presented in this paper is quite efficient, in the following sense. In Step 1 of the algorithm, one need only perform a one-dimensional search in $z$ to satisfy the terminating optimality equations. Generally speaking, the computation of $\varphi(\cdot)$ is not intensive, and thus Step 1 is computationally inexpensive. The number of such steps which must be performed is simply linear in the truncation level. Hence, for most practical problems, the algorithm will terminate, or converge, in less than a second on a standard desktop computer. Although we did not do so, one could also fine-tune the algorithm to re-use the one-dimensional search results, which would increase the algorithmic efficiency even more.

6 Optimality of Deterministic Policies

In George and Harrison's paper an important question regarding the control of their model was left unresolved. In particular, it remains to show that the search for an optimal policy can be restricted to deterministic stationary policies. For a large class of MDPs this question has been settled by previous work. For example, control problems with adjustable arrival and service rates along with additional constraints are discussed in [5, Section 6], and the existence of an optimal stationary policy is established. However, there it is assumed that the service

rates are bounded. When the action space is uncountable and non-compact, there are fewer results. As mentioned earlier, Chapter 5 of Hernández-Lerma and Lasserre [6] provides conditions for the optimality of stationary deterministic policies. The primary requirement there is for the transition probabilities to be setwise continuous. However, the results are given only for discrete-time MDPs.

From basic principles, we now proceed to show that randomized policies need not be considered, assuming that one restricts to stationary policies a priori. So, we now consider an arbitrary stationary policy, which may be randomized. When such a policy is considered, at any point of time the control can be considered to have two components: admission control and service rate control. Service rate control is applied only when a customer is present in the system, and admission control is applied only when a new customer arrives. When control rates are random, the associated decision process is a continuous-time semi-Markov decision process (SMDP). Using the SMDP framework, we directly show that whenever a solution to the original optimality equations exists, and there is a corresponding stationary ergodic policy, then no randomized policy can do better. Hence, it is sufficient to restrict the search to deterministic policies.

We model the randomized decision in each state as follows. After a system event, the service rate $X_n$ is assumed to be a random variable with c.d.f. $F_n(\cdot)$. Clearly, $X_n$ must be restricted to take values in $A$. In this case, a system event is an arrival or departure of a customer. Furthermore, when there is an arrival to a system which is in state $n$, it is assumed that the customer is accepted with probability $q_n$. For clarity, note that we allow the controller to choose a new service rate $X_n$ (drawn from $F_n(\cdot)$) even if an incoming arrival is rejected. Thus the control decision now is to choose probabilities $\{q_n, n \ge 0\}$ and the c.d.f.'s $\{F_n(\cdot), n \ge 1\}$ which minimize the long-run cost. Specifically, when the number of customers in the system is $n \ge 0$, before an arrival or after a departure, the randomized policy is specified by $Q_n \equiv (q_n, F_n(\cdot))$, and the entire control policy is given by the functional vector $Q \equiv \{Q_n, n \ge 0\}$.

Below, the expectation operator is defined with respect to $Q$. So, for example, $E[X_n]$ is the expected service rate under this policy when there are $n$ jobs in the system. Under $Q$, we assume that $E[X_n] < \infty$ for all $n$. A policy with $E[X_n] = \infty$ will have an average cost per unit time equal to the cost when customers are always rejected in state $n$. Under a general stationary policy, the service time in state $n$ follows a distribution which is determined by $Q_n$. Given such a policy, it should be clear that the queue length process is a semi-Markov process. Under an ergodic policy $Q$ with $E[X_n] < \infty$, there exist steady-state probabilities $\nu_n$ which represent the long-run fraction of time the SMDP spends in state $n$. Note that $\nu_n > 0$ for states below the rejection threshold. Using standard ergodic theory for SMDPs, the long-run average cost under any ergodic stationary policy $Q$ is:
$$z(Q) = \sum_{n=0}^{\infty} \nu_n \left[ h_n + (1 - q_n)\kappa \right] + \sum_{n=1}^{\infty} \nu_n E[c(X_n)]. \qquad (34)$$

We now provide the extension to Theorem 1 which establishes the optimality of deterministic stationary policies.

Theorem 7. If there exist a $z < \infty$ and uniformly bounded $(y_1, y_2, \dots)$ satisfying the optimality equations (4b)–(6), and the corresponding policy as specified in Theorem 1 is ergodic, then $z \le z(Q)$ for every stationary policy $Q$.

Proof. Consider a fixed stationary policy $Q$ and a bounded solution $z, (y_1, y_2, \dots)$ to the optimality equations. Let $\mu_n := E[X_n]$ be the mean service rate in state $n$ under $Q$. First, one can derive the following detailed balance equations for the SMDP via the embedded Markov chain (which is a simple random walk with self-transitions):
$$q_n \nu_n = \nu_{n+1} \mu_{n+1} \quad \text{for } n \ge 0. \qquad (35)$$
Next, equations (7) and (8) imply:
$$x y_n - c(x) \le y_{n+1} + h_n - z, \quad x \in A,\ n \ge 1,$$
$$x y_n - c(x) \le \kappa + h_n - z, \quad x \in A,\ n \ge 1.$$
Taking the convex combination of these inequalities induced by $(q_n, 1 - q_n)$ yields:
$$x y_n - c(x) \le (1 - q_n)\kappa + q_n y_{n+1} + h_n - z.$$
This inequality holds for each $n \ge 1$ and $x \in A$; thus it also holds for each realization of $X_n$ taking values in $A$, i.e.,
$$X_n y_n - c(X_n) \le (1 - q_n)\kappa + q_n y_{n+1} + h_n - z \quad \text{w.p. 1},$$
for each $n \ge 1$. Taking expectations gives:
$$\mu_n y_n - E[c(X_n)] \le (1 - q_n)\kappa + q_n y_{n+1} + h_n - z, \quad n \ge 1. \qquad (36)$$
Next, by definition, if $n = 0$ is not a terminating state under the policy corresponding to $z, (y_1, y_2, \dots)$, then $y_1 < \kappa$. If $n = 0$ is a terminating state, then it is straightforward to check that $y_1 = \kappa$. Thus, in either case, $y_1 \le \kappa$. Combining this with (4b) yields $z - h_0 \le \kappa$. Taking a convex combination of (4b) and this last inequality gives
$$z - h_0 \le q_0 y_1 + (1 - q_0)\kappa. \qquad (37)$$
Now, multiplying (36) by $\nu_n$ for each $n \ge 1$, we obtain:
$$\nu_n \mu_n y_n - \nu_n E[c(X_n)] \le \nu_n (1 - q_n)\kappa + \nu_n q_n y_{n+1} + \nu_n [h_n - z].$$
Applying (35) gives
$$q_{n-1} \nu_{n-1} y_n - q_n \nu_n y_{n+1} - \nu_n E[c(X_n)] \le \nu_n (1 - q_n)\kappa + \nu_n [h_n - z], \quad n \ge 1.$$
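To make (34) and (35) concrete, the following sketch evaluates the long-run average cost of a stationary policy on a finite buffer. It assumes an arrival rate normalized to one, deterministic service rates $\mu_n$ (so $E[c(X_n)] = c(\mu_n)$), and admission probabilities with $q_N = 0$ at the buffer limit; these simplifications are assumptions for illustration, not the paper's general randomized setting.

```python
# Sketch: evaluate the long-run average cost (34) of a stationary policy on
# states 0..N.  Assumptions: arrival rate normalized to 1, deterministic
# service rates mu_n (so E[c(X_n)] = c(mu_n)), and q[N] = 0 so the chain is
# confined to a finite buffer.  The nu_n solve the balance equations
# q_n * nu_n = nu_{n+1} * mu_{n+1} of (35), normalized to sum to one.

def average_cost(q, mu, h, kappa, c):
    """q[n], h[n] for n = 0..N; mu[n] for n = 1..N (mu[0] unused); q[N] = 0."""
    N = len(q) - 1
    nu = [1.0]
    for n in range(N):                       # nu_{n+1} = q_n * nu_n / mu_{n+1}
        nu.append(q[n] * nu[n] / mu[n + 1])
    total = sum(nu)
    nu = [v / total for v in nu]             # normalize to a distribution
    # Holding plus rejection cost, then service cost, as in (34):
    cost = sum(nu[n] * (h[n] + (1.0 - q[n]) * kappa) for n in range(N + 1))
    cost += sum(nu[n] * c(mu[n]) for n in range(1, N + 1))
    return cost
```

For a one-slot buffer with $q_0 = 1$, $\mu_1 = 2$, the balance equation gives $\nu = (2/3, 1/3)$, and the cost is the corresponding mixture of holding, rejection, and service costs.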


More information

Sy D. Friedman. August 28, 2001

Sy D. Friedman. August 28, 2001 0 # and Inner Models Sy D. Friedman August 28, 2001 In this paper we examine the cardinal structure of inner models that satisfy GCH but do not contain 0 #. We show, assuming that 0 # exists, that such

More information

6. Martingales. = Zn. Think of Z n+1 as being a gambler s earnings after n+1 games. If the game if fair, then E [ Z n+1 Z n

6. Martingales. = Zn. Think of Z n+1 as being a gambler s earnings after n+1 games. If the game if fair, then E [ Z n+1 Z n 6. Martingales For casino gamblers, a martingale is a betting strategy where (at even odds) the stake doubled each time the player loses. Players follow this strategy because, since they will eventually

More information

1 Consumption and saving under uncertainty

1 Consumption and saving under uncertainty 1 Consumption and saving under uncertainty 1.1 Modelling uncertainty As in the deterministic case, we keep assuming that agents live for two periods. The novelty here is that their earnings in the second

More information

Laws of probabilities in efficient markets

Laws of probabilities in efficient markets Laws of probabilities in efficient markets Vladimir Vovk Department of Computer Science Royal Holloway, University of London Fifth Workshop on Game-Theoretic Probability and Related Topics 15 November

More information

So we turn now to many-to-one matching with money, which is generally seen as a model of firms hiring workers

So we turn now to many-to-one matching with money, which is generally seen as a model of firms hiring workers Econ 805 Advanced Micro Theory I Dan Quint Fall 2009 Lecture 20 November 13 2008 So far, we ve considered matching markets in settings where there is no money you can t necessarily pay someone to marry

More information

Building Infinite Processes from Regular Conditional Probability Distributions

Building Infinite Processes from Regular Conditional Probability Distributions Chapter 3 Building Infinite Processes from Regular Conditional Probability Distributions Section 3.1 introduces the notion of a probability kernel, which is a useful way of systematizing and extending

More information

COMBINATORICS OF REDUCTIONS BETWEEN EQUIVALENCE RELATIONS

COMBINATORICS OF REDUCTIONS BETWEEN EQUIVALENCE RELATIONS COMBINATORICS OF REDUCTIONS BETWEEN EQUIVALENCE RELATIONS DAN HATHAWAY AND SCOTT SCHNEIDER Abstract. We discuss combinatorial conditions for the existence of various types of reductions between equivalence

More information

Information Acquisition under Persuasive Precedent versus Binding Precedent (Preliminary and Incomplete)

Information Acquisition under Persuasive Precedent versus Binding Precedent (Preliminary and Incomplete) Information Acquisition under Persuasive Precedent versus Binding Precedent (Preliminary and Incomplete) Ying Chen Hülya Eraslan March 25, 2016 Abstract We analyze a dynamic model of judicial decision

More information

Final exam solutions

Final exam solutions EE365 Stochastic Control / MS&E251 Stochastic Decision Models Profs. S. Lall, S. Boyd June 5 6 or June 6 7, 2013 Final exam solutions This is a 24 hour take-home final. Please turn it in to one of the

More information

Stochastic Optimal Control

Stochastic Optimal Control Stochastic Optimal Control Lecturer: Eilyan Bitar, Cornell ECE Scribe: Kevin Kircher, Cornell MAE These notes summarize some of the material from ECE 5555 (Stochastic Systems) at Cornell in the fall of

More information

Lecture 17: More on Markov Decision Processes. Reinforcement learning

Lecture 17: More on Markov Decision Processes. Reinforcement learning Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture

More information

Dynamic Programming: An overview. 1 Preliminaries: The basic principle underlying dynamic programming

Dynamic Programming: An overview. 1 Preliminaries: The basic principle underlying dynamic programming Dynamic Programming: An overview These notes summarize some key properties of the Dynamic Programming principle to optimize a function or cost that depends on an interval or stages. This plays a key role

More information

The value of foresight

The value of foresight Philip Ernst Department of Statistics, Rice University Support from NSF-DMS-1811936 (co-pi F. Viens) and ONR-N00014-18-1-2192 gratefully acknowledged. IMA Financial and Economic Applications June 11, 2018

More information

Handout 8: Introduction to Stochastic Dynamic Programming. 2 Examples of Stochastic Dynamic Programming Problems

Handout 8: Introduction to Stochastic Dynamic Programming. 2 Examples of Stochastic Dynamic Programming Problems SEEM 3470: Dynamic Optimization and Applications 2013 14 Second Term Handout 8: Introduction to Stochastic Dynamic Programming Instructor: Shiqian Ma March 10, 2014 Suggested Reading: Chapter 1 of Bertsekas,

More information

Lecture 2: Making Good Sequences of Decisions Given a Model of World. CS234: RL Emma Brunskill Winter 2018

Lecture 2: Making Good Sequences of Decisions Given a Model of World. CS234: RL Emma Brunskill Winter 2018 Lecture 2: Making Good Sequences of Decisions Given a Model of World CS234: RL Emma Brunskill Winter 218 Human in the loop exoskeleton work from Steve Collins lab Class Structure Last Time: Introduction

More information

Pricing Problems under the Markov Chain Choice Model

Pricing Problems under the Markov Chain Choice Model Pricing Problems under the Markov Chain Choice Model James Dong School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853, USA jd748@cornell.edu A. Serdar Simsek

More information

EE266 Homework 5 Solutions

EE266 Homework 5 Solutions EE, Spring 15-1 Professor S. Lall EE Homework 5 Solutions 1. A refined inventory model. In this problem we consider an inventory model that is more refined than the one you ve seen in the lectures. The

More information

PAULI MURTO, ANDREY ZHUKOV

PAULI MURTO, ANDREY ZHUKOV GAME THEORY SOLUTION SET 1 WINTER 018 PAULI MURTO, ANDREY ZHUKOV Introduction For suggested solution to problem 4, last year s suggested solutions by Tsz-Ning Wong were used who I think used suggested

More information

On the Lower Arbitrage Bound of American Contingent Claims

On the Lower Arbitrage Bound of American Contingent Claims On the Lower Arbitrage Bound of American Contingent Claims Beatrice Acciaio Gregor Svindland December 2011 Abstract We prove that in a discrete-time market model the lower arbitrage bound of an American

More information

The ruin probabilities of a multidimensional perturbed risk model

The ruin probabilities of a multidimensional perturbed risk model MATHEMATICAL COMMUNICATIONS 231 Math. Commun. 18(2013, 231 239 The ruin probabilities of a multidimensional perturbed risk model Tatjana Slijepčević-Manger 1, 1 Faculty of Civil Engineering, University

More information

THE OPTIMAL ASSET ALLOCATION PROBLEMFOR AN INVESTOR THROUGH UTILITY MAXIMIZATION

THE OPTIMAL ASSET ALLOCATION PROBLEMFOR AN INVESTOR THROUGH UTILITY MAXIMIZATION THE OPTIMAL ASSET ALLOCATION PROBLEMFOR AN INVESTOR THROUGH UTILITY MAXIMIZATION SILAS A. IHEDIOHA 1, BRIGHT O. OSU 2 1 Department of Mathematics, Plateau State University, Bokkos, P. M. B. 2012, Jos,

More information

LECTURE 2: MULTIPERIOD MODELS AND TREES

LECTURE 2: MULTIPERIOD MODELS AND TREES LECTURE 2: MULTIPERIOD MODELS AND TREES 1. Introduction One-period models, which were the subject of Lecture 1, are of limited usefulness in the pricing and hedging of derivative securities. In real-world

More information

B. Online Appendix. where ɛ may be arbitrarily chosen to satisfy 0 < ɛ < s 1 and s 1 is defined in (B1). This can be rewritten as

B. Online Appendix. where ɛ may be arbitrarily chosen to satisfy 0 < ɛ < s 1 and s 1 is defined in (B1). This can be rewritten as B Online Appendix B1 Constructing examples with nonmonotonic adoption policies Assume c > 0 and the utility function u(w) is increasing and approaches as w approaches 0 Suppose we have a prior distribution

More information

Admissioncontrolwithbatcharrivals

Admissioncontrolwithbatcharrivals Admissioncontrolwithbatcharrivals E. Lerzan Örmeci Department of Industrial Engineering Koç University Sarıyer 34450 İstanbul-Turkey Apostolos Burnetas Department of Operations Weatherhead School of Management

More information

On the Optimality of a Family of Binary Trees Techical Report TR

On the Optimality of a Family of Binary Trees Techical Report TR On the Optimality of a Family of Binary Trees Techical Report TR-011101-1 Dana Vrajitoru and William Knight Indiana University South Bend Department of Computer and Information Sciences Abstract In this

More information

Chapter 9 Dynamic Models of Investment

Chapter 9 Dynamic Models of Investment George Alogoskoufis, Dynamic Macroeconomic Theory, 2015 Chapter 9 Dynamic Models of Investment In this chapter we present the main neoclassical model of investment, under convex adjustment costs. This

More information

based on two joint papers with Sara Biagini Scuola Normale Superiore di Pisa, Università degli Studi di Perugia

based on two joint papers with Sara Biagini Scuola Normale Superiore di Pisa, Università degli Studi di Perugia Marco Frittelli Università degli Studi di Firenze Winter School on Mathematical Finance January 24, 2005 Lunteren. On Utility Maximization in Incomplete Markets. based on two joint papers with Sara Biagini

More information

Online Appendix Optimal Time-Consistent Government Debt Maturity D. Debortoli, R. Nunes, P. Yared. A. Proofs

Online Appendix Optimal Time-Consistent Government Debt Maturity D. Debortoli, R. Nunes, P. Yared. A. Proofs Online Appendi Optimal Time-Consistent Government Debt Maturity D. Debortoli, R. Nunes, P. Yared A. Proofs Proof of Proposition 1 The necessity of these conditions is proved in the tet. To prove sufficiency,

More information

Total Reward Stochastic Games and Sensitive Average Reward Strategies

Total Reward Stochastic Games and Sensitive Average Reward Strategies JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 98, No. 1, pp. 175-196, JULY 1998 Total Reward Stochastic Games and Sensitive Average Reward Strategies F. THUIJSMAN1 AND O, J. VaiEZE2 Communicated

More information

KIER DISCUSSION PAPER SERIES

KIER DISCUSSION PAPER SERIES KIER DISCUSSION PAPER SERIES KYOTO INSTITUTE OF ECONOMIC RESEARCH http://www.kier.kyoto-u.ac.jp/index.html Discussion Paper No. 657 The Buy Price in Auctions with Discrete Type Distributions Yusuke Inami

More information

Comparison of proof techniques in game-theoretic probability and measure-theoretic probability

Comparison of proof techniques in game-theoretic probability and measure-theoretic probability Comparison of proof techniques in game-theoretic probability and measure-theoretic probability Akimichi Takemura, Univ. of Tokyo March 31, 2008 1 Outline: A.Takemura 0. Background and our contributions

More information

CONVERGENCE OF OPTION REWARDS FOR MARKOV TYPE PRICE PROCESSES MODULATED BY STOCHASTIC INDICES

CONVERGENCE OF OPTION REWARDS FOR MARKOV TYPE PRICE PROCESSES MODULATED BY STOCHASTIC INDICES CONVERGENCE OF OPTION REWARDS FOR MARKOV TYPE PRICE PROCESSES MODULATED BY STOCHASTIC INDICES D. S. SILVESTROV, H. JÖNSSON, AND F. STENBERG Abstract. A general price process represented by a two-component

More information

Pricing Dynamic Solvency Insurance and Investment Fund Protection

Pricing Dynamic Solvency Insurance and Investment Fund Protection Pricing Dynamic Solvency Insurance and Investment Fund Protection Hans U. Gerber and Gérard Pafumi Switzerland Abstract In the first part of the paper the surplus of a company is modelled by a Wiener process.

More information

Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits

Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits Jacob Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca,

More information

INTERIM CORRELATED RATIONALIZABILITY IN INFINITE GAMES

INTERIM CORRELATED RATIONALIZABILITY IN INFINITE GAMES INTERIM CORRELATED RATIONALIZABILITY IN INFINITE GAMES JONATHAN WEINSTEIN AND MUHAMET YILDIZ A. We show that, under the usual continuity and compactness assumptions, interim correlated rationalizability

More information

Calibration Estimation under Non-response and Missing Values in Auxiliary Information

Calibration Estimation under Non-response and Missing Values in Auxiliary Information WORKING PAPER 2/2015 Calibration Estimation under Non-response and Missing Values in Auxiliary Information Thomas Laitila and Lisha Wang Statistics ISSN 1403-0586 http://www.oru.se/institutioner/handelshogskolan-vid-orebro-universitet/forskning/publikationer/working-papers/

More information

Computational Independence

Computational Independence Computational Independence Björn Fay mail@bfay.de December 20, 2014 Abstract We will introduce different notions of independence, especially computational independence (or more precise independence by

More information

IEOR E4004: Introduction to OR: Deterministic Models

IEOR E4004: Introduction to OR: Deterministic Models IEOR E4004: Introduction to OR: Deterministic Models 1 Dynamic Programming Following is a summary of the problems we discussed in class. (We do not include the discussion on the container problem or the

More information

Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing

Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing Ross Baldick Copyright c 2018 Ross Baldick www.ece.utexas.edu/ baldick/classes/394v/ee394v.html Title Page 1 of 160

More information

Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints

Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints David Laibson 9/11/2014 Outline: 1. Precautionary savings motives 2. Liquidity constraints 3. Application: Numerical solution

More information

Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey

Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey By Klaus D Schmidt Lehrstuhl für Versicherungsmathematik Technische Universität Dresden Abstract The present paper provides

More information

Information aggregation for timing decision making.

Information aggregation for timing decision making. MPRA Munich Personal RePEc Archive Information aggregation for timing decision making. Esteban Colla De-Robertis Universidad Panamericana - Campus México, Escuela de Ciencias Económicas y Empresariales

More information

Kutay Cingiz, János Flesch, P. Jean-Jacques Herings, Arkadi Predtetchinski. Doing It Now, Later, or Never RM/15/022

Kutay Cingiz, János Flesch, P. Jean-Jacques Herings, Arkadi Predtetchinski. Doing It Now, Later, or Never RM/15/022 Kutay Cingiz, János Flesch, P Jean-Jacques Herings, Arkadi Predtetchinski Doing It Now, Later, or Never RM/15/ Doing It Now, Later, or Never Kutay Cingiz János Flesch P Jean-Jacques Herings Arkadi Predtetchinski

More information

AM 121: Intro to Optimization Models and Methods

AM 121: Intro to Optimization Models and Methods AM 121: Intro to Optimization Models and Methods Lecture 18: Markov Decision Processes Yiling Chen and David Parkes Lesson Plan Markov decision processes Policies and Value functions Solving: average reward,

More information

MAT25 LECTURE 10 NOTES. = a b. > 0, there exists N N such that if n N, then a n a < ɛ

MAT25 LECTURE 10 NOTES. = a b. > 0, there exists N N such that if n N, then a n a < ɛ MAT5 LECTURE 0 NOTES NATHANIEL GALLUP. Algebraic Limit Theorem Theorem : Algebraic Limit Theorem (Abbott Theorem.3.3) Let (a n ) and ( ) be sequences of real numbers such that lim n a n = a and lim n =

More information

The Stigler-Luckock model with market makers

The Stigler-Luckock model with market makers Prague, January 7th, 2017. Order book Nowadays, demand and supply is often realized by electronic trading systems storing the information in databases. Traders with access to these databases quote their

More information

Online Appendix for Debt Contracts with Partial Commitment by Natalia Kovrijnykh

Online Appendix for Debt Contracts with Partial Commitment by Natalia Kovrijnykh Online Appendix for Debt Contracts with Partial Commitment by Natalia Kovrijnykh Omitted Proofs LEMMA 5: Function ˆV is concave with slope between 1 and 0. PROOF: The fact that ˆV (w) is decreasing in

More information

Fundamental Theorems of Asset Pricing. 3.1 Arbitrage and risk neutral probability measures

Fundamental Theorems of Asset Pricing. 3.1 Arbitrage and risk neutral probability measures Lecture 3 Fundamental Theorems of Asset Pricing 3.1 Arbitrage and risk neutral probability measures Several important concepts were illustrated in the example in Lecture 2: arbitrage; risk neutral probability

More information

4 Reinforcement Learning Basic Algorithms

4 Reinforcement Learning Basic Algorithms Learning in Complex Systems Spring 2011 Lecture Notes Nahum Shimkin 4 Reinforcement Learning Basic Algorithms 4.1 Introduction RL methods essentially deal with the solution of (optimal) control problems

More information

In Discrete Time a Local Martingale is a Martingale under an Equivalent Probability Measure

In Discrete Time a Local Martingale is a Martingale under an Equivalent Probability Measure In Discrete Time a Local Martingale is a Martingale under an Equivalent Probability Measure Yuri Kabanov 1,2 1 Laboratoire de Mathématiques, Université de Franche-Comté, 16 Route de Gray, 253 Besançon,

More information

E-companion to Coordinating Inventory Control and Pricing Strategies for Perishable Products

E-companion to Coordinating Inventory Control and Pricing Strategies for Perishable Products E-companion to Coordinating Inventory Control and Pricing Strategies for Perishable Products Xin Chen International Center of Management Science and Engineering Nanjing University, Nanjing 210093, China,

More information

Sublinear Time Algorithms Oct 19, Lecture 1

Sublinear Time Algorithms Oct 19, Lecture 1 0368.416701 Sublinear Time Algorithms Oct 19, 2009 Lecturer: Ronitt Rubinfeld Lecture 1 Scribe: Daniel Shahaf 1 Sublinear-time algorithms: motivation Twenty years ago, there was practically no investigation

More information

Assets with possibly negative dividends

Assets with possibly negative dividends Assets with possibly negative dividends (Preliminary and incomplete. Comments welcome.) Ngoc-Sang PHAM Montpellier Business School March 12, 2017 Abstract The paper introduces assets whose dividends can

More information

Macroeconomics and finance

Macroeconomics and finance Macroeconomics and finance 1 1. Temporary equilibrium and the price level [Lectures 11 and 12] 2. Overlapping generations and learning [Lectures 13 and 14] 2.1 The overlapping generations model 2.2 Expectations

More information