Policy Improvement for Repeated Zero-Sum Games with Asymmetric Information

Malachi Jones and Jeff S. Shamma

Abstract

In a repeated zero-sum game, two players repeatedly play the same zero-sum game over several stages. We assume that while both players can observe the actions of the other, only one player knows the actual game, which was randomly selected from a set of possible games according to a known distribution. The dilemma faced by the informed player is how to trade off the short-term reward against the long-term consequence of exploiting information, since exploitation also risks revelation. Classic work by Aumann and Maschler derives the recursive value equation, which quantifies this tradeoff and yields a formula for optimal policies of the informed player. However, using this model for explicit computations can be computationally prohibitive as the number of game stages increases. In this paper, we derive a suboptimal policy based on the concept of policy improvement. The baseline policy is a non-revealing policy, i.e., one that completely ignores superior information. The improved policy, which is implemented in a receding horizon manner, strategizes for the current stage while assuming a non-revealing policy for future stages. We show that the improved policy can be computed by solving a linear program, and the computational complexity of this linear program is constant with respect to the length of the game. We derive bounds on the guaranteed performance of the improved policy and establish that the bounds are tight.

I. INTRODUCTION

We consider an asymmetric zero-sum game, in which one player is informed about the true state of the world. This state determines the specific game that will be played repeatedly. The other player, the uninformed player, has uncertainty about the true state. Since both players can observe the actions of their opponent, the uninformed player can use his observations of the informed player's actions to estimate the true state of the world.
Exploiting/revealing information to achieve a short-term reward risks helping the uninformed player better estimate the true state of the world. A better estimate of the true state can allow the uninformed player to make better decisions that will cost the informed player over the long term. A natural question is how the informed player should exploit his information. Aumann and Maschler [1] introduced a recursive formulation that characterizes the optimal payoff the informed player can achieve, which is referred to as the value of the game. This formulation evaluates the tradeoff between short-term rewards and long-term costs for all possible decisions of the informed player, and the optimal decision is the decision that provides the best overall game payoff.

This research was supported by ARO/MURI Project W9NF. M. Jones and J. S. Shamma are with the School of Electrical and Computer Engineering, College of Engineering, Georgia Institute of Technology, {kye4u, shamma}@gatech.edu

Determining the optimal decision becomes increasingly difficult as the length of the game grows. Therefore, computing an optimal decision for games of non-trivial length can be computationally prohibitive. This difficulty extends even to the simplest zero-sum games, which have two states and two possible actions for each player in a given state. Much of the current work to address the information exploitation issue, which includes the work of Domansky and Kreps [2] and Heuer [3], has been limited to finding optimal strategies for special cases of the simplest zero-sum games. Zamir provided a method to generate optimal strategies under certain conditions for 2x2 and 3x3 matrix games [4]. Gilpin and Sandholm [5] propose an algorithm for computing strategies in games by using a non-convex optimization formulation for only the infinitely repeated case. There is work that considers using suboptimal strategies to address the information exploitation issue for all classes of games [6]. In this work, the informed player never uses his information throughout the game, and accordingly is non-revealing.
While the suboptimal strategies are readily computable, only under special circumstances do these strategies offer strong suboptimal payoffs. In this paper, after introducing basic repeated zero-sum game concepts and definitions, we introduce a suboptimal strategy that we refer to as the one-time policy improvement strategy. We show that the computational complexity of constructing this strategy is constant with respect to the length of the game. Next, we provide tight bounds on the guaranteed performance of the constructed suboptimal policy. We then show that the policy improvement strategy can be computed by solving a linear programming problem. Finally, we present an illustrative simulation.

II. ZERO-SUM DEFINITIONS AND CONCEPTS

In this section, we will introduce basic zero-sum repeated game definitions relevant to this paper. The first and most important concept is Aumann and Maschler's dynamic programming equation for evaluating the tradeoff between short-term and long-term payoff. We will then discuss the notion of non-revealing strategies, which will be exploited in our construction of suboptimal policies to reduce computational complexity.

A. Setup

1) Game Play: Two players repeatedly play a zero-sum matrix game over stages $m = 1, 2, \ldots, N$. The row player is the maximizer, and the column player is the minimizer. The specific game is selected from a finite set of possible games (or states of the world), $K$. Let $\Delta(L)$ denote the set of probability distributions over some finite set $L$. Define $S$ to be the set of pure actions of the row player, and similarly define $J$ to be the set of pure actions of the column player. The game matrix at state $k \in K$ is denoted $M^k \in \mathbb{R}^{|S| \times |J|}$. Before stage $m = 1$, nature selects the specific game according to a probability distribution $p \in \Delta(K)$, which is common knowledge. This selection remains fixed over all stages. The row player is informed of the outcome, whereas the column player is not.

2) Strategies: Mixed strategies are distributions over pure strategies for each player. Since the row player is informed of the state of the world, he is allowed a mixed strategy for each state $k$. Let $x_m^k \in \Delta(S)$ denote the mixed strategy of the row player in state $k$ at stage $m$, and let $x_m^k(s)$ denote the probability that the row player plays pure move $s$ at stage $m$ in state $k$. In repeated play, this strategy can be a function of the actions of both players during stages $1, \ldots, m-1$. Likewise, let $y_m \in \Delta(J)$ denote the mixed strategy of the column player at stage $m$, which again can depend on the players' actions over stages $1, \ldots, m-1$. Let $x_m = \{x_m^1, \ldots, x_m^{|K|}\}$ denote the collection of the row player's mixed strategies for all states at stage $m$, and $x = \{x_1, \ldots, x_m\}$ denote mixed strategies over all states and stages. Likewise, let $y = \{y_1, \ldots, y_m\}$ denote the column player's mixed strategies over all stages. Define $H_m = (S \times J)^m$ to be the set of possible histories, where an element $h_m \in H_m$ is a sequence $(s_1, j_1; s_2, j_2; \ldots; s_m, j_m)$ of the players' moves in the first $m$ stages of the game, and let $h_m^I$ denote the history of player 1's moves. Each player can perfectly observe the moves of the other player.
Therefore, the histories of each player at stage $m$ are identical. Behavioral strategies are mappings from states and histories to mixed strategies. Let $\sigma_m : k \times h_m \to \Delta(S)$ denote a behavioral strategy of the row player, and let $\sigma_m^k[h](s)$ denote the probability that the row player plays pure move $s$ at stage $m$, history $h$, and state $k$. The column player's behavioral strategy can only depend on histories and is denoted by $\tau_m : h_m \to \Delta(J)$. Define $\sigma = \{\sigma_1, \ldots, \sigma_n\}$ to be the collection of behavioral strategies of the row player over all stages. Likewise, define $\tau = \{\tau_1, \ldots, \tau_n\}$ to be the collection of behavioral strategies of the column player over all stages. Aumann established that behavioral strategies can be equivalently represented as mixed strategies [7].

3) Beliefs: Since the column player is not informed of the selected state $k$, he can build beliefs about which state was selected. These beliefs are a function of the initial distribution $p$ and the observed moves of the row player. Therefore, the row player must carefully consider his actions at each stage, as they could potentially reveal the true state of the world to the column player. In order to get a worst-case estimate of how much information the row player transmits through his moves, he models the column player as a Bayesian player and assumes that the column player knows his mixed strategy. The updated belief $p^+$ is computed as

$$p^+(p, x, s) = \left( \frac{p^k x^k(s)}{\bar{x}(p, x, s)} \right)_{k \in K}, \tag{1}$$

where $\bar{x}(p, x, s) := \sum_{k \in K} p^k x^k(s)$ and $x^k(s)$ is the probability of playing pure action $s$ at state $k$.

4) Payoffs: Let $\gamma_m^p(\sigma, \tau) = E_{p,\sigma,\tau}[g_m]$ denote the expected payoff for the pair of behavioral strategies $(\sigma, \tau)$ at stage $m$. The payoff for the $n$-stage game is then defined as

$$\gamma_n^p(\sigma, \tau) = \frac{1}{n} \sum_{m=1}^{n} \gamma_m^p(\sigma, \tau). \tag{2}$$

Similarly, the payoff for the $\lambda$-discounted game is defined as

$$\gamma_\lambda^p(\sigma, \tau) = \sum_{m=1}^{\infty} \lambda (1 - \lambda)^{m-1} \gamma_m^p(\sigma, \tau). \tag{3}$$

B. Short-term vs long-term tradeoff

The dynamic programming recursive formula

$$v_{n+1}(p) = \frac{1}{n+1} \max_x \min_y \left[ \sum_{k \in K} p^k x^k M^k y + n \sum_{s \in S} \bar{x}_s \, v_n\big(p^+(p, x, s)\big) \right] \tag{4}$$

introduced by Aumann and Maschler characterizes the value of the zero-sum repeated game of incomplete information, where $\bar{x}_s$ is shorthand for $\bar{x}(p, x, s)$. Note that $n$ is a non-negative integer, and for the case where $n = 0$ the problem reduces to

$$v_1(p) = \max_x \min_y \sum_{k \in K} p^k x^k M^k y, \tag{5}$$

which is the value of the 1-shot zero-sum incomplete information game. A key interpretation of this formulation is that it also serves as a model of the tradeoff between short-term gains and the long-term informational advantage. For each decision $x$ of the informed player, the model evaluates the payoff for the current stage, which is represented by the expression $\sum_{k} p^k x^k M^k y$, and the long-term cost for decision $x$, which is represented by $\sum_{s \in S} \bar{x}_s \, v_n\big(p^+(p, x, s)\big)$. It is worth pointing out that the computational complexity of finding the optimal decision $x$ can be attributed to the cost of calculating the long-term payoff. Since the long-term payoff is a recursive optimization problem that grows with respect to the game length, it can be difficult to find optimal strategies for games of arbitrary length. This difficulty arises because the number of decision variables in the recursive optimization problem grows exponentially with respect to the game length. Ponssard and Sorin [8] showed that zero-sum repeated games of incomplete information can be formulated as a linear program (LP) to compute optimal strategies. However, in the LP formulation, it can be immediately seen that the computational complexity is also exponential with respect to the game length. One of the goals of this research is to present a method to compute suboptimal policies with tight lower bounds, and a feature of this method is that its computational complexity remains constant for games of any length.

C. Non-revealing strategies

Revealing information is defined as using a different mixed strategy in each state $k$ at stage $m$. From (1), it follows that a mixed strategy $x_m$ at stage $m$ does not change the current beliefs of the column player if $x_m^k = x_m^{k'}$ for all $k, k'$. As a consequence, not revealing information is equivalent to not changing the column player's beliefs about the true state of the world. In stochastic games, it is possible for the column player's beliefs to change even if the row player uses an identical mixed strategy for each state $k$. An optimal non-revealing strategy can be computed by solving

$$u(p) = \max_{x \in NR} \min_y \sum_{k \in K} p^k x^k M^k y, \tag{6}$$

where the set of non-revealing strategies is defined as $NR = \{x_m \mid x_m^k = x_m^{k'} \ \forall k, k' \in K\}$. By playing an optimal non-revealing strategy at each stage of the game, the row player can guarantee a game payoff of $u(p)$. Note that the optimal game payoff for the $n$-stage game, $v_n(p)$, is equal to $u(p)$ only under special conditions.

III. POLICY IMPROVEMENT STRATEGY

We have previously stated in Section II-B that the difficulty in explicitly computing optimal strategies for an arbitrary game grows exponentially with respect to the number of stages in the game. In this paper, we consider suboptimal strategies whose computational complexity remains constant for games of arbitrary length. One such strategy that we introduce is called one-time policy improvement. In incomplete information games, the row player always has the option of not using his superior information. A simple policy for the $n$-stage game would be for the row player to never use his superior information.
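As a minimal sketch of the belief update (1) and of the non-revealing condition from Section II-C, the following Python fragment computes the posterior after an observed move. The function names `posterior` and `is_non_revealing`, the list-based data layout, and the numeric strategies are assumptions made for illustration and do not come from the paper.

```python
def posterior(p, x, s):
    """Bayesian update of equation (1).

    p: prior over states, p[k]
    x: row player's mixed strategies, x[k][s] = probability of move s in state k
    s: index of the observed move
    """
    # x_bar is the total probability of observing move s under (p, x)
    x_bar = sum(p[k] * x[k][s] for k in range(len(p)))
    if x_bar == 0:
        raise ValueError("move s has zero probability under (p, x)")
    return [p[k] * x[k][s] / x_bar for k in range(len(p))]


def is_non_revealing(x, tol=1e-12):
    """True if the same mixed strategy is used in every state."""
    return all(abs(a - b) <= tol for row in x[1:] for a, b in zip(x[0], row))


p = [0.5, 0.5]
same = [[0.5, 0.5], [0.5, 0.5]]       # identical strategy in both states (hypothetical)
split = [[0.75, 0.25], [0.25, 0.75]]  # state-dependent strategy (hypothetical)

print(is_non_revealing(same), posterior(p, same, 0))    # belief stays at the prior
print(is_non_revealing(split), posterior(p, split, 0))  # belief shifts toward state 0
```

With the identical strategy the posterior equals the prior, which is the sense in which the simple policy reveals nothing; with the state-dependent strategy an observed move changes the column player's belief.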
As noted in Section II-C, if he uses such a policy, he can only guarantee a payoff of at most $u(p)$. In a one-time policy improvement strategy, the simple policy becomes the baseline policy. The key difference is that with the one-time policy improvement strategy, the row player is allowed to deviate from the simple policy at the first stage of the game. After the first stage, he is obliged to use the simple policy for the next $n-1$ stages. The guaranteed payoff for the one-time policy improvement strategy for the $n$-stage game is denoted by $\hat{v}_n(p)$. Determining $\hat{v}_n(p)$ can be achieved by solving

$$\hat{v}_n(p) = \max_x \min_y \left[ \frac{1}{n} \sum_{k \in K} p^k x^k M^k y + \frac{n-1}{n} \sum_{s \in S} \bar{x}_s \, u\big(p^+(p, x, s)\big) \right] \tag{7}$$

for $n \geq 1$.

Definition 1: Let $\mathrm{cav}\, u(p)$ denote the point-wise smallest concave function $g$ on $\Delta(K)$ satisfying $g(p) \geq u(p) \ \forall p \in \Delta(K)$.

Definition 2: Denote the one-time policy improvement behavioral strategy by $\hat{\sigma}$, where for each $\hat{\sigma}_m$ with $m \geq 2$, $\hat{\sigma}_m^k[h](s) = \hat{\sigma}_m^{k'}[h](s) \ \forall k, k' \in K$.

Definition 3: A perpetual policy improvement strategy is a strategy that is implemented in a receding horizon manner and strategizes for the current stage while assuming a non-revealing policy for future stages.

Theorem 1: One-time policy improvement guarantees a payoff of at least $\mathrm{cav}\, u(p)$ for the $n$-stage zero-sum repeated games of incomplete information on one side.

We will devote the remainder of this section to the proof of this theorem. The proof will proceed as follows.
1) We will first prove in Proposition 3.5 that for any initial distribution $p$ there exists a one-time policy improvement behavioral strategy whose lower bound is greater than or equal to $\sum_{l \in L} \alpha_l u(p_l)$, where $|L| < \infty$, $\alpha \in \Delta(L)$, $p, p_l \in \Delta(K)$, and $p = \sum_{l \in L} \alpha_l p_l$.
2) Next we note that $\mathrm{cav}\, u(p) = \sum_{e \in E} \alpha_e u(p_e)$, where $\alpha \in \Delta(E)$, $|E| < \infty$, $p, p_e \in \Delta(K)$, and $p = \sum_{e \in E} \alpha_e p_e$.
3) It then follows that the one-time policy improvement behavioral strategy has a lower bound of $\mathrm{cav}\, u(p)$, which is tight.
4) Recall that behavioral strategies can be equivalently represented as mixed strategies.
Therefore a one-time policy improvement behavioral strategy can be equivalently represented as a one-time policy improvement mixed strategy. As a consequence, $\hat{v}_n(p) \geq \mathrm{cav}\, u(p)$.

Corollary 2: Perpetual policy improvement guarantees a payoff of at least $\mathrm{cav}\, u(p)$ for the $n$-stage zero-sum repeated games of incomplete information on one side. This can be shown by using standard dynamic programming arguments regarding policy improvement.

Lemma 3.1: [9] Let $L$ be a finite set and $p = \sum_{l \in L} \alpha_l p_l$ with $\alpha \in \Delta(L)$, and $p, p_l \in \Delta(K)$ for all $l$ in $L$. Then there exists a transition probability $\mu$ from $(K, p)$ to $L$ such that $P(l) = \alpha_l$ and $P(\cdot \mid l) = p_l$, where $P = p \cdot \mu$ is the probability induced by $p$ and $\mu$ on $K \times L$: $P(k, l) = p^k \mu^k(l)$.

Lemma 3.2: Fix $p$ arbitrarily. Let $p$ be represented as $p = \sum_{l \in L} \alpha_l p_l$, where $|L| < \infty$, $\alpha \in \Delta(L)$, and $p, p_l \in \Delta(K)$. Then there exists a strategy $\sigma$, which will be referred to as the splitting strategy, such that

$$\gamma_n^p(\sigma, \tau) \geq \sum_{l \in L} \alpha_l u(p_l) \quad \forall \tau \tag{8}$$

1) Introduce strategy $\sigma$ as follows. Let $\sigma_l$ be the strategy that guarantees $u(p_l)$ for the $n$-stage game. Define $\mu^k(l) = \alpha_l p_l^k / p^k$. If the state is $k$, use the lottery $\mu^k$, and if the outcome is $l$, play $\sigma_l$.
2) To get a lower bound on Player 1's payoff, assume even that Player 2 is informed of $l$. He is then facing strategy $\sigma_l$.
3) By Lemma 3.1, this occurs with probability $\alpha_l$ and the conditional probability on $K$ is $p_l$, hence the game is $\gamma_n^{p_l}$, so that $\gamma_n^p(\sigma, \tau) \geq \sum_{l \in L} \alpha_l u(p_l) \ \forall \tau$.

Lemma 3.3: Consider mixed strategy $x$, where $x = \{x^1, x^2, \ldots, x^{|K|}\}$. There exists a one-time policy improvement behavioral strategy $\hat{\sigma}$ where $\hat{\sigma}_m^k(s) = x^k(s) \ \forall k$.

1) Consider the following behavioral strategy. At stage $m = 1$, Player 1 uses mixed strategy $x^k$, where $k$ is the true state of the world (i.e. $\hat{\sigma}_1 : k \to x^k$). Whatever move $s \in S$ is realized in stage $m = 1$, Player 1 plays $s$ at each stage for the remainder of the game (i.e. $\hat{\sigma}_m : h_m^I \to h_1^I$, where $m \geq 2$ and $h_m^I$ is player 1's history at stage $m$).
2) Clearly $\hat{\sigma}_1^k(s) = x^k(s)$ by definition.
3) Consider stage 2. Suppose the probability of playing move $s$ in state $k$ at stage 1 is $\alpha$. Recall that whatever move the row player realizes in stage 1 is played for the rest of the game. It follows that if the state of the world is state $k$, the probability of playing move $s$ in stage 2 is also $\alpha$.
4) The same argument can be applied to stage $m$.
5) We have now established that the equality $\hat{\sigma}_m^k(s) = x^k(s)$ holds for all $m$, where $\hat{\sigma}$ is the one-time policy improvement behavioral strategy.

Proposition 3.4: Fix $p$ arbitrarily. Let $p$ be represented as $p = \sum_{l \in L} \alpha_l p_l$, where $|L| < \infty$, $\alpha \in \Delta(L)$, and $p, p_l \in \Delta(K)$. Suppose we construct a behavioral strategy $\bar{\sigma}$ as follows. Define $\sigma_l$ to be the optimal behavioral strategy to the non-revealing $n$-stage game $u(p_l)$. Let $\bar{\sigma}^k = \sum_{l \in L} \mu^k(l) \sigma_l^k$ define the mixed behavioral strategy for the $n$-stage game, where $\mu^k(l) = \alpha_l p_l^k / p^k$.
The lower bound for the payoff of behavioral strategy $\bar{\sigma}$ is $\sum_{l \in L} \alpha_l u(p_l)$. Explicitly,

$$\gamma_n^p(\bar{\sigma}, \tau) = \frac{1}{n} \sum_{m=1}^{n} \sum_{k \in K} p^k E_{\bar{\sigma}^k, \tau}\big[g_m^k\big] = \frac{1}{n} \sum_{m=1}^{n} \sum_{k \in K} p^k \sum_{l \in L} \mu^k(l) E_{\sigma_l^k, \tau}\big[g_m^k\big] \geq \sum_{l \in L} \alpha_l u(p_l) \quad \forall \tau \tag{9}$$

where $\tau$ is the behavioral strategy of player 2.

We will first establish that the splitting strategy has the same payoff as the mixed behavioral strategy $\bar{\sigma}$ for arbitrary behavioral strategies $\tau$ of the column player. We then note that the splitting strategy has a lower bound of $\sum_{l \in L} \alpha_l u(p_l)$. We conclude with the following observation: since strategy $\bar{\sigma}$ has the same payoff as the splitting strategy for each strategy $\tau$, it also has the same lower bound $\sum_{l \in L} \alpha_l u(p_l)$.

1) Recall the splitting strategy as defined in Lemma 3.2.
2) The payoff for this strategy can be expressed as follows:

$$\sum_{l \in L} \sum_{k \in K} p^k \mu^k(l) \, \frac{1}{n} \sum_{m=1}^{n} E_{\sigma_l^k, \tau}\big[g_m^k\big] = \sum_{k \in K} p^k \sum_{l \in L} \mu^k(l) \, \frac{1}{n} \sum_{m=1}^{n} E_{\sigma_l^k, \tau}\big[g_m^k\big] = \frac{1}{n} \sum_{m=1}^{n} \sum_{k \in K} p^k \sum_{l \in L} \mu^k(l) E_{\sigma_l^k, \tau}\big[g_m^k\big] \tag{10}$$

3) Observe that the payoff for strategy $\bar{\sigma}$ in (9) is equivalent to the payoff for the splitting strategy $\sigma$ in (10) for arbitrary $\tau$. Explicitly, $\gamma_n^p(\bar{\sigma}, \tau) = \gamma_n^p(\sigma, \tau) \ \forall \tau$.
4) Recall Lemma 3.2, which states that the splitting strategy has a payoff with lower bound $\sum_{l \in L} \alpha_l u(p_l)$.
5) Conclusion: Given a behavioral strategy $\tau$ of player 2, we have established that the payoff of strategy $\bar{\sigma}$ is equivalent to that of the splitting strategy. We showed that the payoff of the splitting strategy is lower bounded by $\sum_{l \in L} \alpha_l u(p_l)$. This implies that strategy $\bar{\sigma}$ also has this lower bound.

Proposition 3.5: There exists a one-time policy improvement strategy $\hat{\sigma}$ such that the following inequality holds:

$$\gamma_n^p(\hat{\sigma}, \tau) \geq \sum_{l \in L} \alpha_l u(p_l)$$

1) Recall the mixed behavioral strategy $\bar{\sigma}$ as defined in Proposition 3.4, where $\bar{\sigma}^k = \sum_{l \in L} \mu^k(l) \sigma_l^k$.
2) Note first that since $\sigma_l$ is an optimal non-revealing strategy, behavioral strategy $\sigma_l^k$ is the same for every state $k$ (i.e. $\sigma_l^k = \sigma_l^{k'} \ \forall k, k'$). Furthermore, the NR mixed strategy $x_l$ is constant for each stage.
3) Therefore $\sigma_l^k = x_l \ \forall k$.
4) Define $\alpha^k = \sum_{l \in L} \mu^k(l) x_l$.
5) We can then express $\bar{\sigma}$ as $\bar{\sigma} : K \times H \to \alpha^k$, which is a stationary strategy.
6) Define $x$ as follows: $x = \{\alpha^1, \alpha^2, \ldots, \alpha^{|K|}\}$.

7) By Lemma 3.3, there exists a one-time policy improvement strategy $\hat{\sigma}$ such that $\hat{\sigma}_m^k(s) = x^k(s) \ \forall s, m$.
8) We have now established that $\gamma_n^p(\hat{\sigma}, \tau) = \gamma_n^p(\bar{\sigma}, \tau) \ \forall \tau$.
9) Therefore it follows that

$$\gamma_n^p(\hat{\sigma}, \tau) = \gamma_n^p(\bar{\sigma}, \tau) \geq \sum_{l \in L} \alpha_l u(p_l) \quad \forall \tau \tag{11}$$

IV. POLICY IMPROVEMENT IN INFINITE HORIZON GAMES

In the previous section, we have shown that the one-time policy improvement strategy guarantees $\mathrm{cav}\, u(p)$ for the $n$-stage game. In this section we will show that the guarantee also holds in infinite horizon games. The guaranteed payoff for the one-time policy improvement strategy in the $\lambda$-discounted infinite horizon games can be computed by solving

$$\hat{v}_\lambda(p) = \max_x \min_y \left\{ \lambda \sum_{k \in K} p^k x^k M^k y + (1 - \lambda) \sum_{s \in S} \bar{x}_s \, u\big(p^+(p, x, s)\big) \right\} \tag{12}$$

for $\lambda \in (0, 1)$.

Theorem 3: One-time policy improvement guarantees a payoff of at least $\mathrm{cav}\, u(p)$ for the $\lambda$-discounted infinite horizon zero-sum repeated games of incomplete information on one side.

We will show the existence of a one-time policy improvement strategy that guarantees at least $\mathrm{cav}\, u(p)$.
1) Fix $\lambda \in (0, 1)$ arbitrarily.
2) There exists $N$ s.t. $\lambda > 1/N$. Consider this $N$.
3) Let $\lambda' = 1/N$, then $\hat{v}_{\lambda'}(p) = \hat{v}_N(p)$.
4) Invoking Theorem 1 yields $\hat{v}_{\lambda'}(p) = \hat{v}_N(p) \geq \mathrm{cav}\, u(p)$.
5) Claim: The optimal one-time policy improvement strategy for the discounted game $\hat{v}_{\lambda'}(p)$ also guarantees at least $\mathrm{cav}\, u(p)$ for $\hat{v}_\lambda(p)$. Since $\hat{v}_{\lambda'}(p) \geq \mathrm{cav}\, u(p)$ and $\sum_{s \in S} \bar{x}_s u\big(p^+(p, x, s)\big) \leq \mathrm{cav}\, u(p)$, it follows that the optimal stage strategy $x$ for $\hat{v}_{\lambda'}(p)$ has the following lower bound: $\sum_{k} p^k x^k M^k y \geq \mathrm{cav}\, u(p)$. Note that $\lambda > \lambda'$. Therefore

$$\lambda \sum_{k} p^k x^k M^k y + (1 - \lambda) \sum_{s \in S} \bar{x}_s u\big(p^+(p, x, s)\big) \geq \lambda' \sum_{k} p^k x^k M^k y + (1 - \lambda') \sum_{s \in S} \bar{x}_s u\big(p^+(p, x, s)\big) \geq \mathrm{cav}\, u(p).$$

Remark: If $\sum_{s \in S} \bar{x}_s u\big(p^+(p, x, s)\big) = \mathrm{cav}\, u(p)$ and $\sum_{k} p^k x^k M^k y = \mathrm{cav}\, u(p)$, then $\hat{v}_\lambda(p) = \hat{v}_{\lambda'}(p) = \mathrm{cav}\, u(p)$.
6) By using the optimal stage strategy $x$ obtained from $\hat{v}_{\lambda'}(p)$ and playing an optimal non-revealing strategy thereafter, we have constructed a one-time policy update strategy for $\hat{v}_\lambda(p)$ that guarantees $\mathrm{cav}\, u(p)$.

Theorem 4: One-time policy improvement is an optimal strategy for infinitely-repeated zero-sum games of incomplete information on one side.
Using an argument similar to that of Theorem 3, one can establish that there exists a one-time policy improvement strategy that guarantees a payoff of at least $\mathrm{cav}\, u(p)$. Note that $\mathrm{cav}\, u(p)$ is the optimal payoff for the infinitely-repeated game.

V. LP FORMULATION

A one-time policy-improvement strategy that guarantees $\mathrm{cav}\, u(p)$ can be computed by solving a linear programming problem, and the computational complexity of the linear program is constant with respect to the number of stages of the game. The following outlines a procedure to construct the appropriate linear program.
1) Let $\Sigma$ denote the set of pure one-time policy-improvement behavioral strategies (i.e. $\sigma_1 : k \to s$, $\sigma_m : h_1^I \to s \ \forall m \geq 2$, and $\sigma_m = \sigma_{m'} \ \forall m, m' \geq 2$).
2) Let $T$ denote the set of pure strategies for the uninformed player (i.e. $\tau_1 = j$, $\tau_m : h_1^I \to j \ \forall m \geq 2$, and $\tau_m = \tau_{m'} \ \forall m, m' \geq 2$).
3) Note that the sizes of the strategy sets $\Sigma$ and $T$ are invariant with respect to the number of stages of the game.
4) Observe that since a one-time policy-improvement strategy is used, the strategy and the corresponding payoff remain constant for $m \geq 2$, so that $\gamma_m^p(\bar{\sigma}, \tau) = \gamma_2^p(\bar{\sigma}, \tau) \ \forall m \geq 2$.
5) Therefore, the game payoff for the strategy pair is $\gamma_\lambda^p(\bar{\sigma}_i, \tau_l) = \lambda \gamma_1^p(\bar{\sigma}_i, \tau_l) + (1 - \lambda) \gamma_2^p(\bar{\sigma}_i, \tau_l)$. For the $n$-stage game, set $\lambda = 1/n$.
6) Consider a matrix $M$, where element $(i, l)$ denotes the game payoff $\gamma_\lambda^p(\bar{\sigma}_i, \tau_l)$ for the strategy pair $(\bar{\sigma}_i, \tau_l)$.
7) Since this is a zero-sum game with finite strategies for each player and a payoff matrix $M$, a classic zero-sum game result can be used to solve this zero-sum game as an LP.

As a consequence, a perpetual policy-improvement strategy that guarantees $\mathrm{cav}\, u(p)$ can be computed by solving a linear programming problem at each stage of the game, and the computational complexity of the linear program is constant with respect to the number of stages of the game. The following outlines a procedure for this construction.
1) Note that the stage 1 behavioral strategy $\sigma_1$ is the one-time policy-improvement strategy that can be computed by solving an LP.
2) Consider stage 2.
Since this is a game of perfect recall, the behavioral strategy $\sigma_1$ can be equivalently represented as a mixed strategy $x_1$.

3) A move of player 1 was realized in stage 1, and since the mixed strategy $x_1$ is known, the posterior probability $p^+$ can be computed.
4) Compute the stage 2 strategy by solving the optimization problem $\hat{v}_\lambda(p^+)$. If it is an $n$-stage game, set $\lambda = 1/N$.
5) The same technique that was used to compute the stage 1 strategy by solving an LP can be used for stage 2.
6) By a similar argument, the stage $m$ strategy can be computed by solving an LP.

VI. SIMULATION: CYBER SECURITY EXAMPLE

The network administrator, the row player, manages two web applications, wapp 1 and wapp 2 (i.e. … and remote login). Each web application is run on its own dedicated server (H 1 and H 2). The attacker would like to prevent users from accessing the web applications via a Denial of Service (DoS) attack. In order to help mitigate DoS attacks, a spare server (H spare) can be dynamically configured daily to also run either wapp 1 or wapp 2. We assume that the attacker can observe which web application the network admin decides to run on the spare server. The attacker chooses which web application to target with a Denial of Service attack.

[Payoff matrices for State α and State β, with rows and columns indexed by wapp 1 and wapp 2; the entries are not recoverable from this transcription.]

In state α, all of the legitimate users on the network are using only wapp 1. Conversely, in state β, they are all using only wapp 2. The payoffs for each of the states are in terms of quality of service. Suppose that $p_\alpha = 0.5$ and $p_\beta = 0.5$, where $p_\alpha$ denotes the initial probability of being in state α, and let $p = (p_\alpha, p_\beta)$. In this example, we will set the number of stages of the game to 2. If the network admin plays his dominant strategy on day 1, he will have fully revealed the state of the network to the attacker on day 2. The dominant strategy for the network admin is to run web application 1 on the backup server in state α and to run web application 2 on the backup server in state β. The network admin will achieve an expected payoff of 0.5 on day 1 and a payoff of 0 on day 2, for a total payoff of 0.25. If the network admin uses a one-time policy improvement strategy, will he do better?
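A minimal sketch of how the one-time policy improvement value in (7) could be evaluated numerically for a two-state, two-action game is given below. It uses a brute-force grid search over the first-stage strategies rather than the LP of Section V, and the payoff matrices in `M` are hypothetical placeholders chosen for illustration, not the matrices of the example above, so its output is not expected to match the figures reported in this section. The helper names `value_2x2`, `u`, and `one_time_improvement` are likewise assumptions.

```python
from itertools import product

# Hypothetical payoff matrices M[k][s][j]: state k, row move s, column move j.
M = [
    [[1.0, 0.0], [0.0, 0.0]],  # state 0
    [[0.0, 0.0], [0.0, 1.0]],  # state 1
]


def value_2x2(D):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = D
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if minimax - maximin < 1e-12:  # saddle point in pure strategies
        return maximin
    return (a * d - b * c) / (a + d - b - c)  # value with mixed strategies


def u(q, M):
    """Non-revealing value of equation (6): value of the q-averaged matrix game."""
    D = [[sum(q[k] * M[k][s][j] for k in range(2)) for j in range(2)]
         for s in range(2)]
    return value_2x2(D)


def one_time_improvement(p, M, n, grid=100):
    """Grid search over first-stage strategies for the objective in (7)."""
    best_val, best_x = float("-inf"), None
    for ia, ib in product(range(grid + 1), repeat=2):
        x = [[ia / grid, 1 - ia / grid], [ib / grid, 1 - ib / grid]]
        # Stage payoff: the column player best-responds with a pure action.
        stage = min(
            sum(p[k] * x[k][s] * M[k][s][j] for k in range(2) for s in range(2))
            for j in range(2)
        )
        # Continuation: non-revealing value at each posterior, weighted by x_bar.
        future = 0.0
        for s in range(2):
            x_bar = sum(p[k] * x[k][s] for k in range(2))
            if x_bar > 0:
                post = [p[k] * x[k][s] / x_bar for k in range(2)]
                future += x_bar * u(post, M)
        val = stage / n + (n - 1) / n * future
        if val > best_val:
            best_val, best_x = val, x
    return best_val, best_x


p = [0.5, 0.5]
baseline = u(p, M)                                  # never use the information
improved, x_best = one_time_improvement(p, M, n=2)  # deviate at stage 1 only
print(baseline, improved, x_best)
```

Because every non-revealing first-stage strategy lies on the grid, the improved value can never fall below the baseline `u(p)` in this sketch. The grid search scales poorly beyond two actions per state, which is one reason an LP formulation is preferable in general.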
Solving numerically, we compute a mixed strategy of $x_\alpha = (0.50, 0.50)$ and $x_\beta = (0.50, 0.50)$, with an expected game payoff of … over the two-day period. Suppose that on day 2 the network admin decides to again improve upon the baseline policy of playing non-revealing. By exploiting his information on day 2, he can obtain an expected stage payoff of 0.50 and a game payoff of …

VII. CONCLUSION

Computing optimal strategies for zero-sum repeated games with asymmetric information can be computationally prohibitive. Much of the current work to address this issue has been limited to special cases. In this work, we present policy improvement methods to compute suboptimal strategies that have tight lower bounds. We show that the policy improvement strategy can be computed by solving a linear program, and the computational complexity of the linear program remains constant with respect to the number of stages in the game. In this paper, we focused on computational results for repeated games, where the state of the world remains fixed once it has been selected by nature. Repeated games are a special case of stochastic games. In stochastic games, the state of the world can change at each stage of the game, and the state transitions can be a function of the actions of the players. We would like to consider computational results for stochastic games and also extend our current results to stochastic games.

REFERENCES

[1] R. J. Aumann and M. Maschler, Repeated Games with Incomplete Information. MIT Press.
[2] V. C. Domansky and V. L. Kreps, Eventually revealing repeated games of incomplete information, International Journal of Game Theory, vol. 23.
[3] M. Heuer, Optimal strategies for the uninformed player, International Journal of Game Theory, vol. 20.
[4] S. Zamir, On the relation between finitely and infinitely repeated games with incomplete information, International Journal of Game Theory, vol. 23.
[5] A. Gilpin and T. Sandholm, Solving two-person zero-sum repeated games of incomplete information, International Joint Conference on Autonomous Agents and Multiagent Systems, vol. 2.
[6] S. Zamir, Repeated games of incomplete information: Zero-sum, Handbook of Game Theory, vol. 1.
[7] R. Aumann, Mixed and behavior strategies in infinite extensive games. Princeton University.
[8] J. Ponssard and S. Sorin, The L-P formulation of finite zero-sum games with incomplete information, International Journal of Game Theory, vol. 9.
[9] S. Sorin, A First Course on Zero-Sum Repeated Games. Springer.
[10] D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific Journal of Mathematics, vol. 6, no. 1, pp. 1-8, 1956.
[11] Y. Freund and R. E. Schapire, Game theory, on-line prediction and boosting, in Proceedings of the ninth annual conference on Computational learning theory, ser. COLT '96. New York, NY, USA: ACM, 1996.
[12] D. Rosenberg, E. Solan, and N. Vieille, Stochastic games with a single controller and incomplete information, Northwestern University, Center for Mathematical Studies in Economics and Management Science, Tech. Rep. 346.
[13] J.-F. Mertens and S. Zamir, The value of two-person zero-sum repeated games with lack of information on both sides, in Institute of Mathematics, The Hebrew University of Jerusalem, 1970.
[14] J.-F. Mertens, The speed of convergence in repeated games with incomplete information on one side, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE), Tech. Rep., Jan. 1995.


Unbiased estimators Estimators 19 Ubiased estimators I Chapter 17 we saw that a dataset ca be modeled as a realizatio of a radom sample from a probability distributio ad that quatities of iterest correspod to features of the model distributio.

More information

point estimator a random variable (like P or X) whose values are used to estimate a population parameter

point estimator a random variable (like P or X) whose values are used to estimate a population parameter Estimatio We have oted that the pollig problem which attempts to estimate the proportio p of Successes i some populatio ad the measuremet problem which attempts to estimate the mea value µ of some quatity

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpeCourseWare http://ocwmitedu 430 Itroductio to Statistical Methods i Ecoomics Sprig 009 For iformatio about citig these materials or our Terms of Use, visit: http://ocwmitedu/terms 430 Itroductio

More information

CAPITAL PROJECT SCREENING AND SELECTION

CAPITAL PROJECT SCREENING AND SELECTION CAPITAL PROJECT SCREEIG AD SELECTIO Before studyig the three measures of ivestmet attractiveess, we will review a simple method that is commoly used to scree capital ivestmets. Oe of the primary cocers

More information

Hopscotch and Explicit difference method for solving Black-Scholes PDE

Hopscotch and Explicit difference method for solving Black-Scholes PDE Mälardale iversity Fiacial Egieerig Program Aalytical Fiace Semiar Report Hopscotch ad Explicit differece method for solvig Blac-Scholes PDE Istructor: Ja Röma Team members: A Gog HaiLog Zhao Hog Cui 0

More information

Summary. Recap. Last Lecture. .1 If you know MLE of θ, can you also know MLE of τ(θ) for any function τ?

Summary. Recap. Last Lecture. .1 If you know MLE of θ, can you also know MLE of τ(θ) for any function τ? Last Lecture Biostatistics 60 - Statistical Iferece Lecture Cramer-Rao Theorem Hyu Mi Kag February 9th, 03 If you kow MLE of, ca you also kow MLE of τ() for ay fuctio τ? What are plausible ways to compare

More information

Section Mathematical Induction and Section Strong Induction and Well-Ordering

Section Mathematical Induction and Section Strong Induction and Well-Ordering Sectio 4.1 - Mathematical Iductio ad Sectio 4. - Strog Iductio ad Well-Orderig A very special rule of iferece! Defiitio: A set S is well ordered if every subset has a least elemet. Note: [0, 1] is ot well

More information

Research Article The Probability That a Measurement Falls within a Range of n Standard Deviations from an Estimate of the Mean

Research Article The Probability That a Measurement Falls within a Range of n Standard Deviations from an Estimate of the Mean Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 70806, 8 pages doi:0.540/0/70806 Research Article The Probability That a Measuremet Falls withi a Rage of Stadard Deviatios

More information

Exam 1 Spring 2015 Statistics for Applications 3/5/2015

Exam 1 Spring 2015 Statistics for Applications 3/5/2015 8.443 Exam Sprig 05 Statistics for Applicatios 3/5/05. Log Normal Distributio: A radom variable X follows a Logormal(θ, σ ) distributio if l(x) follows a Normal(θ, σ ) distributio. For the ormal radom

More information

The Valuation of the Catastrophe Equity Puts with Jump Risks

The Valuation of the Catastrophe Equity Puts with Jump Risks The Valuatio of the Catastrophe Equity Puts with Jump Risks Shih-Kuei Li Natioal Uiversity of Kaohsiug Joit work with Chia-Chie Chag Outlie Catastrophe Isurace Products Literatures ad Motivatios Jump Risk

More information
