Topics on the Border of Economics and Computation November 6, Lecture 2

November 6, 2005. Lecturer: Noam Nisan. Scribe: Ariel Procaccia.

1 Introduction

Last week we discussed the basics of zero-sum games in strategic form. We characterized games as matrices, and associated the utility for the strategy profile $(i, j)$ with entry $a_{ij}$ in the game matrix. We also considered mixed strategies (denoted by $x$ and $y$), and defined the utility for the strategy profile $(x, y)$ as $\sum_{i,j} x_i a_{ij} y_j$ (under the implicit assumption that the players are risk-neutral). We proved that

$$\max_i \min_j a_{ij} \le \min_j \max_i a_{ij}. \tag{1}$$

Moreover, we noted that any pure strategy in the support of a best-response mixed strategy is itself a best response. Finally, we formulated the row player's optimization problem as a mathematical programming problem:

$$\max_x \min_j \sum_i x_i a_{ij} \quad \text{s.t.} \quad \sum_i x_i = 1, \quad \forall i: x_i \ge 0.$$

We observed that this is equivalent to the following linear programming formulation:

$$\max c \quad \text{s.t.} \quad \forall j: \sum_i x_i a_{ij} \ge c, \quad \sum_i x_i = 1, \quad \forall i: x_i \ge 0. \tag{2}$$

Similarly, the column player faces the problem:

$$\min d \quad \text{s.t.} \quad \forall i: \sum_j a_{ij} y_j \le d, \quad \sum_j y_j = 1, \quad \forall j: y_j \ge 0. \tag{3}$$

In this lecture, we prove the Minimax Theorem using the LP Duality Theorem. We next present Yao's principle, a method to derive lower bounds for randomized algorithms. Finally, we define games in extensive form, and analyze the effectiveness of the α-β pruning algorithm and its randomized version.

2 The Minimax Theorem via LP

In this section, we wish to prove the following theorem:

Theorem 1 (Minimax) For any zero-sum game it holds that:

$$\max_x \min_y \sum_{i,j} x_i a_{ij} y_j = \min_y \max_x \sum_{i,j} x_i a_{ij} y_j. \tag{4}$$

Remark The inequality $\max_x \min_y \sum_{i,j} x_i a_{ij} y_j \le \min_y \max_x \sum_{i,j} x_i a_{ij} y_j$ can be obtained as a special case of Equation (1).

Remark The two expressions in Equation (4) are exactly the solutions of the LPs. Indeed, recall that a best-response mixed strategy has the same utility as the pure strategies in its support.

Remark Given a zero-sum game, $\max_x \min_y \sum_{i,j} x_i a_{ij} y_j$ ($= \min_y \max_x \sum_{i,j} x_i a_{ij} y_j$) is known as the value of the game.

In order to prove the theorem, we wish to transform the LPs (2) and (3) into a more canonical form. We present a new LP:

$$\min \sum_i x_i \quad \text{s.t.} \quad \forall j: \sum_i x_i a_{ij} \ge 1, \quad \forall i: x_i \ge 0. \tag{5}$$

Lemma 2 The optimum of (5) is the inverse of the optimum of (2).

Proof: Let $x = (x_1, \ldots, x_n)$ be a feasible solution for (2) with value $c$. The vector $x/c = (x_1/c, \ldots, x_n/c)$ is a feasible solution for (5), with value $1/c$. Since (5) is a minimization problem, its optimum is at most the inverse of the optimum of (2). In the other direction, if $x$ is a feasible solution for (5) with value $1/c$, then the vector $cx$ is a feasible solution for (2), with value $c$. Hence the optimum of (2) is at least the inverse of the optimum of (5); equivalently, the optimum of (5) is at least the inverse of the optimum of (2).

Remark We can assume the optima are positive; otherwise, we may transform the game into a positive, equivalent game by adding a constant to all entries in the game matrix.

Symmetrically, the optimum of problem (3) is equal to the inverse of the optimum of the problem:

$$\max \sum_j y_j \quad \text{s.t.} \quad \forall i: \sum_j a_{ij} y_j \le 1, \quad \forall j: y_j \ge 0. \tag{6}$$

The Minimax Theorem now follows directly from the linear programming Duality Theorem [1, Theorem 29.10], as by the latter the optima of (5) and (6) are equal.

Remark Randomizing the players' strategies has enabled us to abandon the setting where players stand to benefit by playing second. In other words, by tossing coins only after both players have decided on a strategy, we have created a situation in which rational players secure the same utility whether they play first or second.

3 Randomized Algorithms as Games

Yao [3] sought a method to derive lower bounds for randomized algorithms. In this section we present his application of game theory to this end.

For inputs of fixed size, we model algorithms' performance as a zero-sum game in strategic form. Let $\mathcal{A}$ denote the set of all possible algorithms, and $\mathcal{I}$ be the set of all possible inputs. The rows are associated with elements of $\mathcal{I}$, while the columns correspond to elements of $\mathcal{A}$. Entry $a_{ij}$ is the running time of algorithm $A_j$ on input $I_i$.

Computer scientists traditionally adopt a worst-case approach to the analysis of algorithms. The worst-case performance of an algorithm $A_j$ is the outcome of the game where the column player chooses $A_j$, and the row player responds with $\arg\max_i (a_{ij})$. In this respect, the row player is considered to be a sinister adversary.

In other fields (including economics), researchers assume there is some distribution on the input, and analyze accordingly. However, in many cases this distribution is unjustified. For example, the Quicksort algorithm achieves a running time of $O(n \log n)$ for random inputs, but many real-world inputs are nearly sorted, slowing the algorithm's performance (with a naive pivot choice) to $O(n^2)$. This approach is also consistent with our model: the row player plays some strategy $x$, and the column player responds with the most suitable algorithm (in terms of expected running time).

For our purposes, a randomized algorithm is a distribution over deterministic algorithms. Denote the running time of the algorithm $A_j$ on input $I_i$ by $R(A_j, I_i)$. For a distribution $x$ over inputs and a distribution $y$ over algorithms, $E[R(A_j, I_x)] = \sum_i x_i a_{ij}$ and $E[R(A_y, I_i)] = \sum_j a_{ij} y_j$. From the trivial part of the Minimax Theorem, we have:

Theorem 3 (Yao's Principle) For all distributions $x$ over $\mathcal{I}$ and $y$ over $\mathcal{A}$,

$$\min_{A \in \mathcal{A}} E[R(A, I_x)] \le \max_{I \in \mathcal{I}} E[R(A_y, I)]. \tag{7}$$

Observe that the right expression in Equation (7) is the worst-case running time of a randomized algorithm distributed according to $y$. Thus, in order to derive a lower bound on the running time of any randomized algorithm, it is enough to find some nasty distribution $x$, with respect to which any deterministic algorithm performs poorly. Moreover, by the other direction of the Minimax Theorem, if one produces the very worst distribution, in which the best algorithm runs longest, then this running time is equal to the worst-case expected running time of the best randomized algorithm. In this sense, Yao's method is complete: it is possible to derive from the worst distribution a lower bound which is tight.

Remark Yao's approach provides a compromise between the clashing views of computer scientists and the rest of the world regarding running time: the worst-case performance of the best randomized algorithm versus deterministic inputs is equal to the expected running time of the best deterministic algorithm on the worst possible randomized input.
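To make Equation (7) concrete, here is a minimal sketch that evaluates both sides for a tiny running-time matrix. The matrix `cost` and the function names `best_deterministic` and `worst_input` are hypothetical, chosen only for illustration; rows are inputs and columns are deterministic algorithms, as in the model above.

```python
from fractions import Fraction

def best_deterministic(cost, x):
    """Left side of (7): expected cost of the cheapest deterministic
    algorithm (column) when inputs (rows) are drawn from x."""
    return min(
        sum(x[i] * cost[i][j] for i in range(len(cost)))
        for j in range(len(cost[0]))
    )

def worst_input(cost, y):
    """Right side of (7): expected cost on the costliest input (row)
    when the algorithm (column) is drawn from y."""
    return max(
        sum(cost[i][j] * y[j] for j in range(len(y)))
        for i in range(len(cost))
    )

# Hypothetical running times: each algorithm is fast on one input only.
cost = [[1, 2],
        [2, 1]]
half = Fraction(1, 2)
uniform = [half, half]

lower = best_deterministic(cost, uniform)   # bound from a "nasty" input distribution
upper = worst_input(cost, uniform)          # worst case of the uniform randomized algorithm
naive_upper = worst_input(cost, [1, 0])     # worst case of always running algorithm 0
weak_lower = best_deterministic(cost, [1, 0])
```

In this toy game the uniform input distribution already gives a tight bound: `lower` and `upper` coincide, so no randomized algorithm can beat the uniform mixture in the worst case. A poorly chosen distribution such as `[1, 0]` still yields a valid, but weaker, lower bound.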

4 Evaluation of Game Trees

Figure 1: A zero-sum game in extensive form.

This section is interesting in its own right, but will ultimately serve as an example of the principles illustrated in the previous section.

Definition 4 A zero-sum game in extensive form consists of:
1. A set N of players.
2. A tree T, called the game tree.
3. A partition of the internal nodes among the players.
4. A utility function u assigning a real number to every leaf. This function represents the utility of player 1.

See Figure 1 for an illustration of a game in extensive form (ignore the Roman numerals for now). At each stage in the game, the player who owns the current node chooses to proceed to a child of this node; in the next stage, the player who owns the child makes a choice. The game ends when a leaf is reached; player 1's utility is the value of u on this leaf.

Remark A game in extensive form corresponds to a game in strategic form; a pure strategy in a game in extensive form specifies the player's action at every reachable node.
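The correspondence in the remark can be sketched in a few lines. The encoding below is an assumption made for illustration, not part of the lecture: a node is either a number (a leaf, holding player 1's utility) or a pair `(owner, children)`, and a pure strategy maps the path of each node a player owns to the index of the chosen child. The tree and its utilities are hypothetical.

```python
from itertools import product

# Hypothetical two-level game: player 1 moves at the root, player 2 below.
tree = (1, [(2, [3, 5]), (2, [2, 9])])

def play(node, strategy, path=()):
    """Follow the players' pure strategies from the root to a leaf and
    return player 1's utility at that leaf.

    strategy[owner] maps the path (tuple of child indices) of each node
    owned by that player to the index of the child it chooses.
    """
    while isinstance(node, tuple):
        owner, children = node
        choice = strategy[owner][path]
        node = children[choice]
        path = path + (choice,)
    return node

# Pure strategies: player 1 picks a child of the root; player 2 picks a
# child at each of its two nodes, reached via paths (0,) and (1,).
rows = [{(): c} for c in (0, 1)]
cols = [{(0,): c0, (1,): c1} for c0, c1 in product((0, 1), repeat=2)]

# Strategic-form matrix: entry [i][j] is player 1's utility.
matrix = [[play(tree, {1: r, 2: c}) for c in cols] for r in rows]
```

Even for this small tree the matrix has four columns, one per combination of player 2's choices, which hints at why one prefers to work on the tree directly rather than on the strategic form.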

Fix a game tree and a partition of the nodes among the players. Consider the following problem: we are given the function u (the utility of the different leaves), and are asked what the value of the game is. A simple solution is the backward induction algorithm: we propagate nodes' values from the leaves, by taking the maximum of the children's values in each node owned by player 1, and taking the minimum in each node owned by player 2. This algorithm is rather efficient, as the running time is linear in the size of the game tree. Nevertheless, in some settings the tree is gargantuan. Is it necessary to query all the leaves?

Example 1 Observe the game illustrated in Figure 1. Assume the algorithm first queries the children of the node marked by (i); the value of this node is 57. The algorithm then queries leaf (ii); there is no need to query leaf (iii), as the value of node (iv) is at least 68, and thus player 2 would surely go left at (v).

An algorithm which sometimes saves us some trouble is the α-β pruning algorithm. Here we give a somewhat informal description of the algorithm; a full specification can be found in [2, page 170]. From now on assume the game trees are binary.

Max-Eval(T, α, β)
1: if T is a leaf then
2:   return u(T)
3: end if
4: v ← max(α, Min-Eval(left(T), α, β))
5: if v ≥ β then
6:   return v
7: end if
8: v ← max(v, Min-Eval(right(T), v, β))
9: return v

α represents a lower bound for values that can affect the value of the game, and β is an upper bound. The function Min-Eval is defined symmetrically. The game's value is given by the execution of Max-Eval on the root, with α = −∞, β = ∞.

At this point we add the assumption that the utility function u is binary-valued (the values of the leaves are either 0 or 1). The motivation for this assumption is ease of exposition; the results can be generalized. Thus, the max operation is equivalent to the ∨ operation, and min can be replaced with ∧.

We first demonstrate that in the worst case, α-β pruning doesn't prune any leaves.

Lemma 5 Given a game tree with n leaves and a deterministic algorithm which evaluates the game, there exists an input such that the algorithm must query all the leaves.

Proof: The proof is by induction on the number of leaves n, using an adversary argument. The base case (n = 0) is trivial.

Figure 2: The tree T.

Assume the claim holds for any tree with n = k − 1 leaves. Given a tree with k leaves, the algorithm queries some leaf first. If the leaf's parent is an ∨ node, the adversary answers that the value is 0; and if the leaf's parent is an ∧ node, the adversary responds with 1. Based on these answers, it is not possible to decide the value of the leaf's parent without querying the leaf's sibling. Therefore, the algorithm must still evaluate a tree with k − 1 leaves. By the induction assumption, all the leaves of this tree must be queried.

It seems possible that randomizing the algorithm might improve the expected running time. The randomized version simply switches the order of T's children with probability 1/2 (this coin flip is added between lines 3 and 4 of Max-Eval). We begin an analysis of the effectiveness of this randomized version with a modest special case: the game tree T illustrated in Figure 2 (as the proof below indicates, an ∨ root whose two children are ∧ nodes, each with two leaves).

Lemma 6 Given any values for the leaves, the expected number of queries randomized α-β pruning makes on T is at most 3.

Proof: We examine two cases.

Case 1: the value of the tree is 1. It follows that the value of at least one of the ∧ nodes is 1; without loss of generality the value of the left ∧ node is 1. With probability at least 1/2 the algorithm first computes the value of the left ∧ node. If this happens, the value of the second ∧ node would not be calculated. Therefore, the expected running time is at most $\frac{1}{2} \cdot 2 + \frac{1}{2} \cdot 4 = 3$.

Case 2: the value of the tree is 0. In this setting, the value of both ∧ nodes is 0. For each one, with probability at least 1/2 the algorithm would first query the value of a leaf with utility 0, and its sibling would not be queried. The expected running time is at most $2\left(\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 2\right) = 3$.

Next week we will continue this line of inquiry. We will show that if the nodes in any path from the root to the leaves alternate between the players, then randomized α-β pruning performs well; but the performance is not as good if player 1 makes all his moves first.
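Lemmas 5 and 6 can be checked by brute force on the four-leaf tree. The sketch below rests on a few assumptions that are not in the notes: the tree is encoded as nested pairs, Min-Eval is filled in as the mirror image of Max-Eval, the coin flip at each internal node is passed in as a dictionary `swap` so that all eight outcomes can be enumerated, and the initial window is α = 0, β = 1 rather than ±∞, since leaf values are known to lie in {0, 1}. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def evaluate(tree, alpha, beta, is_max, swap, path, queried):
    """Alpha-beta evaluation of a binary tree given as nested pairs.

    swap[path] is True when the children of the internal node at `path`
    are visited right-to-left. Paths of queried leaves go into `queried`.
    """
    if not isinstance(tree, tuple):          # leaf: one query
        queried.append(path)
        return tree
    order = (1, 0) if swap.get(path, False) else (0, 1)
    v = alpha if is_max else beta
    for k in order:
        child = evaluate(tree[k], alpha, beta, not is_max,
                         swap, path + (k,), queried)
        if is_max:
            v = alpha = max(v, child)        # raise the lower bound
            if v >= beta:
                break                        # prune the remaining child
        else:
            v = beta = min(v, child)         # lower the upper bound
            if v <= alpha:
                break
    return v

INTERNAL = [(), (0,), (1,)]                  # root and its two children

def count_queries(leaves, swap):
    a, b, c, d = leaves
    queried = []
    evaluate(((a, b), (c, d)), 0, 1, True, swap, (), queried)
    return len(queried)

def expected_queries(leaves):
    """Average number of queries over all 8 equally likely coin outcomes."""
    total = sum(count_queries(leaves, dict(zip(INTERNAL, coins)))
                for coins in product([False, True], repeat=3))
    return Fraction(total, 8)

inputs = list(product([0, 1], repeat=4))
worst_randomized = max(expected_queries(l) for l in inputs)
worst_deterministic = max(count_queries(l, {}) for l in inputs)
```

Under these assumptions the fixed left-to-right order is forced to query all four leaves on some input, in line with Lemma 5, while the randomized order never exceeds the bound of Lemma 6 in expectation.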

References

[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2nd edition, 2001.

[2] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2nd edition, 2003.

[3] A. C. Yao. Probabilistic computations: Towards a unified measure of complexity. In Proceedings of the 17th Annual Symposium on Foundations of Computer Science (FOCS 77), pages 222-227, 1977.