Oracle inequalities for computationally budgeted model selection


Alekh Agarwal, John C. Duchi
University of California, Berkeley

Peter L. Bartlett
QUT and UC Berkeley

Clement Levrard
École Normale Supérieure

Abstract

We analyze general model selection procedures using penalized empirical loss minimization under computational constraints. While classical model selection approaches do not consider computational aspects of performing model selection, we argue that any practical model selection procedure must not only trade off estimation and approximation error, but also the effects of the computational effort required to compute empirical minimizers for different function classes. We provide a framework for analyzing such problems, and we give algorithms for model selection under a computational budget. These algorithms satisfy oracle inequalities that show that the risk of the selected model is not much worse than if we had devoted all of our computational budget to the best function class.

1 Introduction

In the standard statistical prediction setting, one receives samples $\{z_1, \ldots, z_n\} \subseteq \mathcal{Z}$ drawn i.i.d. from some unknown distribution $P$ over a sample space $\mathcal{Z}$, and given a loss function $\ell$, seeks a function $f$ to minimize the risk

$$R(f) := \mathbb{E}[\ell(z, f)]. \tag{1}$$

Since $R(f)$ is unknown, the typical approach is to approximately minimize the empirical risk, $\hat{R}_n(f) := \frac{1}{n} \sum_{i=1}^n \ell(z_i, f)$, over a function class $\mathcal{F}$. We seek a function $f_n$ with a risk close to the Bayes risk, the minimal risk over all measurable functions, which is $R_0 := \inf_f R(f)$. There is a natural tradeoff based on the class $\mathcal{F}$ one chooses, since

$$R(f_n) - R_0 = \Big( R(f_n) - \inf_{f \in \mathcal{F}} R(f) \Big) + \Big( \inf_{f \in \mathcal{F}} R(f) - R_0 \Big),$$

which decomposes the excess risk of $f_n$ into estimation error (left) and approximation error (right).

A common approach to addressing this tradeoff is to express $\mathcal{F}$ as a union of classes $\mathcal{F}_1, \ldots, \mathcal{F}_K$.[1] The model selection problem is to choose a class $\mathcal{F}_i$ and a function $f \in \mathcal{F}_i$ to give the best tradeoff between estimation error and approximation error.
A common approach to the model selection problem is the now classical idea of complexity regularization, which arose out of early works by Mallows (1973) and Akaike. The complexity regularization approach balances two competing objectives: the minimum empirical risk of a model class $\mathcal{F}_i$ (approximation error) and a complexity penalty (to control estimation error) for the class. Different choices of the complexity penalty give rise to different model selection criteria and algorithms (see e.g. Massart, 2003, and the references therein). Results of several authors (e.g. Bartlett et al., 2002; Lugosi and Wegkamp, 2004; Massart, 2003) show that given a dataset of size $n$, the output $f_n$ of the procedure roughly satisfies

$$\mathbb{E} R(f_n) - R_0 \le \min_i \Big[ \inf_{f \in \mathcal{F}_i} R(f) - R_0 + \gamma_i(n) \Big] + O\Big( \frac{1}{\sqrt{n}} \Big), \tag{2}$$

where $\gamma_i(n)$ is a complexity penalty for class $i$, which is usually decreasing to zero in $n$ and increasing in $i$. Several approaches to complexity regularization are possible, and an incomplete bibliography includes Vapnik and Chervonenkis (1974), Geman and Hwang (1982), Rissanen (1983), Barron (1991), Bartlett et al. (2002), and Lugosi and Wegkamp (2004). These oracle inequalities show that, for a given sample size, the model selection procedure gives the best trade-off between the approximation and estimation errors.

[1] In general, the number of classes $K$ can be infinite, though we restrict attention to finitely many classes for this paper.

A drawback with the above mentioned approaches is that we need to be able to optimize over each model in the hierarchy on the entire data, in order to prove guarantees on the result of the model selection procedure. This is natural when the sample size is the key limitation, and it is computationally feasible when the sample size is small and the samples are low-dimensional. However, the cost of training $K$ different model classes on the entire data sequence can be prohibitive when the datasets become large and high-dimensional, as is common in modern settings. In these cases, it is computational resources rather than the sample size that are the key constraint. In this paper, we consider model selection from this computational perspective, viewing the amount of computation, rather than the sample size, as the parameter which will enter our oracle inequalities. Specifically, we consider model selection methods that work within a given computational budget.

An interesting and difficult aspect of the problem that we must address is the interaction between model class complexity and computation time. It is natural to assume that for a fixed sample size, it is more expensive to estimate a model from a complex class than a simple class. Put inversely, given a computational bound, a simple model class can fit a model to a much larger sample size than a rich model class. So any strategy for model selection under a computational budget constraint should trade off two criteria: the relative training cost of different model classes, which allows simpler classes to receive far more data (thus making them resilient to overfitting), and lower approximation error in the more complex model classes.

In addressing these computational and statistical issues, this paper makes three main contributions.
First, we propose a novel computational perspective on the model selection problem, which we believe should be a natural consideration in statistical learning problems. Secondly, within this framework, we provide an algorithm, built on techniques for multi-armed bandit problems, that uses confidence bounds based on concentration inequalities to select a good model under a given computational budget. We also prove a minimax optimal oracle inequality on the performance of the selected model. Our third main contribution is another algorithm based on a coarse-grid search, for model hierarchies that are structured by inclusion, that is, $\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots \subseteq \mathcal{F}_K$. Under natural assumptions regarding the growth of the complexity penalties as we go to more complex classes, the coarse-grid search procedure satisfies better oracle inequalities than the earlier bandit algorithm. Both of our algorithms are computationally simple and efficient.

The remainder of this paper is organized as follows. In the next section, we formalize our setting and present the algorithms. Section 3 presents our main results as well as some consequences for specific problems and examples. We provide proofs in Sections 4 and 5; Section 4 contains the proof of the result for unstructured model selection problems, while Section 5 contains the proofs of oracle inequalities for model selection problems with nested classes $\mathcal{F}_i$.

2 Setup and algorithms

In this section, we describe our statistical and computational assumptions about the problem, giving examples of classes of problems and statistical procedures that satisfy the assumptions. We follow this with descriptions of our algorithms, including intuitive explanations of the procedures.

2.1 Setup and Goals

Recall from the introduction that we have a collection of $K$ model classes $\mathcal{F}_1, \ldots, \mathcal{F}_K$. Let us begin by describing our computational assumptions. First, we assume as our basic unit of measure a computational quantum; within this quantum, a model can be trained on any single class $\mathcal{F}_i$ using $n_i$ samples.
That is, we associate with each class $\mathcal{F}_i$ a number of samples $n_i \in \mathbb{N}$, where $n_i$ is chosen so that training a model from class $\mathcal{F}_i$ on $n_i$ examples requires the same amount of time as training a model from class $\mathcal{F}_j$ on $n_j$ samples. We assume an overall time budget of $T$ quanta, so that if we devote the entire computational budget to class $i$, we could use $T n_i$ samples to train a model.[2] Our high level goal is to derive algorithms that perform nearly as well as if an oracle gave the best model class $i^*$ in advance, and we could devote the entire computational budget $T$ to class $i^*$. For the statistical assumptions in our problem, we take an approach similar to that of Bartlett et al. (2002), restricting our attention to complexity penalties based on concentration inequalities.

[2] The linearity assumption is essentially no loss of generality. In addition, several algorithms satisfy it. We can work with general non-linear scalings too, at the cost of significant notational burden which we choose to avoid here.
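To make the budget accounting concrete, the following minimal Python sketch computes how many samples each class could process under two allocations: the oracle allocation that gives all $T$ quanta to one class (yielding $T n_i$ samples), and, for contrast, an even split of the budget across classes. The function names and the numbers are hypothetical and serve only as an illustration of the linear-cost assumption.

```python
from typing import List


def full_budget_samples(n_per_quantum: List[int], budget: int) -> List[int]:
    """Samples class i could process if it received all `budget` quanta: T * n_i."""
    return [budget * n_i for n_i in n_per_quantum]


def uniform_split_samples(n_per_quantum: List[int], budget: int) -> List[int]:
    """Samples per class when the budget is split evenly: floor(T / K) * n_i."""
    quanta_each = budget // len(n_per_quantum)
    return [quanta_each * n_i for n_i in n_per_quantum]


# Hypothetical example: three classes of increasing training cost, so that one
# quantum processes fewer samples for the more complex classes (n_1 > n_2 > n_3).
n_per_quantum = [400, 100, 25]
budget_T = 30

oracle_samples = full_budget_samples(n_per_quantum, budget_T)
naive_samples = uniform_split_samples(n_per_quantum, budget_T)
```

Under these assumed numbers the even split leaves every class with one third of the samples the oracle allocation would give it, which is the gap a budget-aware selection procedure tries to close.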

Each of our model selection procedures uses a black-box algorithm $\mathcal{A}$ for fitting functions from the model class $\mathcal{F}_i$ to the data. We require that these algorithms be statistically well-behaved, in the sense that the empirical risk of $\mathcal{A}$'s output model $f$ is near the true risk of $f$. Recalling the definitions of $R$ and $\hat{R}$ from the introduction, and defining $[K] = \{1, \ldots, K\}$, we state our main concentration assumption:

Assumption A. Let $A(i, n) \in \mathcal{F}_i$ denote the output of algorithm $\mathcal{A}$ on a sample of $n$ data points.

(a) For each $i \in [K]$, there is a function $\gamma_i$ and constants $\kappa_1, \kappa_2 > 0$ such that for any $n \in \mathbb{N}$,

$$P\Big( \big| \hat{R}_n(A(i, n)) - R(A(i, n)) \big| > \gamma_i(n) + \kappa_2 \epsilon \Big) \le \kappa_1 \exp(-4 n \epsilon^2). \tag{3}$$

(b) The output $A(i, n)$ is a $\gamma_i(n)$-minimizer of $\hat{R}_n$, that is,

$$\hat{R}_n(A(i, n)) - \inf_{f \in \mathcal{F}_i} \hat{R}_n(f) \le \gamma_i(n).$$

(c) The function $\gamma_i$ satisfies $\gamma_i(n) \le c_i n^{-\alpha_i}$ for some $\alpha_i > 0$.

(d) For any fixed function $f \in \mathcal{F}_i$, $P\big( | \hat{R}_n(f) - R(f) | > \kappa_2 \epsilon \big) \le \kappa_1 \exp(-4 n \epsilon^2)$.

There are many classes of functions and corresponding algorithms that satisfy Assumption A. For one simple example, let $\{\mathcal{F}_i\}$ be VC-classes of functions, where each $\mathcal{F}_i$ has VC-dimension $d_i$, and let $\ell$ be the hinge loss, where $\ell(z, f) = [1 - y f(x)]_+$. Assuming that $\ell(z, f) \le B$ for all $f \in \mathcal{F}_i$, Dudley's entropy integral in this case gives (Dudley, 1978)

$$\gamma_i(n) = O\Big( \sqrt{\frac{d_i}{n}} \Big) \quad \text{and} \quad \kappa_1 \le 2, \ \kappa_2 = O(B). \tag{4}$$

Similar results hold for other convex losses and problems, for example regression and density estimation problems with squared or log losses. For function classes of bounded complexity, such as VC, Sobolev, or Besov classes, penalty functions $\gamma_i(n)$ can be computed that satisfy Assumption A using many techniques; some relevant approaches include Rademacher and Gaussian complexities of the function classes $\mathcal{F}_i$, metric entropy, Dudley's entropy integral, or localization techniques (e.g. Pollard, 1984; Bartlett and Mendelson, 2002; Dudley). In many concrete cases, such as parametric models or VC classes, Assumption A(c) is satisfied with $\alpha_i = \frac{1}{2}$.

Our approach, similar to the idea of complexity regularization, is to perform a kind of penalized model selection.
If we knew the true risk functional $R$, we could minimize a combination of the risk and complexity penalty based on the number of samples our computational budget allows for the class. In particular, given penalty functions $\gamma_i$, we define the best class in hindsight as

$$i^* := \arg\min_{i \in [K]} \Big\{ \inf_{f \in \mathcal{F}_i} R(f) + \gamma_i(T n_i) \Big\}. \tag{5}$$

The idea is that an algorithm performing model selection while taking into account its computational limitations should choose the best class considering the total number of samples it could possibly have seen for the class. We note that this is also closely related to the criterion (2) minimized in the absence of a computational budget, but in the classical case it is assumed that each function class can be evaluated on an identical and fixed number of samples $n$.

2.2 Upper-confidence bound algorithm without structure

We now turn to outlining the first of the two main scenarios analyzed in this paper. For now, we do not assume any structure relating the collection of model classes $\mathcal{F}_1, \ldots, \mathcal{F}_K$. The main idea of our algorithm in this case is to incrementally allocate our computational quota amongst the function classes, where we trade off receiving samples for classes that have good risk performance against exploring classes for which we have received few data points. We view the budgeted model selection problem as a repeated game with $T$ rounds. At iteration $t$, the procedure allocates one additional quantum of computation to a (to be specified) function class. We assume that the computational complexity of fitting a model grows linearly and incrementally with the number of samples, which means that allocating an additional quantum of training time allows the black-box training algorithm $\mathcal{A}$ to process an additional $n_i$ samples for class $\mathcal{F}_i$. The linear growth assumption is satisfied, for instance, when the loss function $\ell$ is convex and the black-box learning algorithm $\mathcal{A}$ is a stochastic or online convex optimization procedure (e.g. Cesa-Bianchi and Lugosi, 2006; Nemirovski et al.).
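The repeated game just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the selection rule is an optimistic index of the form used for Algorithm 1 in this paper (empirical risk, minus the current penalty and exploration bonuses, plus the full-budget penalty), the multiplier `kappa2` on both square-root bonuses is an assumption taken from the comparison used in the proof of Theorem 1, and `train`, `gamma`, `toy_train`, and `vc_gamma` are hypothetical stand-ins for the black-box learner and the penalty functions. Classes are indexed from 0 in the code.

```python
import math
from typing import Callable, List, Tuple


def optimistic_index(emp_risk: float, gamma_now: float, gamma_full: float,
                     n: int, K: int, t: int, kappa2: float) -> float:
    """Empirical risk, lowered by the current penalty and exploration bonuses,
    raised by the penalty the class would have under the full budget."""
    bonus = kappa2 * (math.sqrt(math.log(K) / n) + math.sqrt(math.log(t) / n))
    return emp_risk - gamma_now - bonus + gamma_full


def budgeted_bandit_selection(
    train: Callable[[int, int], float],   # (class index, samples) -> empirical risk
    gamma: Callable[[int, int], float],   # (class index, samples) -> penalty
    n_per_quantum: List[int],
    budget: int,
    kappa2: float = 1.0,
) -> Tuple[int, List[int]]:
    K = len(n_per_quantum)
    pulls = [1] * K  # the first K quanta give every class one quantum
    emp = [train(i, n_per_quantum[i]) for i in range(K)]
    for t in range(K + 1, budget + 1):
        scores = []
        for i in range(K):
            n_seen = pulls[i] * n_per_quantum[i]
            scores.append(optimistic_index(
                emp[i], gamma(i, n_seen), gamma(i, budget * n_per_quantum[i]),
                n_seen, K, t, kappa2))
        i_t = min(range(K), key=scores.__getitem__)  # most optimistic class
        pulls[i_t] += 1
        emp[i_t] = train(i_t, pulls[i_t] * n_per_quantum[i_t])
    best = max(range(K), key=pulls.__getitem__)  # most frequently selected class
    return best, pulls


# Toy, noise-free usage: the "learner" simply reports a fixed risk per class.
dims = [1, 2, 4]
true_risk = [0.5, 0.1, 0.5]
n_per_quantum = [40, 20, 10]


def toy_train(i: int, n: int) -> float:
    return true_risk[i]


def vc_gamma(i: int, n: int) -> float:
    return math.sqrt(dims[i] / n)


best, pulls = budgeted_bandit_selection(toy_train, vc_gamma, n_per_quantum, budget=500)
```

In this toy run the class with the smallest penalized risk (index 1) receives nearly all of the 500 quanta, while the other two classes are explored only a handful of times before their optimistic indices rise above that of the best class.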

Algorithm 1. Multi-armed bandit algorithm for selection of best class $\hat{i}$.
    For each $i \in [K]$, query $n_i$ examples from class $\mathcal{F}_i$
    For $t = K + 1$ to $T$
        Set $n_j(t)$ to be the number of examples seen for class $j$ at time $t$
        Let $i_t = \arg\min_{j \in [K]} \Big\{ \bar{R}(j, n_j(t)) - \kappa_2 \sqrt{\frac{\log t}{n_j(t)}} \Big\}$
        Query $n_{i_t}$ examples for class $i_t$
    Output $\hat{i}$, the index of the most frequently selected class.

Using our previously defined notation, we now define the criterion we use in our procedure to select the class to which we allocate a quantum. The optimistic selection criterion for class $i$, assuming that $\mathcal{F}_i$ has seen $n$ samples at this point in the game, is

$$\bar{R}(i, n) = \hat{R}_n(A(i, n)) - \gamma_i(n) - \kappa_2 \sqrt{\frac{\log K}{n}} + \gamma_i(T n_i). \tag{6}$$

The intuition behind the definition of $\bar{R}(i, n)$ is that we would like the algorithm to choose functions $f$ and classes $i$ that minimize $\hat{R}_n(f) + \gamma_i(T n_i) \approx R(f) + \gamma_i(T n_i)$, but the negative $\gamma_i(n)$ and $\sqrt{\log K / n}$ terms lower the criterion significantly when $n$ is small and thus encourage initial exploration. The criterion (6) essentially combines the penalized model-selection objective used by Bartlett et al. (2002) (though we use a $\log K$ term, as we assume a finite number of classes) with an optimistic criterion similar to those used in multi-armed bandit algorithms (Auer et al.). Algorithm 1 contains our bandit procedure for model selection. We run Alg. 1 for $T$ rounds, where $T$ is such that the entire computational budget is exhausted. Our results in Section 3.1 show that Alg. 1 satisfies our twofold goals of selecting the class $i^*$ with high probability and outputting a function $f$ with good risk performance.

2.3 Coarse-grid search algorithm for inclusion hierarchy

In practice, the classes $\mathcal{F}_i$ are rarely completely unrelated; perhaps the most common scenario in model selection is structural risk minimization, where the model classes $\mathcal{F}_i$ are subsets, ordered in increasing complexity, of a larger model space. To that end, our second main scenario involves studying computationally constrained model selection procedures under the following assumption.

Assumption B.
The function classes $\mathcal{F}_i$ satisfy an inclusion hierarchy:

$$\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots \subseteq \mathcal{F}_K. \tag{7}$$

One simple example satisfying the above assumption is classes of functions of the form $x \mapsto f(x) = \langle \theta, x \rangle$, where each function class $\mathcal{F}_i$ is identified with an increasing bound on $\|\theta\|$. A second simple family of examples consists of scenarios in which $f \in \mathcal{F}_i$ is of the form $x \mapsto f(x) = \langle \theta, \phi_i(x) \rangle$, where $\phi_i$ is a feature mapping of the input data $\mathcal{Z}$ and $\phi_i$ is a projection of $\phi_{i+1}$. For example, functions in class $i + 1$ observe more features than those in class $i$, or the different classes $\mathcal{F}_i$ may consist of an increasing sequence of wavelet bases.

Intuitively, we expect the structure assumed above to help our model selection procedure because the minimum expected risks of different function classes are no longer independent of each other. Writing $R_i^* := \inf_{f \in \mathcal{F}_i} R(f)$, it is easy to see that under our assumption,

$$R_i^* \ge R_j^* \quad \text{for } i \le j. \tag{8}$$

Clearly, under Assumption B, the penalties can always be chosen to be increasing as a function of the class complexity:

$$\gamma_i(n) \le \gamma_j(n) \quad \text{for } i \le j. \tag{9}$$

Since our approach involves giving a different number of samples to each class, we require a slightly stronger ordering than the above equation. We assume that for any budget $T$, we have

$$\gamma_i(T n_i) \le \gamma_j(T n_j) \quad \text{for } i \le j. \tag{10}$$

This assumption is reasonable since we expect that $\gamma_i(n)$ is a decreasing function of $n$ and $n_i \ge n_j$ for $i \le j$, so that $\gamma_i(T n_i) \le \gamma_i(T n_j) \le \gamma_j(T n_j)$.

We now show a simple grid-search based algorithm that gives oracle inequalities depending only logarithmically on the number of classes for this inclusion hierarchy, under natural conditions on the growth of the complexity penalties as a function of the class index. The method takes inspiration from the naïve strategy that splits the budget $T$ uniformly across the $K$ classes and finds the class with the smallest penalized empirical risk, using $T n_i / K$ samples for class $i$. Of course, the naïve approach has the drawback that the computational budget available to each class is reduced by a factor of $K$, which yields very poor scaling with the number $K$ of classes. The key observation we exploit is that under the nesting structure (7), we do not need to find the smallest regularized empirical risk for each class. We can instead pick a small subset $S$ of classes and perform model selection only over the classes in $S$, then use the inclusion assumption B to reason about the classes not in $S$ for appropriate choices of $S$. With this intuition, we now define a good choice for $S$:

Definition 1 (Coarse grid). For a set $S \subseteq [K]$, we say that $S$ satisfies the coarse grid conditions with parameters $s \in \mathbb{N}$ and $\lambda > 0$ if $|S| = s$ and for each $i \in [K]$ there is an index $j \in S$ such that

$$\gamma_i\Big( \frac{T n_i}{s} \Big) \le \gamma_j\Big( \frac{T n_j}{s} \Big) \le (1 + \lambda) \, \gamma_i\Big( \frac{T n_i}{s} \Big). \tag{11}$$

We write $s$ for the size of the smallest set $S$ satisfying condition (11), noting that $s \le K$. In general, for a given $\lambda$ there may be no small set $S$ satisfying Definition 1; however, we are interested in settings where a set $S$ of size $s = O(\log K)$ exists.

Example 1. Let $\{\mathcal{F}_i\}$ be an increasing collection of VC-classes, say $f \in \mathcal{F}_i$ is of the form $x \mapsto f(x) = \langle \theta, \phi_i(x) \rangle$, where $\phi_i$ is a $d_i$-dimensional mapping and $\mathcal{F}_i$ has VC-dimension $d_i$. In this case, recalling the VC-bound (4), we know that up to constant factors $\gamma_i(n) \asymp \sqrt{d_i / n}$. Making the reasonable assumption that training time is linearly dependent on the VC dimension, we have $n_i = n_K d_K / d_i$ for $i \in [K]$, so

$$\gamma_i(T n_i) = \gamma_i\Big( \frac{T n_K d_K}{d_i} \Big) = \sqrt{\frac{d_i^2}{T n_K d_K}} = d_i \sqrt{\frac{1}{T n_K d_K}}.$$

Example 1 is suggestive of a pattern common to many hierarchies of function classes (including parametric and VC-classes with indexing VC-dimension), where the penalty functions interact with the sample sizes $n_i$ so that $\gamma_i$ splits naturally into a product $\gamma_i(T n_i) = g(T) h(i)$ for some functions $g$ and $h$ which may depend on $K$.
For such cases, the condition (11) reduces to ensuring

$$g\Big( \frac{T}{s} \Big) h(i) \le g\Big( \frac{T}{s} \Big) h(j) \le (1 + \lambda) \, g\Big( \frac{T}{s} \Big) h(i),$$

which amounts to showing $h(i) \le h(j) \le (1 + \lambda) h(i)$, independent of the setting of $s$ (since $h$ is non-decreasing by the ordering (10), we need only show the latter inequality for some $j \ge i$). Let $S = \{j_1, \ldots, j_s\}$. We construct $S$ by setting $j_s = K$ and recursively defining $j_i$ to be the smallest index $j < j_{i+1}$ such that $h(j_{i+1}) \le (1 + \lambda) h(j)$. Then the number of classes $s$ can be bounded by using the relation $h(K) = h(j_s) \ge (1 + \lambda) h(j_{s-1}) \ge \cdots \ge (1 + \lambda)^{s-1} h(1)$, so that so long as $\log(h(K) / h(1)) / \log(1 + \lambda) \le s$, we can choose a set $S$ satisfying condition (11) with $|S| = s$. In particular, $s$ is logarithmic in $K$ as long as the function $h$ grows at most polynomially in $K$. Other natural examples of function classes satisfying such growth conditions include Besov or Sobolev function classes (nested by degree or smoothness) as well as wavelet bases. We refer the reader to the work of Barron et al. for a compendium of results where $\gamma_i(T n_i) = g(T) h(i)$.

Given the above, our algorithm has a simple description. We fix a desired accuracy $\lambda$ and find the smallest set $S$ satisfying Definition 1. We then pick the class $\hat{i}$ satisfying

$$\hat{i} \in \arg\min_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) + \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\}, \tag{12}$$

where $|S| = s$. We observe that the penalty functions are typically known in closed form (with the exception of data-dependent complexity penalties), and hence computation of the set $S$ can be efficient and is generally much cheaper than training the models. In Section 3.2, we give an oracle inequality on the performance of the estimate $\hat{i}$ from the procedure (12) that has only mild dependence on the number of classes so long as $s$ does not grow too fast with $K$.
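As an illustration of the grid construction in the product case $\gamma_i(T n_i) = g(T) h(i)$, the following Python sketch builds a coarse grid from the top class downward and checks the covering requirement $h(i) \le h(j) \le (1 + \lambda) h(i)$ for every class. It is a sketch under stated assumptions rather than the authors' procedure: `coarse_grid` and `covers_all_classes` are hypothetical names, class indices are 1-based as in the text, `h` is assumed positive and non-decreasing, and the fallback step (moving to the immediately preceding class when no lower class is within a factor $1 + \lambda$) is an added assumption to keep the recursion well defined.

```python
from typing import List


def coarse_grid(h: List[float], lam: float) -> List[int]:
    """Return 1-based grid indices, built downward from the largest class K."""
    K = len(h)
    grid = [K]
    current = K
    while current > 1:
        # Lower classes j whose value h(j) is within a factor (1 + lam) of h(current).
        within = [j for j in range(1, current)
                  if h[current - 1] <= (1 + lam) * h[j - 1]]
        # Jump to the smallest such class; otherwise step down by one (assumed fallback).
        current = within[0] if within else current - 1
        grid.append(current)
    return sorted(grid)


def covers_all_classes(h: List[float], grid: List[int], lam: float) -> bool:
    """Check that every class i has some grid class j with h(i) <= h(j) <= (1+lam) h(i)."""
    return all(
        any(h[i - 1] <= h[j - 1] <= (1 + lam) * h[i - 1] for j in grid)
        for i in range(1, len(h) + 1)
    )


# Hypothetical hierarchy in the spirit of Example 1: h(i) proportional to d_i = i.
h = [float(d) for d in range(1, 17)]
grid = coarse_grid(h, lam=1.0)
```

With linearly growing $h$ and $\lambda = 1$, the grid consists of the dyadic classes 1, 2, 4, 8, 16, so only five of the sixteen classes need to be trained, each on $T n_i / 5$ samples, before applying the selection rule (12).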

3 Main results and their consequences

In this section, we come to the description of the performance guarantees for Algorithm 1 and the procedure (12). To build intuition, we also specialize the theorems to specific statistical problems and model classes.

3.1 Oracle inequalities for unstructured model classes

In this section we give performance guarantees on the class picked by Algorithm 1. We define the excess penalized risk

$$\Delta_i := R_i^* + \gamma_i(T n_i) - R_{i^*}^* - \gamma_{i^*}(T n_{i^*}). \tag{13}$$

Essentially without loss of generality, we assume that the infimum in the equation $R_i^* = \inf_{f \in \mathcal{F}_i} R(f)$ is attained by a function $f_i^*$. If the infimum is not attained we simply choose some fixed $f_i^*$ such that $R(f_i^*) \le \inf_{f \in \mathcal{F}_i} R(f) + \delta$ for an arbitrarily small $\delta > 0$. We first perform analysis under the assumption that $\Delta_i > 0$ strictly for $i \ne i^*$, but we will then relax to allow non-unique $i^*$.

The gains of a computationally adaptive strategy over naïve strategies are best seen when the gap (13) is non-zero. Under this assumption, we can follow the ideas of Auer et al. and show that the fraction of the computational budget allocated to any suboptimal class goes quickly to zero as $T$ grows. We provide the proof of the following theorem in Section 4.

Theorem 1. Let Alg. 1 be run for $T$ rounds and $T_i(T)$ be the number of times class $i$ is queried. Let $\Delta_i$ be defined as in (13), the conditions of Assumption A hold, and assume that $T \ge K$. Define $\beta_i = \max\{1/\alpha_i, 2\}$. There is a constant $C$ such that

$$\mathbb{E}[T_i(T)] \le \frac{C}{n_i} \Big( \frac{c_i + \kappa_2 \sqrt{\log T}}{\Delta_i} \Big)^{\beta_i} \quad \text{and} \quad P\Big( T_i(T) > \frac{C}{n_i} \Big( \frac{c_i + \kappa_2 \sqrt{\log T}}{\Delta_i} \Big)^{\beta_i} \Big) \le \frac{\kappa_1}{T K^4},$$

where $c_i$ and $\alpha_i$ come from the definition of the concentration function $\gamma_i$ in Assumption A(c).

At a high level, this result shows that the fraction of budget allocated to any suboptimal class goes to 0 at the rate $\frac{1}{n_i T} \big( \frac{\sqrt{\log T}}{\Delta_i} \big)^{\beta_i}$. Hence, asymptotically in $T$, we will receive exponentially more samples for $i^*$ than any other class and will perform almost as well as if we had known $i^*$ in advance. To see an example of concrete rates that can be concluded from the above result, let $\mathcal{F}_1, \ldots, \mathcal{F}_K$ be model classes with finite VC-dimension,[3] so that Assumption A is satisfied with $\alpha_i = \frac{1}{2}$. Then we have

Corollary 1.
Under the conditions of Theorem 1, assume $\mathcal{F}_1, \ldots, \mathcal{F}_K$ are model classes of finite VC-dimension, where $\mathcal{F}_i$ has dimension $d_i$. Then there is a constant $C$ such that

$$\mathbb{E}[T_i(T)] \le C \, \frac{\max\{d_i, \kappa_2^2 \log T\}}{\Delta_i^2 n_i} \quad \text{and} \quad P\Big( T_i(T) > C \, \frac{\max\{d_i, \kappa_2^2 \log T\}}{\Delta_i^2 n_i} \Big) \le \frac{\kappa_1}{T K^4}.$$

[3] Similar corollaries hold for any model class whose metric entropy grows as $\mathrm{polylog}(1/\epsilon)$.

The result of Corollary 1 is nearly optimal in general due to a lower bound for the special case of multi-armed bandit problems (Lai and Robbins). To see the connection, let $\mathcal{F}_i$ correspond to the $i$th arm in a multi-armed bandit problem and the risk $R_i^*$ be the expected reward of arm $i$. In this case, the complexity penalty $\gamma_i$ for each class is 0. Lai and Robbins give a lower bound that shows that the expected number of pulls of any suboptimal arm is at least $\mathbb{E}[T_i(T)] = \Omega\big( \log T / \mathrm{KL}(p_i \| p_{i^*}) \big)$, where $p_i$ and $p_{i^*}$ are the reward distributions for the $i$th and optimal arms, respectively.

Unfortunately, the condition that $\Delta_i > 0$ may not always be satisfied, or $\Delta_i$ may be so small as to render the bound in Theorem 1 vacuous. Nevertheless, we intuitively believe that our algorithm can quickly find a small set of good classes (those with small penalized risk) and spend its computational budget to try to distinguish amongst them. In this case, though, Algorithm 1 will not visit suboptimal classes and so can still output a function $f$ satisfying good oracle bounds. In order to prove a result quantifying this intuition, we first upper bound the regret of Algorithm 1, that is, the average excess risk suffered by the algorithm over all iterations, and then show how to use this bound for obtaining a model with a small risk. We state our results for the case where $\alpha_i \equiv \alpha$ and define $\beta = \max\{1/\alpha, 2\}$.

Proposition 1. Use the same assumptions as Theorem 1, but further assume that $\alpha_i \equiv \alpha$ for all $i$. With probability at least $1 - \kappa_1 / (T K^3)$, the regret (average excess risk) of Algorithm 1 is bounded as

$$\sum_{i=1}^K \Delta_i T_i(T) \le 2 e \, T^{1 - 1/\beta} \Big( C \sum_{i=1}^K \frac{(c_i + \kappa_2 \sqrt{\log T})^\beta}{n_i} \Big)^{1/\beta}$$

for a constant $C$ dependent on $\alpha$.

In order to obtain a model with a small risk, we need to make an additional assumption that the models are compatible in the sense that one can define the addition operator $f + g$ for $f \in \mathcal{F}_i$, $g \in \mathcal{F}_j$ meaningfully. We also assume that the risk functional $R(f)$ is convex in $f$. In such a setting, we can average the functions minimizing the objective $\bar{R}(i, n)$, that is, $f_t = \arg\min_{f \in \mathcal{F}_{i_t}} \hat{R}_{n_{i_t}(t)}(f)$, to obtain a function satisfying the desired oracle inequality. For this theorem, we also assume that the constants $c_i$ from Assumption A(c) satisfy $c_i = O(\sqrt{\log T})$.

Theorem 2. Use the same assumptions as Proposition 1. Let $f_t$ be the function chosen by algorithm $\mathcal{A}$ at round $t$ of Alg. 1 and define the average function $\bar{f}_T = \frac{1}{T} \sum_{t=1}^T f_t$. If the risk functional $R$ is convex, there are constants $C, C'$ dependent on $\alpha$ such that with probability greater than $1 - 2 \kappa_2 / (T K^3)$,

$$R(\bar{f}_T) \le R_{i^*}^* + \gamma_{i^*}(T n_{i^*}) + \frac{2 e \kappa_2 \sqrt{\log T}}{T^{1/\beta}} \Big( \sum_{i=1}^K \frac{C}{n_i} \Big)^{1/\beta} + \frac{C'}{T^{1/\beta}} \Big( \sum_{i=1}^K \Big[ c_i n_i^{-\alpha} + \kappa_2 n_i^{-1/2} \sqrt{\log K} + \kappa_2 \sqrt{\frac{\log T}{n_i}} \Big]^{\beta} \Big)^{1/\beta}.$$

Let us interpret the above bound and discuss its optimality. When $\alpha = \frac{1}{2}$ (e.g., for VC classes), we have $\beta = 2$; moreover, it is clear that $\sum_{i=1}^K C / n_i = O(K)$. Thus, to within constant factors, we have

$$R(\bar{f}_T) = R_{i^*}^* + \gamma_{i^*}(T n_{i^*}) + O\Big( \sqrt{\frac{K \max\{\log T, \log K\}}{T}} \Big).$$

Ignoring logarithmic factors, the above bound is minimax optimal, which follows by a reduction of our model selection problem to the special case of a multi-armed bandit problem. In this case, Theorem 5.1 of Auer et al. shows that for any set of $K, T$ values, there is a distribution over the rewards of arms which forces $\Omega(\sqrt{KT})$ regret, that is, the average excess risk of the classes chosen by Alg. 1 must be $\Omega(\sqrt{K/T})$. We provide proofs of Proposition 1 and Theorem 2 in the long version of the paper.

3.2 Oracle inequalities for nested hierarchies

In this section we provide an oracle inequality on the output of the procedure (12) that has a more favorable dependence on the number of classes $K$ than our bounds for unstructured function classes $\mathcal{F}_i$.
The main idea is to use Assumption B along with Definition 1 to show that performing a coarse grid search over $S$ is sufficient to deduce an oracle inequality over the entire hierarchy. The next theorem provides an oracle inequality for the risk of the function $f = A(\hat{i}, n_{\hat{i}} T / s)$, which is the output of the learning algorithm $\mathcal{A}$ applied to the class $\hat{i}$ picked by our algorithm.

Theorem 3. Let $f = A(\hat{i}, n_{\hat{i}} T / s)$ be the output of the algorithm $\mathcal{A}$ for class $\hat{i}$ specified by the procedure (12). Let Assumptions A and B hold. With probability at least $1 - 3 \kappa_1 \exp(-4m)$,

$$R(f) \le \min_{i \in [K]} \Big\{ R_i^* + 2(1 + \lambda) \, \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\} + \kappa_2 \sqrt{\frac{s \log K}{2 T n_K}} + \kappa_2 \sqrt{\frac{s \, m}{T n_K}}.$$

Remark: It is possible to reduce the $\kappa_2 \sqrt{s \log K / (2 T n_K)}$ term in the bound above to a $(1 + \lambda) \sqrt{s \log i / (2 T n_i)}$ term appearing inside the minimum over classes $i \in [K]$ by requiring the coarse grid condition (11) to hold over terms of the form $\gamma_i(T n_i / s) + \sqrt{s \log i / (2 T n_i)}$. This stronger bound applies, for example, to sequences of VC-classes as described in Example 1.

The above result makes it clear that the excess risk of the algorithm outside of the minimum over all the classes scales as $O(T^{-1/2})$. It is of interest to contrast Theorem 3 with results of the previous section. In the completely general case, we have a dependence on $K$ better than $\sqrt{K}$ only when there is constant separation between the penalized risks of different classes. Since $s \le K$, the result of the above theorem is essentially as strong as any of the results from the previous section, as we would hope when we know $\mathcal{F}_i \subseteq \mathcal{F}_{i+1}$. Nonetheless, the main strength of Theorem 3 is in scenarios where $s = O(\log K)$, such as VC-classes (e.g. Example 1) with at most polynomial growth in VC-dimension. In such scenarios, the function $f$ that the procedure outputs is competitive (up to logarithmic factors) with an oracle that devotes the entire computation budget to the optimal class. We note that model selection procedures suffer a penalty of $\sqrt{\log K}$ or $\sqrt{\log i}$ even in computationally unconstrained settings (see, e.g., Bartlett et al., 2002), so our computationally restricted procedure suffers at most an additional penalty of $O(\sqrt{\log K})$. We conclude by recalling that many common model selection scenarios satisfy $s = O(\log K)$, as noted in Section 2.3.

4 Proof of Theorem 1

At a high level, the proof of this theorem involves combining the techniques for analysis of multi-armed bandits developed by Auer et al. with Assumption A. We start by giving a lemma which will be useful to prove the theorem. The lemma states that after a sufficient number of initial iterations $\tau$, the probability that Algorithm 1 chooses to receive samples for a sub-optimal function class is extremely small. Recall also our notational convention that $\beta_i = \max\{1/\alpha_i, 2\}$.

Lemma 1. For any class $i \ne i^*$, any $s_{i^*} \in [1, T]$ and $s_i \in [\tau_i, T]$, where $\tau_i > 0$ satisfies

$$\tau_i > \frac{2^{\beta_i} \big( c_i + \kappa_2 \sqrt{\log T} + \kappa_2 \sqrt{\log K} \big)^{\beta_i}}{n_i \Delta_i^{\beta_i}},$$

under Assumption A we have

$$P\Big( \bar{R}(i, n_i s_i) - \kappa_2 \sqrt{\frac{\log t}{n_i s_i}} \le \bar{R}(i^*, n_{i^*} s_{i^*}) - \kappa_2 \sqrt{\frac{\log t}{n_{i^*} s_{i^*}}} \Big) \le \frac{2 \kappa_1}{(T K)^4}.$$

We defer the proof of the lemma to Appendix A, though at a high level the proof works as follows. The bad event in Lemma 1, that is, Algorithm 1 selects a sub-optimal class $i \ne i^*$, occurs only if one of the following three errors occurs: the empirical risk of class $i$ is much lower than its true risk, the empirical risk of class $i^*$ is higher than its true risk, or $s_i$ is not large enough to actually separate the true penalized risks from one another. Under the assumptions of the lemma, however, coupled with the uniform convergence properties in Assumption A, each of these three sub-events is quite unlikely.

Now we turn to the proof of Theorem 1 assuming the lemma. Let $i_t$ denote the model class index $i$ chosen by Algorithm 1 at time $t$, and let $s_i(t)$ denote the number of times class $i$ has been selected at round $t$ of the algorithm. When no time index is needed, $s_i$ will denote the same thing.
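Note that if $i_t = i$ and the number of times class $i$ is queried exceeds $\tau_i > 0$, then by the definition of the selection criterion (6) and the choice of $i_t$ in Alg. 1, for some $s_i \in \{\tau_i, \ldots, t - 1\}$ and $s_{i^*} \in \{1, \ldots, t - 1\}$ we have

$$\bar{R}(i, n_i s_i) - \kappa_2 \sqrt{\frac{\log t}{n_i s_i}} \le \bar{R}(i^*, n_{i^*} s_{i^*}) - \kappa_2 \sqrt{\frac{\log t}{n_{i^*} s_{i^*}}}.$$

Here we interpret $\bar{R}(i, n_i s_i)$ to mean a random realization of the observed risk consistent with the samples we observe. Using the above implication, we thus have

$$T_i(T) = 1 + \sum_{t=K+1}^T \mathbb{I}\{i_t = i\} \le \tau_i + \sum_{t=K+1}^T \mathbb{I}\{i_t = i, \ T_i(t-1) \ge \tau_i\}$$

$$\le \tau_i + \sum_{t=K+1}^T \mathbb{I}\Big\{ \min_{\tau_i \le s_i < t} \Big[ \bar{R}(i, n_i s_i) - \kappa_2 \sqrt{\tfrac{\log t}{n_i s_i}} \Big] \le \max_{0 < s_{i^*} < t} \Big[ \bar{R}(i^*, n_{i^*} s_{i^*}) - \kappa_2 \sqrt{\tfrac{\log t}{n_{i^*} s_{i^*}}} \Big] \Big\}$$

$$\le \tau_i + \sum_{t=1}^T \sum_{s_{i^*}=1}^{t-1} \sum_{s_i=\tau_i}^{t-1} \mathbb{I}\Big\{ \bar{R}(i, n_i s_i) - \kappa_2 \sqrt{\tfrac{\log t}{n_i s_i}} \le \bar{R}(i^*, n_{i^*} s_{i^*}) - \kappa_2 \sqrt{\tfrac{\log t}{n_{i^*} s_{i^*}}} \Big\}. \tag{14}$$

To control the last term, we invoke Lemma 1 and obtain that for $\tau_i > 2^{\beta_i} \big( c_i + \kappa_2 \sqrt{\log T} + \kappa_2 \sqrt{\log K} \big)^{\beta_i} / (n_i \Delta_i^{\beta_i})$,

$$\mathbb{E}[T_i(T)] \le \tau_i + \sum_{t=1}^T \sum_{s_{i^*}=1}^{t-1} \sum_{s_i=\tau_i}^{t-1} \frac{2 \kappa_1}{(T K)^4} \le \tau_i + \frac{\kappa_1}{T K^4}.$$

Hence for any suboptimal class $i \ne i^*$, $\mathbb{E}[T_i(T)] \le \tau_i + \kappa_1 / (T K^4)$, where $\tau_i$ satisfies the lower bound of Lemma 1 and is thus logarithmic in $T$. Under the assumption that $T \ge K$, for $i \ne i^*$,

$$\mathbb{E}[T_i(T)] \le C \, \frac{(c_i + \kappa_2 \sqrt{\log T})^{\max\{1/\alpha_i, 2\}}}{n_i \Delta_i^{\max\{1/\alpha_i, 2\}}} \tag{15}$$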

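for a constant $C \le 2 \cdot 4^{\max\{1/\alpha_i, 2\}}$. Now we prove the high-probability bound. For this part, we need only concern ourselves with the sum of indicators from (14). Markov's inequality shows that

$$P\Big( \sum_{t=K+1}^T \mathbb{I}\{i_t = i, \ T_i(t-1) \ge \tau_i\} \ge 1 \Big) \le \frac{\kappa_1}{T K^4}.$$

Thus we can assert that the bound (15) on $T_i(T)$ holds with high probability.

Remark: By examining the proof of Theorem 1, it is straightforward to see that if we modify the multipliers on the square-root terms in the criterion (6) to be $m \kappa_2$ instead of $\kappa_2$, we get that the probability bound is of the order $T^{3 - 4m^2} K^{-4m^2}$, while the bound on $T_i(T)$ is scaled by $m^{1/\alpha}$.

5 Model selection over nested hierarchies

In this section, we prove Theorem 3. The following proposition states that the class returned by the output of the procedure (12) satisfies an oracle inequality over the set $S$.

Proposition 2. Let $f = A(\hat{i}, n_{\hat{i}} T / s)$ be the output of the algorithm $\mathcal{A}$ for class $\hat{i}$ specified in Equation (12). Under the conditions of Theorem 3, with probability at least $1 - 3 \kappa_1 \exp(-4m)$,

$$R(f) \le \min_{i \in S} \Big\{ R_i^* + 2 \gamma_i\Big( \frac{T n_i}{s} \Big) + \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} + \kappa_2 \sqrt{\frac{s \, m}{T n_K}}.$$

The proof of the proposition follows from an argument similar to that given by Bartlett et al. (2002). We present a proof at the end of this section, since our setting is slightly different: each class receives a different number of independent samples. First, however, we complete the proof of Theorem 3 using the proposition.

Proof of Theorem 3. Let $i \in [K]$ be any class (not necessarily in $S$), and let $j \in S$ be the smallest class satisfying $j \ge i$. Then by construction of $S$, we know that

$$\gamma_i\Big( \frac{T n_i}{s} \Big) \le \gamma_j\Big( \frac{T n_j}{s} \Big) \le (1 + \lambda) \, \gamma_i\Big( \frac{T n_i}{s} \Big).$$

Thus we can lower bound the penalized risk of class $i$ as

$$R_i^* + 2(1 + \lambda) \, \gamma_i\Big( \frac{T n_i}{s} \Big) \ge R_j^* + 2 \gamma_j\Big( \frac{T n_j}{s} \Big),$$

where we used the nesting assumption B to conclude that $j \ge i$ implies $R_j^* \le R_i^*$. Now combining the above lower bound with the inequality in Proposition 2 yields that with probability at least $1 - 3 \kappa_1 \exp(-m)$,

$$R(f) \le \min_{i \in S} \Big\{ R_i^* + 2 \gamma_i\Big( \frac{T n_i}{s} \Big) + \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} + \kappa_2 \sqrt{\frac{s \, m}{T n_K}} \le \min_{i \in [K]} \Big\{ R_i^* + 2(1 + \lambda) \, \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\} + \kappa_2 \sqrt{\frac{s \log K}{2 T n_K}} + \kappa_2 \sqrt{\frac{s \, m}{T n_K}},$$

since $i \le K$ and $n_i \ge n_K$.

Proof of Proposition 2. To prove the proposition, we would like to control the probability

$$P\Big[ R(f) > \min_{i \in S} \Big\{ R_i^* + 2 \gamma_i\Big( \frac{T n_i}{s} \Big) + \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} + \epsilon \Big]$$

$$\le \underbrace{P\Big[ R(f) > \min_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) + \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\} + \frac{\epsilon}{2} \Big]}_{T_1}$$

$$+ \underbrace{P\Big[ \min_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) + \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\} > \min_{i \in S} \Big\{ R_i^* + 2 \gamma_i\Big( \frac{T n_i}{s} \Big) + \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} + \frac{\epsilon}{2} \Big]}_{T_2} \tag{16}$$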

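where the inequality follows from a union bound. We now bound the terms $T_1$ and $T_2$ separately. To bound the terms, we first observe that by the construction (12), the minimum over the penalized empirical risk is attained for the class $\hat{i}$. We thus simplify $T_1$ as

$$P\Big[ R(f) > \min_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) + \gamma_i\Big( \frac{T n_i}{s} \Big) \Big\} + \frac{\epsilon}{2} \Big] = P\Big[ R(f) > \hat{R}_{T n_{\hat{i}} / s}(f) + \gamma_{\hat{i}}\Big( \frac{T n_{\hat{i}}}{s} \Big) + \frac{\epsilon}{2} \Big] \le \kappa_1 \exp\Big( -\frac{T n_{\hat{i}} \epsilon^2}{s \kappa_2^2} \Big),$$

where the inequality follows by application of Assumption A(a). To bound $T_2$ in the sum (16), we define $f_i^* = \arg\min_{f \in \mathcal{F}_i} R(f)$ so that $R_i^* = R(f_i^*)$. Noting that the event in $T_2$ implies that

$$\max_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) - R_i^* - \gamma_i\Big( \frac{T n_i}{s} \Big) - \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} > \frac{\epsilon}{2},$$

we can use the union bound to see

$$T_2 \le P\Big[ \sup_{i \in S} \Big\{ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) - R_i^* - \gamma_i\Big( \frac{T n_i}{s} \Big) - \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big\} > \frac{\epsilon}{2} \Big]$$

$$\le \sum_{i \in S} P\Big[ \hat{R}_{T n_i / s}\big( A(i, T n_i / s) \big) - R_i^* - \gamma_i\Big( \frac{T n_i}{s} \Big) - \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} > \frac{\epsilon}{2} \Big]$$

$$\le \sum_{i \in S} P\Big[ \hat{R}_{T n_i / s}(f_i^*) - R(f_i^*) > \frac{\epsilon}{2} + \kappa_2 \sqrt{\frac{s \log i}{2 T n_i}} \Big],$$

where the final inequality uses Assumption A(b), which states that $\mathcal{A}$ outputs a $\gamma_i$-minimizer of the empirical risk. Now we can bound the deviations using Assumption A(d), since $f_i^*$ is non-random:

$$T_2 \le \sum_{i \in S} \kappa_1 \exp\Big( -\frac{T n_i \epsilon^2}{s \kappa_2^2} \Big) \exp(-2 \log i).$$

Setting $\epsilon = \kappa_2 \sqrt{s \, m / (T n_K)}$, we see that the first term in bounding $T_2$ reduces to $\exp(-m n_i / n_K) \le \exp(-m)$ since $n_i \ge n_K$. Then we get

$$T_2 \le \sum_{i \in S} \kappa_1 \exp(-m) \exp(-2 \log i) \le 2 \kappa_1 \exp(-m),$$

where the last step uses $\sum_{i=1}^\infty 1/i^2 = \pi^2 / 6 \le 2$. Finally, plugging the stated setting of $\epsilon$ into the bound on $T_1$ completes the proof.

Acknowledgements

In performing this research, AA was supported by a Microsoft Research Fellowship, and JCD was supported by the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program. AA and PB gratefully acknowledge the support of the NSF under award DMS.

References

H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), December.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48-77.

Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Mach. Learn., 47(2-3).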

A. R. Barron. Complexity regularization with application to artificial neural networks. In Nonparametric functional estimation and related topics. Kluwer Academic.

Andrew Barron, Lucien Birgé, and Pascal Massart. Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113.

P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3.

P. L. Bartlett, S. Boucheron, and G. Lugosi. Model selection and error estimation. Machine Learning, 48:85–113.

N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press.

R. M. Dudley. The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. Journal of Functional Analysis, 1.

R. M. Dudley. Central limit theorems for empirical measures. The Annals of Probability, 6(6).

S. Geman and C. R. Hwang. Nonparametric maximum likelihood estimation by the method of sieves. Annals of Statistics, 10.

T. L. Lai and Herbert Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4–22.

G. Lugosi and M. Wegkamp. Complexity regularization via localized random penalties. Annals of Statistics, 32(4).

C. L. Mallows. Some comments on $C_p$. Technometrics, 15(4).

P. Massart. Concentration inequalities and model selection. In J. Picard, editor, Ecole d'Eté de Probabilités de Saint-Flour XXXIII Series. Springer.

A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4).

David Pollard. Convergence of Stochastic Processes. Springer-Verlag.

Jorma Rissanen. A universal prior for integers and estimation by minimum description length. The Annals of Statistics, 11(2).

V. N. Vapnik and A. Ya. Chervonenkis. Theory of pattern recognition. Nauka, Moscow. In Russian.

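A Proof of Lemma 1

Following Auer et al. (2002), we show that the event in the lemma occurs with very low probability by breaking it up into smaller events more amenable to analysis. Recall that we are interested in controlling the probability of the event

$$
\bar R(i, n_i(s)) - \kappa_2\sqrt{\frac{\log t}{n_i(s)}} \le \bar R(i^*, n_{i^*}(s)) - \kappa_2\sqrt{\frac{\log t}{n_{i^*}(s)}}. \tag{17}
$$

For this bad event to happen, at least one of the following three events must happen:

$$
\hat R_{n_i(s)}(A(i, n_i(s))) - \inf_{f \in F_i} R(f) \le -\gamma_i(n_i(s)) - \kappa_2\sqrt{\frac{\log K}{n_i(s)}} - \kappa_2\sqrt{\frac{\log t}{n_i(s)}} \tag{18a}
$$

$$
\hat R_{n_{i^*}(s)}(A(i^*, n_{i^*}(s))) - \inf_{f \in F_{i^*}} R(f) \ge \gamma_{i^*}(n_{i^*}(s)) + \kappa_2\sqrt{\frac{\log K}{n_{i^*}(s)}} + \kappa_2\sqrt{\frac{\log t}{n_{i^*}(s)}} \tag{18b}
$$

$$
R_i^* + \gamma_i(Tn_i) \le R_{i^*}^* + \gamma_{i^*}(Tn_{i^*}) + 2\Big(\gamma_i(n_i(s)) + \kappa_2\sqrt{\frac{\log K}{n_i(s)}} + \kappa_2\sqrt{\frac{\log t}{n_i(s)}}\Big). \tag{18c}
$$

Temporarily use the shorthand $f_i = A(i, n_i(s))$ and $f_{i^*} = A(i^*, n_{i^*}(s))$. The relationship between Eqs. (18a)–(18c) and the event in (17) follows from the fact that if none of (18a)–(18c) occur, then

$$
\begin{aligned}
\bar R(i, n_i(s)) - \kappa_2\sqrt{\frac{\log t}{n_i(s)}}
&= \hat R_{n_i(s)}(f_i) + \gamma_i(Tn_i) - \gamma_i(n_i(s)) - \kappa_2\sqrt{\frac{\log K}{n_i(s)}} - \kappa_2\sqrt{\frac{\log t}{n_i(s)}} \\
&\overset{(18a)}{>} \inf_{f \in F_i} R(f) + \gamma_i(Tn_i) - 2\Big(\gamma_i(n_i(s)) + \kappa_2\sqrt{\frac{\log K}{n_i(s)}} + \kappa_2\sqrt{\frac{\log t}{n_i(s)}}\Big) \\
&\overset{(18c)}{>} \inf_{f \in F_{i^*}} R(f) + \gamma_{i^*}(Tn_{i^*}) + 2\Big(\gamma_i(n_i(s)) + \kappa_2\sqrt{\frac{\log K}{n_i(s)}} + \kappa_2\sqrt{\frac{\log t}{n_i(s)}}\Big) \\
&\qquad - 2\Big(\gamma_i(n_i(s)) + \kappa_2\sqrt{\frac{\log K}{n_i(s)}} + \kappa_2\sqrt{\frac{\log t}{n_i(s)}}\Big) \\
&\overset{(18b)}{>} \hat R_{n_{i^*}(s)}(f_{i^*}) + \gamma_{i^*}(Tn_{i^*}) - \gamma_{i^*}(n_{i^*}(s)) - \kappa_2\sqrt{\frac{\log K}{n_{i^*}(s)}} - \kappa_2\sqrt{\frac{\log t}{n_{i^*}(s)}} \\
&= \bar R(i^*, n_{i^*}(s)) - \kappa_2\sqrt{\frac{\log t}{n_{i^*}(s)}}.
\end{aligned}
$$

From the above string of inequalities, to show that the event (17) has low probability, we need simply show that each of (18a), (18b), and (18c) has low probability.

To prove that each of the bad events has low probability, we note the following consequences of Assumption A. Recall the definition of $f_i^*$ as the minimizer of $R(f)$ over the class $F_i$. Then by Assumption A(a),

$$
R(f_i^*) - \gamma_i(n) - \kappa_2\epsilon \le R(A(i, n)) - \gamma_i(n) - \kappa_2\epsilon < \hat R_n(A(i, n)),
$$

while Assumptions A(b) and A(d) imply

$$
\hat R_n(A(i, n)) \le \hat R_n(f_i^*) + \gamma_i(n) \le R(f_i^*) + \gamma_i(n) + \kappa_2\epsilon,
$$

each with probability at least $1 - \kappa_1\exp(-4n\epsilon^2)$. In particular, we see that the events (18a) and (18b) have low probability:

$$
P\Big[\hat R_{n_i(s)}(A(i, n_i(s))) - R(f_i^*) \le -\gamma_i(n_i(s)) - \kappa_2\sqrt{\frac{\log K}{n_i(s)}} - \kappa_2\sqrt{\frac{\log t}{n_i(s)}}\Big]
\le \kappa_1\exp\Big(-4n_i(s)\Big[\frac{\log K}{n_i(s)} + \frac{\log t}{n_i(s)}\Big]\Big) = \frac{\kappa_1}{(tK)^4},
$$

$$
P\Big[\hat R_{n_{i^*}(s)}(A(i^*, n_{i^*}(s))) - R_{i^*}^* \ge \gamma_{i^*}(n_{i^*}(s)) + \kappa_2\sqrt{\frac{\log K}{n_{i^*}(s)}} + \kappa_2\sqrt{\frac{\log t}{n_{i^*}(s)}}\Big]
\le \kappa_1\exp\Big(-4n_{i^*}(s)\Big[\frac{\log K}{n_{i^*}(s)} + \frac{\log t}{n_{i^*}(s)}\Big]\Big) = \frac{\kappa_1}{(tK)^4}.
$$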
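The arithmetic behind the $(tK)^{-4}$ bound can be checked numerically. The sketch below assumes only the tail form $\kappa_1\exp(-4n\epsilon^2)$ quoted above and takes the deviation $\epsilon = \sqrt{\log K/n} + \sqrt{\log t/n}$. The function names `deviation_bound` and `target_bound` are hypothetical helpers introduced for illustration, not part of the paper.

```python
import math


def deviation_bound(n, t, K, kappa1=1.0):
    """kappa1 * exp(-4 n eps^2) at eps = sqrt(log K / n) + sqrt(log t / n)."""
    eps = math.sqrt(math.log(K) / n) + math.sqrt(math.log(t) / n)
    return kappa1 * math.exp(-4.0 * n * eps ** 2)


def target_bound(t, K, kappa1=1.0):
    """The bound kappa1 / (t K)^4 stated for events (18a) and (18b)."""
    return kappa1 / (t * K) ** 4


# Since eps^2 >= (log K + log t) / n, the tail is at most (tK)^(-4)
# for every sample size n; the cross term only makes it smaller.
for n, t, K in [(5, 10, 3), (50, 1000, 8), (500, 10 ** 5, 20)]:
    print(n, t, K, deviation_bound(n, t, K) <= target_bound(t, K))
```

The sample size $n$ cancels in the exponent, which is why the resulting bound depends only on $t$ and $K$.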

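What remains is to show that for large enough $\tau_i(n)$, (18c) does not happen. Recalling the definition that $R_i^* + \gamma_i(Tn_i) = R_{i^*}^* + \gamma_{i^*}(Tn_{i^*}) + \Delta_i$, we see that for (18c) to fail it is sufficient that

$$
\Delta_i > 2\gamma_i(\tau_i(n)) + 2\kappa_2\sqrt{\frac{\log K}{\tau_i(n)}} + 2\kappa_2\sqrt{\frac{\log n}{\tau_i(n)}}.
$$

Let $x \wedge y := \min\{x, y\}$ and $x \vee y := \max\{x, y\}$. Since $\gamma_i(n) \le c_i n^{-\alpha_i}$, the above is satisfied when

$$
\frac{\Delta_i}{2} > c_i\,\tau_i(n)^{-(\alpha_i \wedge \frac{1}{2})} + \kappa_2\sqrt{\log K}\,\tau_i(n)^{-(\alpha_i \wedge \frac{1}{2})} + \kappa_2\sqrt{\log n}\,\tau_i(n)^{-(\alpha_i \wedge \frac{1}{2})}. \tag{19}
$$

We can solve (19) above and see immediately that if

$$
\tau_i(n) > \frac{2^{1/(\alpha_i \wedge \frac{1}{2})}\big(c_i + \kappa_2\sqrt{\log n} + \kappa_2\sqrt{\log K}\big)^{1/(\alpha_i \wedge \frac{1}{2})}}{\Delta_i^{1/(\alpha_i \wedge \frac{1}{2})}},
$$

then

$$
R_i^* + \gamma_i(Tn_i) > R_{i^*}^* + \gamma_{i^*}(Tn_{i^*}) + 2\Big(\gamma_i(\tau_i(n)) + \kappa_2\sqrt{\frac{\log K}{\tau_i(n)}} + \kappa_2\sqrt{\frac{\log n}{\tau_i(n)}}\Big).
$$

Thus the event in (18c) fails to occur, completing the proof of the lemma.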
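The final step can be sanity-checked numerically. The sketch below assumes a penalty of the form $c\,\tau^{-\alpha}$ and a gap $\Delta > 0$, and verifies that any $\tau$ above $\big(2(c + \kappa_2\sqrt{\log n} + \kappa_2\sqrt{\log K})/\Delta\big)^{1/\min(\alpha, 1/2)}$ makes the gap exceed twice the penalty plus the two deviation terms. The helper names, the example constants, and the clipping of the threshold at 1 (so that $\tau^{-\alpha} \le \tau^{-\min(\alpha, 1/2)}$) are assumptions made for illustration and are not taken from the paper.

```python
import math


def tau_threshold(delta, c, alpha, kappa2, K, n):
    """Threshold on tau obtained by solving the sufficient condition for tau."""
    beta = min(alpha, 0.5)
    total = c + kappa2 * math.sqrt(math.log(n)) + kappa2 * math.sqrt(math.log(K))
    # Clip at 1 (an added assumption) so that tau**(-alpha) <= tau**(-beta).
    return max(1.0, (2.0 * total / delta) ** (1.0 / beta))


def slack(tau, delta, c, alpha, kappa2, K, n):
    """delta minus twice (penalty + two deviation terms); positive means (18c) fails."""
    penalty = c * tau ** (-alpha)
    dev_k = kappa2 * math.sqrt(math.log(K) / tau)
    dev_n = kappa2 * math.sqrt(math.log(n) / tau)
    return delta - 2.0 * (penalty + dev_k + dev_n)


params = dict(delta=0.1, c=1.0, alpha=0.25, kappa2=1.0, K=10, n=1000)
tau = tau_threshold(**params)
print(slack(1.01 * tau, **params) > 0)
```

The exponent $\min(\alpha, 1/2)$ appears because the two deviation terms decay like $\tau^{-1/2}$ regardless of how fast the penalty decays.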


More information

A Utilitarian Approach of the Rawls s Difference Principle

A Utilitarian Approach of the Rawls s Difference Principle 1 A Utltaran Approach of the Rawls s Dfference Prncple Hyeok Yong Kwon a,1, Hang Keun Ryu b,2 a Department of Poltcal Scence, Korea Unversty, Seoul, Korea, 136-701 b Department of Economcs, Chung Ang Unversty,

More information

Linear Combinations of Random Variables and Sampling (100 points)

Linear Combinations of Random Variables and Sampling (100 points) Economcs 30330: Statstcs for Economcs Problem Set 6 Unversty of Notre Dame Instructor: Julo Garín Sprng 2012 Lnear Combnatons of Random Varables and Samplng 100 ponts 1. Four-part problem. Go get some

More information

A Comparison of Statistical Methods in Interrupted Time Series Analysis to Estimate an Intervention Effect

A Comparison of Statistical Methods in Interrupted Time Series Analysis to Estimate an Intervention Effect Transport and Road Safety (TARS) Research Joanna Wang A Comparson of Statstcal Methods n Interrupted Tme Seres Analyss to Estmate an Interventon Effect Research Fellow at Transport & Road Safety (TARS)

More information

Jeffrey Ely. October 7, This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Jeffrey Ely. October 7, This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. October 7, 2012 Ths work s lcensed under the Creatve Commons Attrbuton-NonCommercal-ShareAlke 3.0 Lcense. Recap We saw last tme that any standard of socal welfare s problematc n a precse sense. If we want

More information

On the Moments of the Traces of Unitary and Orthogonal Random Matrices

On the Moments of the Traces of Unitary and Orthogonal Random Matrices Proceedngs of Insttute of Mathematcs of NAS of Ukrane 2004 Vol. 50 Part 3 1207 1213 On the Moments of the Traces of Untary and Orthogonal Random Matrces Vladmr VASILCHU B. Verkn Insttute for Low Temperature

More information

Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization

Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization Dscrete Event Dynamc Systems: Theory and Applcatons, 10, 51 70, 000. c 000 Kluwer Academc Publshers, Boston. Manufactured n The Netherlands. Smulaton Budget Allocaton for Further Enhancng the Effcency

More information

Supplementary material for Non-conjugate Variational Message Passing for Multinomial and Binary Regression

Supplementary material for Non-conjugate Variational Message Passing for Multinomial and Binary Regression Supplementary materal for Non-conjugate Varatonal Message Passng for Multnomal and Bnary Regresson October 9, 011 1 Alternatve dervaton We wll focus on a partcular factor f a and varable x, wth the am

More information

Quadratic Games. First version: February 24, 2017 This version: August 3, Abstract

Quadratic Games. First version: February 24, 2017 This version: August 3, Abstract Quadratc Games Ncolas S. Lambert Gorgo Martn Mchael Ostrovsky Frst verson: February 24, 2017 Ths verson: August 3, 2018 Abstract We study general quadratc games wth multdmensonal actons, stochastc payoff

More information

Teaching Note on Factor Model with a View --- A tutorial. This version: May 15, Prepared by Zhi Da *

Teaching Note on Factor Model with a View --- A tutorial. This version: May 15, Prepared by Zhi Da * Copyrght by Zh Da and Rav Jagannathan Teachng Note on For Model th a Ve --- A tutoral Ths verson: May 5, 2005 Prepared by Zh Da * Ths tutoral demonstrates ho to ncorporate economc ves n optmal asset allocaton

More information

Still Simpler Way of Introducing Interior-Point method for Linear Programming

Still Simpler Way of Introducing Interior-Point method for Linear Programming Stll Smpler Way of Introducng Interor-Pont method for Lnear Programmng Sanjeev Saxena Dept. of Computer Scence and Engneerng, Indan Insttute of Technology, Kanpur, INDIA-08 06 October 9, 05 Abstract Lnear

More information

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge Dynamc Analyss of Sharng of Agents wth Heterogeneous Kazuyo Sato Akra Namatame Dept. of Computer Scence Natonal Defense Academy Yokosuka 39-8686 JAPAN E-mal {g40045 nama} @nda.ac.jp Abstract In ths paper

More information

FORD MOTOR CREDIT COMPANY SUGGESTED ANSWERS. Richard M. Levich. New York University Stern School of Business. Revised, February 1999

FORD MOTOR CREDIT COMPANY SUGGESTED ANSWERS. Richard M. Levich. New York University Stern School of Business. Revised, February 1999 FORD MOTOR CREDIT COMPANY SUGGESTED ANSWERS by Rchard M. Levch New York Unversty Stern School of Busness Revsed, February 1999 1 SETTING UP THE PROBLEM The bond s beng sold to Swss nvestors for a prce

More information

arxiv: v1 [math.nt] 29 Oct 2015

arxiv: v1 [math.nt] 29 Oct 2015 A DIGITAL BINOMIAL THEOREM FOR SHEFFER SEQUENCES TOUFIK MANSOUR AND HIEU D. NGUYEN arxv:1510.08529v1 [math.nt] 29 Oct 2015 Abstract. We extend the dgtal bnomal theorem to Sheffer polynomal sequences by

More information

A Single-Product Inventory Model for Multiple Demand Classes 1

A Single-Product Inventory Model for Multiple Demand Classes 1 A Sngle-Product Inventory Model for Multple Demand Classes Hasan Arslan, 2 Stephen C. Graves, 3 and Thomas Roemer 4 March 5, 2005 Abstract We consder a sngle-product nventory system that serves multple

More information

Physics 4A. Error Analysis or Experimental Uncertainty. Error

Physics 4A. Error Analysis or Experimental Uncertainty. Error Physcs 4A Error Analyss or Expermental Uncertanty Slde Slde 2 Slde 3 Slde 4 Slde 5 Slde 6 Slde 7 Slde 8 Slde 9 Slde 0 Slde Slde 2 Slde 3 Slde 4 Slde 5 Slde 6 Slde 7 Slde 8 Slde 9 Slde 20 Slde 2 Error n

More information