Doubly Random Parallel Stochastic Algorithms for Large Scale Learning


Anonymous Author(s)
Affiliation
Address
Email

Abstract

We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We call the algorithm random parallel because it utilizes multiple processors to operate on a randomly chosen subset of blocks of the feature vector. We call the algorithm parallel stochastic because processors choose elements of the training set randomly and independently. Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first attempt at a methodology that is parallel in both the selection of blocks and the selection of elements of the training set. In RAPSA, processors utilize the randomly chosen functions to compute the stochastic gradient component associated with a randomly chosen block. The technical contribution of this paper is to show that this minimally coordinated algorithm converges to the optimal classifier when the training objective is convex. In particular, we show that: (i) when using decreasing stepsizes, RAPSA converges almost surely over the random choice of blocks and functions; (ii) when using constant stepsizes, convergence is to a neighborhood of optimality with a rate that is linear in expectation. An accelerated version (ARAPSA) is further proposed by leveraging ideas of stochastic quasi-Newton optimization. Both RAPSA and ARAPSA are numerically evaluated on the MNIST digit recognition problem.

1 Introduction

Learning is often formulated as an optimization problem that finds a classifier x^* \in R^p that minimizes the average of a loss function across the elements of a training set. For a precise definition, consider a training set with N elements and let f_n : R^p \to R be a convex loss function associated with the n-th element of the training set. The optimal classifier x^* \in R^p is defined as the minimizer of the average cost F(x) := (1/N) \sum_{n=1}^N f_n(x),

    x^* := \argmin_x F(x) := \argmin_x \frac{1}{N} \sum_{n=1}^N f_n(x).    (1)

Problems such as support vector machines, logistic regression, and matrix completion can be put in the form of problem (1). In this paper we are interested in large scale problems where both the number of features p and the number of elements N in the training set are very large, which arise, e.g., in text [1], image [2], and genomic [3] processing. When N and p are large, the parallel processing architecture in Figure 1 becomes of interest. In this architecture, features are divided into B blocks, each of which contains p_b \ll p features, and a set of I \leq B processors work in parallel on randomly chosen feature blocks while using a stochastic subset of elements of the training set.
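To make the setup concrete, the following minimal Python sketch instantiates the finite-sum objective in (1); the synthetic data and the choice of least-squares losses are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Synthetic instance of problem (1) with least-squares losses
# f_n(x) = 0.5 * (z_n' x - y_n)^2; all data here is illustrative.
rng = np.random.default_rng(1)
N, p = 1000, 50
Z = rng.normal(size=(N, p))                        # feature vectors z_n
y = Z @ rng.normal(size=p) + 0.1 * rng.normal(size=N)

def F(x):
    """Average cost F(x) = (1/N) * sum_n f_n(x) as in (1)."""
    return 0.5 * np.mean((Z @ x - y) ** 2)

x_star = np.linalg.lstsq(Z, y, rcond=None)[0]      # minimizer of F
print(F(np.zeros(p)) - F(x_star))                  # optimality gap at x = 0
```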

Figure 1: Random parallel stochastic algorithm (RAPSA). At each iteration, processor P_i picks a random block from the set {x_1, ..., x_B} and a random set of functions from the training set {f_1, ..., f_N}. The functions drawn are used to evaluate a stochastic gradient component associated with the chosen block. RAPSA is shown here to converge to the optimal argument x^* of (1).

In the schematic shown, Processor P_1 fetches functions f_1 and f_n to operate on block x_b, and Processor P_i fetches other functions to operate on another block. Other processors select other elements of the training set and other blocks, with the majority of blocks remaining unchanged and the majority of functions remaining unused. The blocks chosen for update and the functions fetched for determination of block updates are selected independently at random in subsequent slots.

Problems that operate on blocks of the feature vectors or on subsets of the training set, but not on both blocks and subsets, exist. Block coordinate descent (BCD) is the generic name for methods in which the variable space is divided into blocks that are processed separately. Early versions operate by cyclically updating all coordinates at each step [4, 5], while more recent parallelized versions of coordinate descent have been developed to accelerate convergence of BCD [6-8]. Closer to the architecture in Figure 1, methods in which subsets of blocks are selected at random have also been proposed [9]. BCD, whether serial, parallel, or random, can handle cases where the parameter dimension p is large but requires access to all training samples at each iteration.

Methods that utilize a subset of functions are known by the generic name of stochastic approximation and rely on the use of stochastic gradients. In plain stochastic gradient descent (SGD), the gradient of the aggregate function is estimated by the gradient of a randomly chosen function f_n [10]. Since convergence of SGD is slow more often than not, various recent developments have been aimed at accelerating convergence. These attempts include methodologies to reduce the variance of stochastic gradients (e.g., SAG, SAGA, SVRG) and the use of ideas from quasi-Newton optimization to handle difficult curvature profiles [11, 12]. More pertinent to the work considered here are the use of cyclic block SGD updates [13] and the exploitation of sparsity properties of feature vectors to allow for parallel updates [14]. These methods are suitable when the number of elements N in the training set is large, but they do not allow for parallel feature processing unless parallelism is inherent to the problem's structure.

The random parallel stochastic algorithm (RAPSA) proposed in this paper represents the first effort at implementing the architecture in Figure 1, which randomizes over both features and sample functions. In RAPSA, the functions fetched by a processor are used to compute the stochastic gradient component associated with a randomly chosen block (Section 2). The processors do not coordinate in either choice except to avoid selection of the same block. Our main technical contribution is to show that RAPSA iterates converge to the optimal classifier x^* when using a sequence of decreasing stepsizes, and to a neighborhood of the optimal classifier when using constant stepsizes (Section 2.1). In the latter case, we further show that the rate of convergence to this optimality neighborhood is linear in expectation. These results are interesting because only a subset of features is updated per iteration and the functions used to update different blocks are, in general, different.
We further propose an acceleration of RAPSA in which processors also update and exploit a curvature estimation matrix associated with each block (Section 3). This accelerated RAPSA (ARAPSA) algorithm leverages ideas of stochastic quasi-Newton methods [11, 12] and results in faster convergence. Both RAPSA and ARAPSA are numerically evaluated on the MNIST digit recognition problem (Section 4).

2 Random Parallel Stochastic Algorithm (RAPSA)

We consider a more general formulation of (1) in which the number N of functions f_n is not necessarily finite. Introduce then a random variable \theta \in \Theta \subset R^q that determines the choice of the random smooth convex function f(\cdot, \theta) : R^p \to R. We consider the problem of minimizing the expectation of the random functions F(x) := E_\theta[f(x, \theta)],

    x^* := \argmin_x F(x) := \argmin_x E_\theta[ f(x, \theta) ].    (2)

Problem (1) is a particular case of (2) in which each of the functions f_n is drawn with probability 1/N. We refer to f(\cdot, \theta) as instantaneous functions and to F(x) as the average function.

RAPSA utilizes I processors to update a random subset of blocks of the variable x, with each of the blocks relying on a subset of randomly and independently chosen elements of the training set; see Figure 1. Formally, decompose the variable x into B blocks to write x = [x_1; ...; x_B], where block b has length p_b so that x_b \in R^{p_b}. At iteration t, processor i selects a random block index b_i^t for updating and a random subset \Theta_i^t of L instantaneous functions. It then uses these instantaneous functions to determine stochastic gradient components for the subset of variables x_b = x_{b_i^t} as an average of the components of the gradients of the functions f(x^t, \theta) for \theta \in \Theta_i^t,

    \nabla_{x_b} f(x^t, \Theta_i^t) = \frac{1}{L} \sum_{\theta \in \Theta_i^t} \nabla_{x_b} f(x^t, \theta),    b = b_i^t.    (3)

The stochastic gradient block in (3) is then modulated by a possibly time-varying stepsize \gamma^t and used by processor i to update the block x_b = x_{b_i^t},

    x_b^{t+1} = x_b^t - \gamma^t \nabla_{x_b} f(x^t, \Theta_i^t),    b = b_i^t.    (4)

RAPSA is defined by the joint implementation of (3) and (4) across all I processors. The selection of blocks is coordinated so that no two processors operate on the same block. The selection of elements of the training set is uncoordinated across processors. The fact that at any point in time a random subset of blocks is being updated utilizing a random subset of elements of the training set means that RAPSA requires almost no coordination between processors. The contribution of this paper is to show that this very lean algorithm converges to the optimal argument x^*, as we show in the following section.
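Although the paper targets a multi-processor architecture, the random block and sample selections of (3)-(4) are easy to emulate serially. The following Python sketch is a minimal single-process simulation under stated assumptions: grad_fn and all parameter names are illustrative, not part of the paper.

```python
import numpy as np

def rapsa(grad_fn, x0, N, B, I, L, gamma, T, seed=0):
    """Serial simulation of RAPSA, cf. (3)-(4): at each iteration, I of the
    B blocks are drawn without replacement, and each selected block is
    updated with an independent minibatch of L sample indices.

    grad_fn(x, n) is an assumed helper returning the gradient of the
    instantaneous loss f_n at x; gamma may be a constant or a function of t."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    blocks = np.array_split(np.arange(x.size), B)  # uniform block partition
    for t in range(T):
        step = gamma(t) if callable(gamma) else gamma
        for b in rng.choice(B, size=I, replace=False):
            batch = rng.integers(N, size=L)        # Theta_i^t, drawn independently
            g = np.mean([grad_fn(x, n) for n in batch], axis=0)
            x[blocks[b]] -= step * g[blocks[b]]    # update only block b, cf. (4)
    return x
```

A true parallel implementation would hand each drawn block to a different processor; the sketch only reproduces the statistics of the coordinate updates (it also wastefully computes the full sample gradient before slicing, which a block-aware grad_fn would avoid).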

2.1 Convergence Analysis

We show in this section that the sequence of objective function values F(x^t) generated by RAPSA approaches the optimal objective function value F(x^*). In establishing this result, we define the set S^t containing the blocks that are updated at step t, with associated indices I^t \subset {1, ..., B}. Note that the components of the set S^t are chosen uniformly at random from the set of blocks {x_1, ..., x_B}. The definition of S^t is such that the time evolution of the RAPSA iterates can be written as [cf. (4)]

    x_i^{t+1} = x_i^t - \gamma^t \nabla_{x_i} f(x^t, \Theta_i^t)    for all x_i \in S^t,    (5)

while the rest of the blocks remain unchanged, i.e., x_i^{t+1} = x_i^t for x_i \notin S^t. Since the number of updated blocks is equal to the number of processors, the ratio of updated blocks is r := |I^t|/B = I/B.

To prove convergence of RAPSA, we require the following assumptions.

Assumption 1. The instantaneous objective functions f(x, \theta) are differentiable and the average function F(x) is strongly convex with parameter m > 0.

Assumption 2. The average objective function gradients \nabla F(x) are Lipschitz continuous with respect to the Euclidean norm with parameter M, i.e., for all x, \hat{x} \in R^p it holds

    \| \nabla F(x) - \nabla F(\hat{x}) \| \leq M \| x - \hat{x} \|.    (6)

Assumption 3. The second moment of the norm of the stochastic gradient is bounded for all x, i.e., there exists a constant K such that for all variables x it holds

    E_\theta [ \| \nabla f(x^t, \theta^t) \|^2 \,|\, x^t ] \leq K.    (7)

Notice that Assumption 1 only enforces strong convexity of the average function F, while the instantaneous functions f may not even be convex. Further, notice that since the instantaneous functions f are differentiable, the average function F is also differentiable. The Lipschitz continuity of the average function gradients \nabla F is customary in proving objective function convergence for descent algorithms. The restriction imposed by Assumption 3 is a standard condition in the stochastic approximation literature [10], its intent being to limit the variance of the stochastic gradients [15].

Our first result comes in the form of an expected descent lemma that relates the expected difference of subsequent iterates to the gradient of the average function.

Lemma 1. Consider the random parallel stochastic algorithm defined in (3)-(5). Recall the definition of the set of updated blocks I^t, whose elements are randomly chosen from the total of B blocks. Define F^t as a sigma-algebra that measures the history of the system up until time t. Then, the expected value of the difference x^{t+1} - x^t with respect to the random set I^t given F^t is

    E_{I^t}[ x^{t+1} - x^t \,|\, F^t ] = -r \gamma^t \nabla f(x^t, \Theta^t).    (8)

Moreover, the expected value of the squared norm \|x^{t+1} - x^t\|^2 with respect to the random set S^t given F^t can be simplified as

    E_{I^t}[ \| x^{t+1} - x^t \|^2 \,|\, F^t ] = r (\gamma^t)^2 \| \nabla f(x^t, \Theta^t) \|^2.    (9)

Notice that in the regular stochastic gradient descent method, the difference of two consecutive iterates x^{t+1} - x^t is equal to the stochastic gradient \nabla f(x^t, \Theta^t) times the stepsize -\gamma^t. According to the first result in Lemma 1, the expected value of this difference with respect to the random set of blocks I^t is the same as for SGD, except that it is multiplied by the fraction of updated blocks r. The expression in (9) shows the same relation for the expected value of the squared difference \|x^{t+1} - x^t\|^2. These relationships confirm that in expectation RAPSA behaves as SGD, which allows us to establish the global convergence of RAPSA.

Proposition 1. Consider the random parallel stochastic algorithm defined in (3)-(5). If Assumptions 1-3 hold true, then the objective function error sequence F(x^t) - F(x^*) satisfies

    E[ F(x^{t+1}) - F(x^*) \,|\, F^t ] \leq ( 1 - 2 m r \gamma^t )( F(x^t) - F(x^*) ) + \frac{r M K (\gamma^t)^2}{2}.    (10)

Proposition 1 leads to a supermartingale relationship for the sequence of objective function errors F(x^t) - F(x^*). In the following theorem we show that if the sequence of stepsizes satisfies the standard stochastic approximation diminishing stepsize rules (non-summable and squared summable), the sequence of objective function errors F(x^t) - F(x^*) converges to null almost surely. Considering the strong convexity assumption, this result implies almost sure convergence of the sequence \|x^t - x^*\|^2 to null.

Theorem 1. Consider the random parallel stochastic algorithm defined in (3)-(5). If Assumptions 1-3 hold true and the sequence of stepsizes is non-summable, \sum_{t=0}^\infty \gamma^t = \infty, and square summable, \sum_{t=0}^\infty (\gamma^t)^2 < \infty, then the sequence of variables x^t generated by RAPSA converges almost surely to the optimal argument x^*,

    \lim_{t \to \infty} \| x^t - x^* \|^2 = 0    a.s.    (11)

Moreover, if the stepsize is defined as \gamma^t := \gamma^0 T^0 / (t + T^0) and the stepsize parameters are chosen such that 2 m r \gamma^0 T^0 > 1, then the expected average function error E[F(x^t) - F(x^*)] converges to null at least with a sublinear convergence rate of order O(1/t),

    E[ F(x^t) - F(x^*) ] \leq \frac{C}{t + T^0},    (12)

where the constant C is defined as

    C = \max\left\{ \frac{r M K (\gamma^0 T^0)^2}{4 m r \gamma^0 T^0 - 2},\; T^0 ( F(x^0) - F(x^*) ) \right\}.    (13)

(The expectation in (12), and throughout the subsequent convergence rate analysis, is taken with respect to the algorithm history F^0, which contains all randomness in both \Theta^t and I^t for all t \geq 0.)
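The stepsize rule and the O(1/t) guarantee of Theorem 1 are easy to evaluate numerically. Below is a minimal helper under the assumption 2*m*r*gamma0*T0 > 1; all constants are illustrative inputs rather than values from the paper.

```python
def gamma_schedule(t, gamma0, T0):
    """Diminishing stepsize gamma^t = gamma^0 * T^0 / (t + T^0):
    non-summable but square summable, as required by Theorem 1."""
    return gamma0 * T0 / (t + T0)

def rate_bound(t, m, M, K, r, gamma0, T0, initial_gap):
    """Right hand side of (12)-(13); only valid when 2*m*r*gamma0*T0 > 1."""
    C = max(r * M * K * (gamma0 * T0) ** 2 / (4 * m * r * gamma0 * T0 - 2),
            T0 * initial_gap)
    return C / (t + T0)
```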

The result in Theorem 1 shows that when the sequence of stepsizes diminishes as \gamma^t = \gamma^0 T^0 / (t + T^0), the sequence of average objective function values F(x^t) converges to the optimal objective value F(x^*) with probability 1. Further, the rate of convergence in expectation is at least of order O(1/t). Diminishing stepsizes are useful when exact convergence is required; however, when we are interested in a specific accuracy \epsilon, the more efficient choice is a constant stepsize. In the following theorem we study the convergence properties of RAPSA for a constant stepsize \gamma^t = \gamma.

Theorem 2. Consider the random parallel stochastic algorithm defined in (3) and (5). If Assumptions 1-3 hold true and the stepsize is constant, \gamma^t = \gamma, then the sequence of variables x^t generated by RAPSA converges almost surely to a neighborhood of the optimal argument x^* as

    \liminf_{t \to \infty} F(x^t) - F(x^*) \leq \frac{\gamma M K}{4m}    a.s.    (14)

Moreover, if the constant stepsize \gamma is chosen such that 2 m \gamma r < 1, then the expected average function value error E[F(x^t) - F(x^*)] converges linearly to an error bound as

    E[ F(x^t) - F(x^*) ] \leq ( 1 - 2 m \gamma r )^t ( F(x^0) - F(x^*) ) + \frac{\gamma M K}{4m}.    (15)

Notice that according to the result in (15) there exists a trade-off between accuracy and speed of convergence. Decreasing the constant stepsize \gamma leads to a smaller error bound \gamma M K / (4m) and a more accurate convergence, while the linear convergence constant 1 - 2 m \gamma r increases and the convergence rate becomes slower. Further, note that the error of convergence \gamma M K / (4m) is independent of the ratio of updated blocks r, while the constant of linear convergence 1 - 2 m \gamma r depends on r. Therefore, updating a fraction of the blocks at each iteration decreases the speed of convergence for RAPSA relative to SGD, which updates all of the blocks; however, both algorithms reach the same accuracy.

To achieve accuracy \epsilon, the sum of the two terms on the right hand side of (15) should be smaller than \epsilon. Consider \phi as a positive constant that is strictly smaller than 1, i.e., 0 < \phi < 1. Then we want to have

    \frac{\gamma M K}{4m} \leq \phi \epsilon,    ( 1 - 2 m \gamma r )^t ( F(x^0) - F(x^*) ) \leq (1 - \phi) \epsilon.    (16)

Therefore, to satisfy the first condition in (16) we set the stepsize as \gamma = 4 m \phi \epsilon / (M K). Applying this substitution to the second inequality in (16) and considering the inequality a + \ln(1 - a) < 0 for 0 < a < 1, we obtain that

    t \geq \frac{M K}{8 m^2 r \phi \epsilon} \ln\left( \frac{F(x^0) - F(x^*)}{(1 - \phi) \epsilon} \right).    (17)

The lower bound in (17) shows the minimum number of iterations required for RAPSA to achieve accuracy \epsilon.
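The stepsize choice and iteration count implied by (16)-(17) can be computed directly. A minimal helper follows, with all problem constants treated as illustrative inputs (0 < phi < 1):

```python
import math

def constant_stepsize_budget(m, M, K, r, eps, phi, initial_gap):
    """Tuning implied by (16)-(17): choose gamma so the asymptotic error
    gamma*M*K/(4m) equals phi*eps, then bound the iterations needed to push
    the transient term below (1 - phi)*eps."""
    gamma = 4 * m * phi * eps / (M * K)
    t_min = M * K / (8 * m ** 2 * r * phi * eps) * \
        math.log(initial_gap / ((1 - phi) * eps))
    return gamma, math.ceil(t_min)
```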

3 Accelerated Random Parallel Stochastic Algorithm (ARAPSA)

As we mentioned in Section 1, RAPSA operates on first-order information, which may lead to slow convergence in ill-conditioned problems. We introduce Accelerated RAPSA (ARAPSA) as a parallel doubly stochastic algorithm that incorporates second-order information of the objective by separately approximating the function curvature for different blocks. We do this by implementing the oLBFGS algorithm for the different blocks of the variable x. Define \hat{B}_i^t as an approximation of the Hessian inverse of the objective function corresponding to the block x_i. The update of ARAPSA is defined as the multiplication of the RAPSA descent direction by \hat{B}_i^t, i.e.,

    x_i^{t+1} = x_i^t - \gamma^t \hat{B}_i^t \nabla_{x_i} f(x^t, \Theta_i^t)    for all x_i \in S^t.    (18)

We next detail how to properly specify the block approximate Hessian inverse \hat{B}_i^t so that it behaves in a manner comparable to the true Hessian inverse. To do so, define for each block coordinate x_i at step t the variable variation v_i^t and the stochastic gradient variation \hat{r}_i^t as

    v_i^t = x_i^{t+1} - x_i^t,    \hat{r}_i^t = \nabla_{x_i} f(x^{t+1}, \Theta_i^t) - \nabla_{x_i} f(x^t, \Theta_i^t).    (19)

Observe that the stochastic gradient variation \hat{r}_i^t is defined as the difference of stochastic gradients at times t + 1 and t corresponding to the block x_i for a common set of realizations \Theta_i^t. The term \nabla_{x_i} f(x^t, \Theta_i^t) is the same stochastic gradient used at time t in (18), while \nabla_{x_i} f(x^{t+1}, \Theta_i^t) is computed only to determine the stochastic gradient variation \hat{r}_i^t. An alternative and perhaps more natural definition for the stochastic gradient variation is \nabla_{x_i} f(x^{t+1}, \Theta_i^{t+1}) - \nabla_{x_i} f(x^t, \Theta_i^t). However, as pointed out in [11], this formulation is insufficient for establishing the convergence of stochastic quasi-Newton methods.

We proceed to develop a block-coordinate quasi-Newton method by first noting an important property of the true Hessian, and we design our approximation scheme to satisfy this property. In particular, observe that when the iterates x_i^t and x_i^{t+1} are close to each other, the true block Hessian H_i^t satisfies the block secant condition H_i^t v_i^t = \hat{r}_i^t, so that its inverse satisfies (H_i^t)^{-1} \hat{r}_i^t = v_i^t. The secant condition may be interpreted as stating that the stochastic gradient of a quadratic approximation of the objective function evaluated at the next iteration agrees with the stochastic gradient at the current iteration. We select a Hessian inverse approximation matrix associated with block x_i such that it satisfies the secant condition \hat{B}_i^{t+1} \hat{r}_i^t = v_i^t, and thus behaves in a manner comparable to the true block Hessian inverse.

The oLBFGS Hessian inverse update rule maintains the secant condition at each iteration by using the information of the last \tau pairs of variable and stochastic gradient variations {v_i^u, \hat{r}_i^u}_{u=t-\tau}^{t-1}. To state the update rule of oLBFGS for revising the Hessian inverse approximation matrices of the blocks, define for each block i the initial matrix \hat{B}_i^{t,0} := \eta_i^t I, where the constant \eta_i^t for t > 0 is given by

    \eta_i^t := \frac{ (v_i^{t-1})^T \hat{r}_i^{t-1} }{ \| \hat{r}_i^{t-1} \|^2 },    (20)

while the initial value is \eta_i^0 = 1. The matrix \hat{B}_i^{t,0} is the initial approximation of the Hessian inverse associated with block x_i. The approximate matrix \hat{B}_i^t is computed by updating the initial matrix \hat{B}_i^{t,0} using the last \tau pairs of curvature information {v_i^u, \hat{r}_i^u}_{u=t-\tau}^{t-1}. We define the approximate Hessian inverse \hat{B}_i^t = \hat{B}_i^{t,\tau} corresponding to block x_i at step t as the outcome of \tau recursive applications of the update

    \hat{B}_i^{t,u+1} = ( \hat{Z}_i^{t-\tau+u} )^T \hat{B}_i^{t,u} ( \hat{Z}_i^{t-\tau+u} ) + \hat{\rho}_i^{t-\tau+u} ( v_i^{t-\tau+u} )( v_i^{t-\tau+u} )^T,    (21)

where the constants \hat{\rho}_i^{t-\tau+u} and matrices \hat{Z}_i^{t-\tau+u} in (21) for u = 0, ..., \tau - 1 are defined as

    \hat{\rho}_i^{t-\tau+u} = \frac{1}{ ( v_i^{t-\tau+u} )^T \hat{r}_i^{t-\tau+u} }    and    \hat{Z}_i^{t-\tau+u} = I - \hat{\rho}_i^{t-\tau+u} \hat{r}_i^{t-\tau+u} ( v_i^{t-\tau+u} )^T.    (22)

The computation cost of forming \hat{B}_i^t through (21) is on the order of O(p_b^2) operations; however, the update in (18) only requires the descent direction \hat{d}_i^t := \hat{B}_i^t \nabla_{x_i} f(x^t, \Theta_i^t). The work in [16] introduces an efficient implementation of the product \hat{B}_i^t \nabla_{x_i} f(x^t, \Theta_i^t) that requires computation complexity of order O(\tau p_b). We use the same idea for computing the descent direction of ARAPSA for each block; more details are provided in the supplementary material. Therefore, the computation complexity of updating each block for ARAPSA is on the order of O(\tau p_b), while RAPSA requires O(p_b) operations. On the other hand, ARAPSA accelerates the convergence of RAPSA by incorporating second-order information of the objective function in the block updates, as may be observed in the numerical analyses provided in Section 4.
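A minimal Python sketch of one ARAPSA block update, combining (18)-(19): grad_block and two_loop are assumed helpers (the latter standing in for the recursion detailed in the supplementary material), and all names are illustrative.

```python
def arapsa_block_update(x, blk, batch, gamma, memory, grad_block, two_loop, tau=10):
    """One ARAPSA update (18)-(19) for the coordinates in blk.

    grad_block(x, batch, blk) is an assumed helper returning the minibatch
    gradient (3) restricted to blk; two_loop(g, memory) applies the
    inverse-Hessian approximation B_hat. memory holds the last tau
    (v, r_hat) pairs for this block, ordered oldest first."""
    x_old = x[blk].copy()
    g_old = grad_block(x, batch, blk)                  # gradient at x^t, samples Theta^t
    x[blk] = x_old - gamma * two_loop(g_old, memory)   # update (18)
    g_new = grad_block(x, batch, blk)                  # same Theta^t at x^{t+1}, cf. (19)
    memory.append((x[blk] - x_old, g_new - g_old))     # curvature pair (v^t, r_hat^t)
    if len(memory) > tau:
        memory.pop(0)                                  # keep only the last tau pairs
```

The one subtlety the sketch preserves is that the same minibatch is reused at x^{t+1} to form \hat{r}_i^t, per the discussion following (19).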
4 Numerical analysis

In this section we study the practical performance of the doubly stochastic approximation algorithms developed in Sections 2 and 3 by considering the problem of developing an automated decision system to distinguish between distinct hand-written digits. For that purpose, let z \in R^p be a feature vector encoding the pixel intensities of an image, and let y \in {-1, 1} be an indicator variable of whether the image contains the digit 0 or 8, in which case the binary indicator is respectively y = -1 or y = 1. We model the task of learning a hand-written digit detector as a logistic regression problem, where one aims to train a classifier x \in R^p to determine the relationship between feature vectors z_n \in R^p and their associated labels y_n \in {-1, 1} for n = 1, ..., N. The instantaneous functions f_n in (1) may be written as the log-likelihood of a generalized linear model of the odds ratio of whether the label is y_n = 1 or y_n = -1. The empirical risk minimization associated with the training set T = {(z_n, y_n)}_{n=1}^N is to find x^* as the maximum likelihood estimate,

    x^* := \argmin_{x \in R^p} F(x) = \frac{\lambda}{2} \| x \|^2 + \frac{1}{N} \sum_{n=1}^N \log( 1 + \exp( -y_n x^T z_n ) ).    (23)
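The instantaneous gradient used by RAPSA and ARAPSA for this model follows directly from (23). A minimal sketch, with the regularizer folded into each instantaneous function for illustration:

```python
import numpy as np

def logistic_grad(x, z, y, lam):
    """Gradient of a regularized logistic instantaneous loss,
    f_n(x) = (lam/2)*||x||^2 + log(1 + exp(-y * x'z)), matching (23) with
    the regularizer split across samples for illustration; y is +/-1."""
    s = -y / (1.0 + np.exp(y * np.dot(x, z)))   # derivative of log(1 + e^{-y u})
    return lam * x + s * z
```

Plugged in as grad_fn in the earlier RAPSA sketch, this is the per-sample gradient a processor would slice to its assigned block.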

We use the MNIST dataset [17], in which the feature vectors z_n \in R^p are p = 28 x 28 = 784 pixel images whose values are recorded as intensities, or elements of the unit interval [0, 1]. Considered here is the subset associated with digits 0 and 8, yielding a training set T = {(z_n, y_n)}_{n=1}^N. Further, the optimal classifier x^* in (23) is computed using LIBLINEAR [18].

We run RAPSA on this training subset for the case that B = [p/4] = 196, where [ ] denotes rounding a number to its nearest integer, and we uniformly allocate feature vectors into blocks of size p_b = 4 for b = 1, ..., B. To determine the advantages of incomplete randomized parallel processing, we vary I^t = I, the number of blocks updated at each iteration (set equal to the number of processors for simplicity), as I = B, I = [B/2] = 98, I = [B/4] = 49, and I = [B/8] = 25. When I = B, RAPSA is parallel stochastic gradient descent. In the subsequent experiment we set L = 1 (no mini-batching).

Comparing algorithm performance over the iteration index t across varying numbers of block updates I^t is unfair: if RAPSA is run on a problem for which I^t = B/2, then at iteration t it has only processed half the data that parallel SGD has processed by the same iteration. Instead we consider algorithm performance against the number of features processed up to the current time, denoted \tilde{p}_t. Observe that for a feature vector of length p, tp features have been processed by iteration t when all decision variable coordinates are updated at each iteration, as in the case of SGD, where r = 1. When r < 1, this count must be scaled by the proportion of coordinates updated per iteration, which, since blocks are uniformly sized, gives \tilde{p}_t = p r t. Moreover, when the mini-batch size L > 1, \tilde{p}_t = p r t L.

In Figure 2 we show the result of running RAPSA with a constant stepsize.

Figure 2: RAPSA on MNIST data with a constant stepsize and no mini-batching (L = 1). Panels: (a) empirical risk F(x^t) - F(x^*) vs. number of features processed \tilde{p}_t; (b) average classification accuracy vs. \tilde{p}_t. Algorithm performance is comparable across different numbers of decision variable coordinates updated per iteration. RAPSA is I times faster than SGD and achieves comparable performance; in some cases, updating fewer variables per iteration improves accuracy.

In Figure 2(a) we plot F(x^t) - F(x^*) versus the number of features processed \tilde{p}_t. As predicted by Theorem 2, we observe in Figure 2(a) that the empirical risk F(x^t) - F(x^*) converges to a small positive constant and that the difference in convergence rates to a neighborhood of the optimum is negligible. In Figure 2(b) we plot the empirical average classification accuracy on a test set. Note that RAPSA with fewer blocks updated (processors) per iteration achieves improved classification accuracy: the choices I = B, I = [B/2], I = [B/4], and I = [B/8] respectively achieve accuracies of 78%, 87%, and above 90% for the two smallest choices of I.

We now run Accelerated RAPSA (ARAPSA), as stated in Section 3, for this problem setting on the entire MNIST binary training subset associated with digits 0 and 8, with mini-batch size L = 10 and the level of curvature information set as \tau = 10. We select the decreasing stepsize \gamma^t = \gamma^0 T^0 / (t + T^0) with annealing rate T^0 = [N/p] = 45, regularizer \lambda = 1/\sqrt{N}, and \gamma^0 = 1/(\lambda T^0). As before, we study the advantages of incomplete randomized parallel processing by varying I = I^t, the number of blocks updated per iteration, as I \in {B, [B/2], [B/4], [B/8]}.

Figure 3: ARAPSA on MNIST data with regularizer \lambda = 1/\sqrt{N}, mini-batch size L = 10, curvature information level \tau = 10, and diminishing stepsize \gamma^t = \gamma^0 T^0 / (t + T^0) with annealing rate T^0 = [N/p] = 45. Panels: (a) empirical risk F(x^t) - F(x^*) vs. number of features processed \tilde{p}_t; (b) average classification accuracy vs. \tilde{p}_t. The ARAPSA convergence properties hold in practice.

Figure 3(a) shows the error F(x^t) - F(x^*) versus the number of features processed \tilde{p}_t. Observe that the algorithm achieves convergence across the differing numbers of parallel computing nodes I^t, with improved convergence for smaller I^t: for I^t = B, I^t = [B/2], I^t = [B/4], and I^t = [B/8], the algorithm reaches a fixed risk benchmark by \tilde{p}_t = 70, \tilde{p}_t = 40, \tilde{p}_t = 85, and \tilde{p}_t = 77, respectively. Moreover, in Figure 3(b) we observe that ARAPSA achieves comparable accuracy across the different numbers of block variables updated per iteration, surpassing 90%.

We study the effect of incorporating second-order information into the block updates. To do so, we fix the mini-batch size as L = 10 and run ARAPSA and RAPSA on this problem instance with a constant stepsize. Moreover, only a quarter of the blocks are updated per iteration, I^t = [B/4]. The results of this experiment are given in Figure 4.

Figure 4: RAPSA and ARAPSA algorithms on MNIST data with mini-batch size L = 10, the level of curvature information set to \tau = 10, and a constant stepsize. Panels: (a) empirical risk F(x^t) - F(x^*) vs. number of features processed \tilde{p}_t; (b) average classification accuracy vs. \tilde{p}_t. ARAPSA successfully accelerates the convergence of RAPSA and achieves higher accuracy per features processed \tilde{p}_t.

In Figure 4(a) we plot F(x^t) - F(x^*) versus the number of features processed \tilde{p}_t. Observe that ARAPSA converges more quickly and to a point closer to the optimum than RAPSA: for a fixed risk benchmark F(x^t) - F(x^*), ARAPSA requires \tilde{p}_t = 85 while RAPSA requires \tilde{p}_t = 575 features. This accelerated behavior is also apparent in Figure 4(b), where ARAPSA achieves an accuracy of 80% by \tilde{p}_t = 30, whereas RAPSA requires \tilde{p}_t = 650 features for the same benchmark.

The comparable performance of RAPSA to SGD in Figure 2(a), and of ARAPSA to oLBFGS in Figure 3, demonstrates the practical utility of the proposed method. Because RAPSA may be implemented in parallel, it may achieve the same empirical result as SGD but at a rate I times faster. Moreover, in some cases RAPSA achieves superior classification accuracy with fewer blocks updated per iteration. A natural question, given the speedup made possible by parallel computing, is why one would select I < B: for situations where p is very large, choosing I = B would require a large number of computing nodes, which may or may not be available.

References

[1] G. Sampson, R. Haigh, and E. Atwell, "Natural language analysis by stochastic optimization: A progress report on Project APRIL," J. Exp. Theor. Artif. Intell., vol. 1, no. 4, pp. 271-287, Oct. 1989.
[2] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online learning for matrix factorization and sparse coding," The Journal of Machine Learning Research, vol. 11, pp. 19-60, 2010.
[3] M. Taşan, G. Musso, T. Hao, M. Vidal, C. A. MacRae, and F. P. Roth, "Selecting causal genes from genome-wide association studies via functionally coherent subnetworks," Nature Methods, 2014.
[4] P. Tseng and O. L. Mangasarian, "Convergence of a block coordinate descent method for nondifferentiable minimization," J. Optim. Theory Appl., 2001.
[5] Y. Xu and W. Yin, "A globally convergent algorithm for nonconvex optimization based on block coordinate update," arXiv preprint arXiv:1410.1386, 2014.
[6] P. Richtárik and M. Takáč, "Parallel coordinate descent methods for big data optimization," arXiv preprint arXiv:1212.0873, 2012.
[7] Z. Lu and L. Xiao, "On the complexity analysis of randomized block-coordinate descent methods," Mathematical Programming, 2013.
[8] Y. Nesterov, "Efficiency of coordinate descent methods on huge-scale optimization problems," SIAM Journal on Optimization, vol. 22, no. 2, pp. 341-362, 2012.
[9] J. Liu, S. J. Wright, C. Ré, V. Bittorf, and S. Sridhar, "An asynchronous parallel stochastic coordinate descent algorithm," arXiv preprint arXiv:1311.1873, 2013.
[10] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol. 22, no. 3, pp. 400-407, 1951.
[11] N. Schraudolph, J. Yu, and S. Günter, "A stochastic quasi-Newton method for online convex optimization," 2007.
[12] A. Bordes, L. Bottou, and P. Gallinari, "SGD-QN: Careful quasi-Newton stochastic gradient descent," The Journal of Machine Learning Research, vol. 10, pp. 1737-1754, 2009.
[13] Y. Xu and W. Yin, "Block stochastic gradient iteration for convex and nonconvex optimization," arXiv preprint, Aug. 2014.
[14] F. Niu, B. Recht, C. Ré, and S. J. Wright, "Hogwild!: A lock-free approach to parallelizing stochastic gradient descent," in NIPS, 2011.
[15] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, "Robust stochastic approximation approach to stochastic programming," SIAM Journal on Optimization, vol. 19, no. 4, pp. 1574-1609, 2009.
[16] D. C. Liu and J. Nocedal, "On the limited memory BFGS method for large scale optimization," Mathematical Programming, vol. 45, no. 1-3, pp. 503-528, 1989.
[17] Y. LeCun and C. Cortes, "The MNIST database of handwritten digits."
[18] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, "LIBLINEAR: A library for large linear classification," The Journal of Machine Learning Research, vol. 9, pp. 1871-1874, 2008.

A Proof of Lemma 1

Recall that the components of the vector x^{t+1} are equal to the components of x^t for the coordinates that are not updated at step t, i.e., for i \notin I^t. For the updated coordinates i \in I^t we know that x_i^{t+1} = x_i^t - \gamma^t \nabla_{x_i} f(x^t, \Theta_i^t). Therefore, B - I blocks of the vector x^{t+1} - x^t are 0 and the remaining I randomly chosen blocks are given by -\gamma^t \nabla_{x_i} f(x^t, \Theta_i^t). Notice that there are \binom{B}{I} different ways of picking I blocks out of the whole B blocks. Therefore, the probability of each combination of blocks is 1/\binom{B}{I}. Further, each block appears in \binom{B-1}{I-1} of the combinations. Therefore, the expected value can be written as

    E_{I^t}[ x^{t+1} - x^t \,|\, F^t ] = \frac{ \binom{B-1}{I-1} }{ \binom{B}{I} } \left( -\gamma^t \nabla f(x^t, \Theta^t) \right).    (24)

Observe that simplifying the ratio on the right hand side of (24) leads to

    \frac{ \binom{B-1}{I-1} }{ \binom{B}{I} } = \frac{(B-1)!}{(I-1)!(B-I)!} \times \frac{I!(B-I)!}{B!} = \frac{I}{B} = r.    (25)

Substituting the simplification in (25) into (24) yields the claim in (8). To prove the claim in (9) we can use the same argument as in the proof of (8) to show that

    E_{I^t}[ \| x^{t+1} - x^t \|^2 \,|\, F^t ] = \frac{ \binom{B-1}{I-1} }{ \binom{B}{I} } (\gamma^t)^2 \| \nabla f(x^t, \Theta^t) \|^2.    (26)

By substituting the simplification in (25) into (26), the claim in (9) follows.
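The combinatorial argument above is easy to validate numerically. A small Monte Carlo sanity check of (8), with purely illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
p, B, I, gamma = 12, 6, 2, 0.1
blocks = np.array_split(np.arange(p), B)
g = rng.normal(size=p)                # stands in for grad f(x^t, Theta^t)

trials, diff = 100000, np.zeros(p)
for _ in range(trials):               # average x^{t+1} - x^t over random I^t
    d = np.zeros(p)
    for b in rng.choice(B, size=I, replace=False):
        d[blocks[b]] = -gamma * g[blocks[b]]
    diff += d / trials

print(np.allclose(diff, -(I / B) * gamma * g, atol=2e-3))   # matches (8)
```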

B Proof of Proposition 1

By considering the Taylor expansion of F(x^{t+1}) near the point x^t and observing the Lipschitz continuity of the gradients \nabla F with constant M, we obtain that the average objective function F(x^{t+1}) is bounded above by

    F(x^{t+1}) \leq F(x^t) + \nabla F(x^t)^T (x^{t+1} - x^t) + \frac{M}{2} \| x^{t+1} - x^t \|^2.    (27)

Compute the expectation of both sides of (27) with respect to the random set I^t given the observed set of information F^t. Substitute E_{I^t}[x^{t+1} - x^t | F^t] and E_{I^t}[\|x^{t+1} - x^t\|^2 | F^t] by their simplifications in (8) and (9), respectively, to write

    E_{I^t}[ F(x^{t+1}) \,|\, F^t ] \leq F(x^t) - r \gamma^t \nabla F(x^t)^T \nabla f(x^t, \Theta^t) + \frac{r M (\gamma^t)^2}{2} \| \nabla f(x^t, \Theta^t) \|^2.    (28)

Notice that the stochastic gradient \nabla f(x^t, \Theta^t) is an unbiased estimate of the average function gradient \nabla F(x^t), so that E_{\Theta^t}[ \nabla f(x^t, \Theta^t) \,|\, F^t ] = \nabla F(x^t). Observing this relation and considering the assumption in (7), the expected value of (28) with respect to the set of realizations \Theta^t can be written as

    E_{I^t, \Theta^t}[ F(x^{t+1}) \,|\, F^t ] \leq F(x^t) - r \gamma^t \| \nabla F(x^t) \|^2 + \frac{r M K (\gamma^t)^2}{2}.    (29)

Subtracting the optimal objective function value F(x^*) from both sides of (29) implies that

    E_{I^t, \Theta^t}[ F(x^{t+1}) - F(x^*) \,|\, F^t ] \leq F(x^t) - F(x^*) - r \gamma^t \| \nabla F(x^t) \|^2 + \frac{r M K (\gamma^t)^2}{2}.    (30)

We proceed to find a lower bound for the gradient norm \|\nabla F(x^t)\| in terms of the objective value error F(x^t) - F(x^*). Assumption 1 states that the average objective function F is strongly convex with constant m > 0. Therefore, for any y, z \in R^p we can write

    F(y) \geq F(z) + \nabla F(z)^T (y - z) + \frac{m}{2} \| y - z \|^2.    (31)

For fixed z, the right hand side of (31) is a quadratic function of y whose minimum argument we can find by setting its gradient to zero. Doing so yields the minimizing argument \hat{y} = z - (1/m) \nabla F(z), implying that for all y we must have

    F(y) \geq F(z) + \nabla F(z)^T (\hat{y} - z) + \frac{m}{2} \| \hat{y} - z \|^2 = F(z) - \frac{1}{2m} \| \nabla F(z) \|^2.    (32)

Observe that the bound in (32) holds true for all y and z. Setting y = x^* and z = x^t in (32) and rearranging terms yields a lower bound for the squared gradient norm \|\nabla F(x^t)\|^2:

    \| \nabla F(x^t) \|^2 \geq 2m ( F(x^t) - F(x^*) ).    (33)

Substituting the lower bound in (33) for the squared gradient norm \|\nabla F(x^t)\|^2 in (30) yields the claim in (10).

C Proof of Theorem 1

We use the relationship in (10) to build a supermartingale sequence. To do so, define the stochastic process \alpha^t as

    \alpha^t := F(x^t) - F(x^*) + \frac{r M K}{2} \sum_{u=t}^{\infty} (\gamma^u)^2.    (34)

Note that \alpha^t is well-defined because \sum_{u=t}^\infty (\gamma^u)^2 \leq \sum_{u=0}^\infty (\gamma^u)^2 < \infty is summable. Further define the sequence \beta^t with values

    \beta^t := 2 m \gamma^t r ( F(x^t) - F(x^*) ).    (35)

The definitions of the sequences \alpha^t and \beta^t in (34) and (35), respectively, and the inequality in (10) imply that the expected value of \alpha^{t+1} given F^t can be written as

    E[ \alpha^{t+1} \,|\, F^t ] \leq \alpha^t - \beta^t.    (36)

Since the sequences \alpha^t and \beta^t are nonnegative, it follows from (36) that they satisfy the conditions of the supermartingale convergence theorem. Therefore, we obtain that: (i) the sequence \alpha^t converges almost surely to a limit; (ii) the sum \sum_{t=0}^\infty \beta^t < \infty is almost surely finite. The latter result yields

    \sum_{t=0}^{\infty} 2 m \gamma^t r ( F(x^t) - F(x^*) ) < \infty    a.s.    (37)

Since the sequence of stepsizes is non-summable, there exists a subsequence of F(x^t) - F(x^*) which converges to null. This observation is equivalent to almost sure convergence of the limit inferior of F(x^t) - F(x^*) to null:

    \liminf_{t \to \infty} F(x^t) - F(x^*) = 0    a.s.    (38)

Based on the martingale convergence theorem applied to the sequences \alpha^t and \beta^t in relation (36), the sequence \alpha^t almost surely converges to a limit. Consider the definition of \alpha^t in (34). Observe that the sum \sum_{u=t}^\infty (\gamma^u)^2 is deterministic and its limit is null. Therefore, the sequence of objective function value errors F(x^t) - F(x^*) almost surely converges to a limit. This observation, in association with the result in (38), implies that the whole sequence F(x^t) - F(x^*) converges almost surely to null:

    \lim_{t \to \infty} F(x^t) - F(x^*) = 0    a.s.    (39)

The last step is to prove almost sure convergence of the sequence \|x^t - x^*\|^2 to null as a result of the limit in (39). To do so, we prove a lower bound for the objective function value error F(x^t) - F(x^*) in terms of the squared norm error \|x^t - x^*\|^2. According to the strong convexity assumption, we can write the following inequality:

    F(x^t) \geq F(x^*) + \nabla F(x^*)^T (x^t - x^*) + \frac{m}{2} \| x^t - x^* \|^2.    (40)

Observe that the gradient at the optimal point is the null vector, i.e., \nabla F(x^*) = 0. This observation and rearranging the terms in (40) imply that

    F(x^t) - F(x^*) \geq \frac{m}{2} \| x^t - x^* \|^2.    (41)

The upper bound in (41) for the squared norm \|x^t - x^*\|^2, in association with the fact that the sequence F(x^t) - F(x^*) almost surely converges to null, leads to the conclusion that the sequence \|x^t - x^*\|^2 almost surely converges to zero. Hence, the claim in (11) is valid.

The next step is to study the convergence rate of RAPSA in expectation. In this step we assume that the diminishing stepsize is defined as \gamma^t = \gamma^0 T^0 / (t + T^0). Recall the inequality in (10), substitute \gamma^t by \gamma^0 T^0 / (t + T^0), and compute the expected value given F^0 to obtain

    E[ F(x^{t+1}) - F(x^*) ] \leq \left( 1 - \frac{2 m r \gamma^0 T^0}{t + T^0} \right) E[ F(x^t) - F(x^*) ] + \frac{r M K (\gamma^0 T^0)^2}{2 (t + T^0)^2}.    (42)

We use the following lemma to show that the result in (42) implies sublinear convergence of the sequence of expected objective value errors E[F(x^t) - F(x^*)].

Lemma 2. Let c > 1, b > 0, and t^0 > 0 be given constants and u^t \geq 0 be a nonnegative sequence that satisfies the inequality

    u^{t+1} \leq \left( 1 - \frac{c}{t + t^0} \right) u^t + \frac{b}{(t + t^0)^2},    (43)

for all times t \geq 0. The sequence u^t is then bounded as

    u^t \leq \frac{Q}{t + t^0},    (44)

for all times t \geq 0, where the constant Q is defined as Q := \max\{ b/(c - 1), t^0 u^0 \}.

Proof: See ...

Lemma 2 shows that if a sequence u^t satisfies the condition in (43), then it converges to null at least with rate O(1/t). By assigning the values t^0 = T^0, u^t = E[F(x^t) - F(x^*)], c = 2 m r \gamma^0 T^0, and b = r M K (\gamma^0 T^0)^2 / 2, the relation in (42) implies that the inequality in (43) is satisfied for the case that 2 m r \gamma^0 T^0 > 1. Therefore, the result in (44) holds and we can conclude that

    E[ F(x^t) - F(x^*) ] \leq \frac{C}{t + T^0},    (45)

where the constant C is defined as

    C = \max\left\{ \frac{r M K (\gamma^0 T^0)^2}{4 m r \gamma^0 T^0 - 2},\; T^0 ( F(x^0) - F(x^*) ) \right\}.    (46)

D Proof of Theorem 2

To prove the claim in (14) we use the relationship in (10) to construct a supermartingale. Define the stochastic process \alpha^t with values

    \alpha^t = ( F(x^t) - F(x^*) ) \times 1\left\{ \min_{u \leq t} F(x^u) - F(x^*) > \frac{\gamma M K}{4m} \right\}.    (47)

The process \alpha^t tracks the optimality gap F(x^t) - F(x^*) until the gap becomes smaller than \gamma M K / (4m) for the first time, at which point it becomes \alpha^t = 0. Notice that the stochastic process \alpha^t is always non-negative, i.e., \alpha^t \geq 0. Likewise, we define the stochastic process \beta^t as

    \beta^t = 2 \gamma m r \left( F(x^t) - F(x^*) - \frac{\gamma M K}{4m} \right) \times 1\left\{ \min_{u \leq t} F(x^u) - F(x^*) > \frac{\gamma M K}{4m} \right\},    (48)

which follows 2 \gamma m r ( F(x^t) - F(x^*) - \gamma M K/(4m) ) until the first time that the optimality gap F(x^t) - F(x^*) becomes smaller than \gamma M K / (4m); after this moment the stochastic process \beta^t becomes null.

According to the definition of \beta^t in (48), the stochastic process satisfies \beta^t \geq 0 for all t \geq 0. Based on the relationship (10) and the definitions of the stochastic processes \alpha^t and \beta^t in (47) and (48), we obtain that for all times t \geq 0

    E[ \alpha^{t+1} \,|\, F^t ] \leq \alpha^t - \beta^t.    (49)

To check the validity of (49), we first consider the case that \min_{u \leq t} F(x^u) - F(x^*) > \gamma M K / (4m) holds. In this scenario we can simplify the stochastic processes in (47) and (48) as \alpha^t = F(x^t) - F(x^*) and \beta^t = 2 \gamma m r ( F(x^t) - F(x^*) - \gamma M K/(4m) ). Therefore, according to the inequality in (10), the result in (49) is valid. The second scenario that we check is \min_{u \leq t} F(x^u) - F(x^*) \leq \gamma M K / (4m). Based on the definitions of the stochastic processes \alpha^t and \beta^t, both of these sequences are equal to 0. Further, notice that when \alpha^t = 0, it follows that \alpha^{t+1} = 0. Hence, the relationship in (49) is true.

Given the relation in (49) and the non-negativity of the stochastic processes \alpha^t and \beta^t, we obtain that \alpha^t is a supermartingale. The supermartingale convergence theorem yields: (i) the sequence \alpha^t converges to a limit almost surely; (ii) the sum \sum_{t=1}^\infty \beta^t is finite almost surely. The latter result implies that the sequence \beta^t converges to null almost surely, i.e.,

    \lim_{t \to \infty} \beta^t = 0    a.s.    (50)

Based on the definition of \beta^t in (48), the limit in (50) is true if one of the following events holds: (i) the indicator function is null for sufficiently large t; (ii) the limit \lim_{t \to \infty} ( F(x^t) - F(x^*) - \gamma M K/(4m) ) = 0 holds true. Either of these two events implies that

    \liminf_{t \to \infty} F(x^t) - F(x^*) \leq \frac{\gamma M K}{4m}    a.s.    (51)

Therefore, the claim in (14) is valid. The result in (51) shows that the objective function value sequence F(x^t) almost surely converges to a neighborhood of the optimal objective function value F(x^*).

We proceed to prove the result in (15). Compute the expected value of (10) given F^0 and set \gamma^t = \gamma to obtain

    E[ F(x^{t+1}) - F(x^*) ] \leq ( 1 - 2 m \gamma r ) E[ F(x^t) - F(x^*) ] + \frac{r M K \gamma^2}{2}.    (52)

Notice that the expression in (52) provides an upper bound for the expected objective function error E[F(x^{t+1}) - F(x^*)] in terms of its previous value E[F(x^t) - F(x^*)] and an error term. Rewriting the relation in (52) for step t - 1 leads to

    E[ F(x^t) - F(x^*) ] \leq ( 1 - 2 m \gamma r ) E[ F(x^{t-1}) - F(x^*) ] + \frac{r M K \gamma^2}{2}.    (53)

Substituting the upper bound in (53) for the expectation E[F(x^t) - F(x^*)] in (52) yields an upper bound for the expected error E[F(x^{t+1}) - F(x^*)] as

    E[ F(x^{t+1}) - F(x^*) ] \leq ( 1 - 2 m \gamma r )^2 E[ F(x^{t-1}) - F(x^*) ] + \frac{r M K \gamma^2}{2} \left( 1 + (1 - 2 m r \gamma) \right).    (54)

By recursively applying the steps in (53) and (54) we can bound the expected objective function error E[F(x^{t+1}) - F(x^*)] in terms of the initial objective function error F(x^0) - F(x^*) and the accumulation of the errors as

    E[ F(x^{t+1}) - F(x^*) ] \leq ( 1 - 2 m \gamma r )^{t+1} ( F(x^0) - F(x^*) ) + \frac{r M K \gamma^2}{2} \sum_{u=0}^{t} (1 - 2 m r \gamma)^u.    (55)

Substituting t by t - 1 and simplifying the sum on the right hand side of (55) yields

    E[ F(x^t) - F(x^*) ] \leq ( 1 - 2 m \gamma r )^t ( F(x^0) - F(x^*) ) + \frac{M K \gamma}{4m} \left[ 1 - (1 - 2 m r \gamma)^t \right].    (56)

Observing that the term 1 - (1 - 2 m r \gamma)^t on the right hand side of (56) is strictly smaller than 1 for any stepsize \gamma < 1/(2 m r), the claim in (15) follows.

E Implementation of ARAPSA

For reference, ARAPSA is also summarized in algorithmic form in Algorithm 2. Steps 2 and 3 are devoted to assigning random blocks to the processors. In Step 2 a subset I^t of the available blocks is chosen. These blocks are assigned to different processors in Step 3. In Step 5, processors compute the partial stochastic gradients \nabla_{x_i} f(x^t, \Theta_i^t) corresponding to their assigned blocks, using the samples acquired in Step 4. Steps 6 and 7 are devoted to the computation of the ARAPSA descent direction \hat{d}_i^t. In Step 6 the approximate Hessian inverse \hat{B}_i^{t,0} for block x_i is initialized as \hat{B}_i^{t,0} = \eta_i^t I, which is a scaled identity matrix using the expression for \eta_i^t in (20) for t > 0; the initial value is \eta_i^0 = 1. In Step 7 we use Algorithm 1 for the efficient computation of the descent direction \hat{d}_i^t = \hat{B}_i^t \nabla_{x_i} f(x^t, \Theta_i^t). The descent direction \hat{d}_i^t is used to update the block x_i^t with stepsize \gamma^t in Step 8. Step 9 determines the value of the partial stochastic gradient \nabla_{x_i} f(x^{t+1}, \Theta_i^t), which is required for the computation of the stochastic gradient variation \hat{r}_i^t. In Step 10 the variable variation v_i^t and the stochastic gradient variation \hat{r}_i^t associated with block x_i are computed, to be used in the next iteration.

Algorithm 1 Computation of the ARAPSA step \hat{d}_i^t = \hat{B}_i^t \nabla_{x_i} f(x^t, \Theta_i^t) for block x_i
1: function \hat{d}_i^t = q^\tau = ARAPSA_Step( \hat{B}_i^{t,0}, p^0 = \nabla_{x_i} f(x^t, \Theta_i^t), {v_i^u, \hat{r}_i^u}_{u=t-\tau}^{t-1} )
2: for u = 0, 1, ..., \tau - 1 do {Loop to compute constants \alpha^u and sequence p^u}
3:     Compute and store the scalar \alpha^u = \hat{\rho}_i^{t-u-1} (v_i^{t-u-1})^T p^u
4:     Update the sequence vector p^{u+1} = p^u - \alpha^u \hat{r}_i^{t-u-1}
5: end for
6: Multiply p^\tau by the initial matrix: q^0 = \hat{B}_i^{t,0} p^\tau
7: for u = 0, 1, ..., \tau - 1 do {Loop to compute constants \beta^u and sequence q^u}
8:     Compute the scalar \beta^u = \hat{\rho}_i^{t-\tau+u} (\hat{r}_i^{t-\tau+u})^T q^u
9:     Update the sequence vector q^{u+1} = q^u + (\alpha^{\tau-u-1} - \beta^u) v_i^{t-\tau+u}
10: end for {return \hat{d}_i^t = q^\tau}

Algorithm 2 Accelerated Random Parallel Stochastic Algorithm (ARAPSA) for individual processors
1: for t = 0, 1, 2, ... do
2:     Choose uniformly at random a set I^t \subset {1, ..., B} of block variables to update
3:     Assign the block variables S^t to processors in any manner
4:     Choose a set of realizations \Theta_i^t for each assigned block x_i
5:     Compute the stochastic gradient: \nabla_{x_i} f(x^t, \Theta_i^t) = (1/L) \sum_{\theta \in \Theta_i^t} \nabla_{x_i} f(x^t, \theta) [cf. (3)]
6:     Compute the initial Hessian inverse approximation: \hat{B}_i^{t,0} = \eta_i^t I
7:     Compute the descent direction: \hat{d}_i^t = ARAPSA_Step( \hat{B}_i^{t,0}, \nabla_{x_i} f(x^t, \Theta_i^t), {v_i^u, \hat{r}_i^u}_{u=t-\tau}^{t-1} )
8:     Update the coordinates of the decision variable: x_i^{t+1} = x_i^t - \gamma^t \hat{d}_i^t
9:     Compute the updated stochastic gradient: \nabla_{x_i} f(x^{t+1}, \Theta_i^t) = (1/L) \sum_{\theta \in \Theta_i^t} \nabla_{x_i} f(x^{t+1}, \theta) [cf. (3)]
10:    Update the variations v_i^t = x_i^{t+1} - x_i^t and \hat{r}_i^t = \nabla_{x_i} f(x^{t+1}, \Theta_i^t) - \nabla_{x_i} f(x^t, \Theta_i^t) [cf. (19)]
11: end for
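A compact Python rendering of Algorithm 1's two-loop recursion may help; it is a sketch under the stated conventions (pairs stored oldest to newest, initial scaling \eta^t I as in (20)), not the authors' implementation.

```python
import numpy as np

def arapsa_step(grad, pairs, eps=1e-10):
    """Two-loop recursion of Algorithm 1: returns d = B_hat @ grad without
    forming B_hat. pairs is a list of (v, r_hat) variations for one block,
    ordered oldest to newest; an empty list gives B_hat = I (eta^0 = 1)."""
    p, alphas, rhos = grad.copy(), [], []
    for v, r in reversed(pairs):                 # first loop: newest to oldest
        rho = 1.0 / (np.dot(v, r) + eps)
        a = rho * np.dot(v, p)
        p -= a * r
        alphas.append(a)
        rhos.append(rho)
    if pairs:                                    # initial scaling eta^t I, cf. (20)
        v, r = pairs[-1]
        p *= np.dot(v, r) / (np.dot(r, r) + eps)
    for (v, r), rho, a in zip(pairs, reversed(rhos), reversed(alphas)):
        b = rho * np.dot(r, p)                   # second loop: oldest to newest
        p += (a - b) * v
    return p                                     # descent direction d_hat
```

This is the standard limited-memory trick from [16]: the cost is O(\tau p_b) per block instead of the O(p_b^2) required to form the matrix in (21) explicitly.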


More information

PASS Sample Size Software. :log

PASS Sample Size Software. :log PASS Sample Sze Software Chapter 70 Probt Analyss Introducton Probt and lot analyss may be used for comparatve LD 50 studes for testn the effcacy of drus desned to prevent lethalty. Ths proram module presents

More information

CHAPTER 9 FUNCTIONAL FORMS OF REGRESSION MODELS

CHAPTER 9 FUNCTIONAL FORMS OF REGRESSION MODELS CHAPTER 9 FUNCTIONAL FORMS OF REGRESSION MODELS QUESTIONS 9.1. (a) In a log-log model the dependent and all explanatory varables are n the logarthmc form. (b) In the log-ln model the dependent varable

More information

Evaluating Performance

Evaluating Performance 5 Chapter Evaluatng Performance In Ths Chapter Dollar-Weghted Rate of Return Tme-Weghted Rate of Return Income Rate of Return Prncpal Rate of Return Daly Returns MPT Statstcs 5- Measurng Rates of Return

More information

Optimising a general repair kit problem with a service constraint

Optimising a general repair kit problem with a service constraint Optmsng a general repar kt problem wth a servce constrant Marco Bjvank 1, Ger Koole Department of Mathematcs, VU Unversty Amsterdam, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands Irs F.A. Vs Department

More information

arxiv: v1 [math.nt] 29 Oct 2015

arxiv: v1 [math.nt] 29 Oct 2015 A DIGITAL BINOMIAL THEOREM FOR SHEFFER SEQUENCES TOUFIK MANSOUR AND HIEU D. NGUYEN arxv:1510.08529v1 [math.nt] 29 Oct 2015 Abstract. We extend the dgtal bnomal theorem to Sheffer polynomal sequences by

More information

Production and Supply Chain Management Logistics. Paolo Detti Department of Information Engeneering and Mathematical Sciences University of Siena

Production and Supply Chain Management Logistics. Paolo Detti Department of Information Engeneering and Mathematical Sciences University of Siena Producton and Supply Chan Management Logstcs Paolo Dett Department of Informaton Engeneerng and Mathematcal Scences Unversty of Sena Convergence and complexty of the algorthm Convergence of the algorthm

More information

Alternatives to Shewhart Charts

Alternatives to Shewhart Charts Alternatves to Shewhart Charts CUSUM & EWMA S Wongsa Overvew Revstng Shewhart Control Charts Cumulatve Sum (CUSUM) Control Chart Eponentally Weghted Movng Average (EWMA) Control Chart 2 Revstng Shewhart

More information

A Unified Distributed Algorithm for Non-Games Non-cooperative, Non-convex, and Non-differentiable. Jong-Shi Pang and Meisam Razaviyayn.

A Unified Distributed Algorithm for Non-Games Non-cooperative, Non-convex, and Non-differentiable. Jong-Shi Pang and Meisam Razaviyayn. A Unfed Dstrbuted Algorthm for Non-Games Non-cooperatve, Non-convex, and Non-dfferentable Jong-Sh Pang and Mesam Razavyayn presented at Workshop on Optmzaton for Modern Computaton Pekng Unversty, Bejng,

More information

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge Dynamc Analyss of Sharng of Agents wth Heterogeneous Kazuyo Sato Akra Namatame Dept. of Computer Scence Natonal Defense Academy Yokosuka 39-8686 JAPAN E-mal {g40045 nama} @nda.ac.jp Abstract In ths paper

More information

Introduction to game theory

Introduction to game theory Introducton to game theory Lectures n game theory ECON5210, Sprng 2009, Part 1 17.12.2008 G.B. Ashem, ECON5210-1 1 Overvew over lectures 1. Introducton to game theory 2. Modelng nteractve knowledge; equlbrum

More information

Capability Analysis. Chapter 255. Introduction. Capability Analysis

Capability Analysis. Chapter 255. Introduction. Capability Analysis Chapter 55 Introducton Ths procedure summarzes the performance of a process based on user-specfed specfcaton lmts. The observed performance as well as the performance relatve to the Normal dstrbuton are

More information

Numerical Analysis ECIV 3306 Chapter 6

Numerical Analysis ECIV 3306 Chapter 6 The Islamc Unversty o Gaza Faculty o Engneerng Cvl Engneerng Department Numercal Analyss ECIV 3306 Chapter 6 Open Methods & System o Non-lnear Eqs Assocate Pro. Mazen Abualtaye Cvl Engneerng Department,

More information

A Case Study for Optimal Dynamic Simulation Allocation in Ordinal Optimization 1

A Case Study for Optimal Dynamic Simulation Allocation in Ordinal Optimization 1 A Case Study for Optmal Dynamc Smulaton Allocaton n Ordnal Optmzaton Chun-Hung Chen, Dongha He, and Mchael Fu 4 Abstract Ordnal Optmzaton has emerged as an effcent technque for smulaton and optmzaton.

More information

0.1 Gradient descent for convex functions: univariate case

0.1 Gradient descent for convex functions: univariate case prnceton unv. F 16 cos 51: Advanced Algorthm Desgn Lecture 14: Gong wth the slope: offlne, onlne, and randomly Lecturer: Sanjeev Arora Scrbe:Sanjeev Arora hs lecture s about gradent descent, a popular

More information

Bid-auction framework for microsimulation of location choice with endogenous real estate prices

Bid-auction framework for microsimulation of location choice with endogenous real estate prices Bd-aucton framework for mcrosmulaton of locaton choce wth endogenous real estate prces Rcardo Hurtuba Mchel Berlare Francsco Martínez Urbancs Termas de Chllán, Chle March 28 th 2012 Outlne 1) Motvaton

More information

SIMPLE FIXED-POINT ITERATION

SIMPLE FIXED-POINT ITERATION SIMPLE FIXED-POINT ITERATION The fed-pont teraton method s an open root fndng method. The method starts wth the equaton f ( The equaton s then rearranged so that one s one the left hand sde of the equaton

More information

Robust Boosting and its Relation to Bagging

Robust Boosting and its Relation to Bagging Robust Boostng and ts Relaton to Baggng Saharon Rosset IBM T.J. Watson Research Center P. O. Box 218 Yorktown Heghts, NY 10598 srosset@us.bm.com ABSTRACT Several authors have suggested vewng boostng as

More information

Available online at ScienceDirect. Procedia Computer Science 24 (2013 ) 9 14

Available online at   ScienceDirect. Procedia Computer Science 24 (2013 ) 9 14 Avalable onlne at www.scencedrect.com ScenceDrect Proceda Computer Scence 24 (2013 ) 9 14 17th Asa Pacfc Symposum on Intellgent and Evolutonary Systems, IES2013 A Proposal of Real-Tme Schedulng Algorthm

More information

Nonlinear Monte Carlo Methods. From American Options to Fully Nonlinear PDEs

Nonlinear Monte Carlo Methods. From American Options to Fully Nonlinear PDEs : From Amercan Optons to Fully Nonlnear PDEs Ecole Polytechnque Pars PDEs and Fnance Workshop KTH, Stockholm, August 20-23, 2007 Outlne 1 Monte Carlo Methods for Amercan Optons 2 3 4 Outlne 1 Monte Carlo

More information

Note on Cubic Spline Valuation Methodology

Note on Cubic Spline Valuation Methodology Note on Cubc Splne Valuaton Methodology Regd. Offce: The Internatonal, 2 nd Floor THE CUBIC SPLINE METHODOLOGY A model for yeld curve takes traded yelds for avalable tenors as nput and generates the curve

More information

Finance 402: Problem Set 1 Solutions

Finance 402: Problem Set 1 Solutions Fnance 402: Problem Set 1 Solutons Note: Where approprate, the fnal answer for each problem s gven n bold talcs for those not nterested n the dscusson of the soluton. 1. The annual coupon rate s 6%. A

More information

c slope = -(1+i)/(1+π 2 ) MRS (between consumption in consecutive time periods) price ratio (across consecutive time periods)

c slope = -(1+i)/(1+π 2 ) MRS (between consumption in consecutive time periods) price ratio (across consecutive time periods) CONSUMPTION-SAVINGS FRAMEWORK (CONTINUED) SEPTEMBER 24, 2013 The Graphcs of the Consumpton-Savngs Model CONSUMER OPTIMIZATION Consumer s decson problem: maxmze lfetme utlty subject to lfetme budget constrant

More information

Likelihood Fits. Craig Blocker Brandeis August 23, 2004

Likelihood Fits. Craig Blocker Brandeis August 23, 2004 Lkelhood Fts Crag Blocker Brandes August 23, 2004 Outlne I. What s the queston? II. Lkelhood Bascs III. Mathematcal Propertes IV. Uncertantes on Parameters V. Mscellaneous VI. Goodness of Ft VII. Comparson

More information

OCR Statistics 1 Working with data. Section 2: Measures of location

OCR Statistics 1 Working with data. Section 2: Measures of location OCR Statstcs 1 Workng wth data Secton 2: Measures of locaton Notes and Examples These notes have sub-sectons on: The medan Estmatng the medan from grouped data The mean Estmatng the mean from grouped data

More information

A Constant-Factor Approximation Algorithm for Network Revenue Management

A Constant-Factor Approximation Algorithm for Network Revenue Management A Constant-Factor Approxmaton Algorthm for Networ Revenue Management Yuhang Ma 1, Paat Rusmevchentong 2, Ma Sumda 1, Huseyn Topaloglu 1 1 School of Operatons Research and Informaton Engneerng, Cornell

More information

Topics on the Border of Economics and Computation November 6, Lecture 2

Topics on the Border of Economics and Computation November 6, Lecture 2 Topcs on the Border of Economcs and Computaton November 6, 2005 Lecturer: Noam Nsan Lecture 2 Scrbe: Arel Procacca 1 Introducton Last week we dscussed the bascs of zero-sum games n strategc form. We characterzed

More information

>1 indicates country i has a comparative advantage in production of j; the greater the index, the stronger the advantage. RCA 1 ij

>1 indicates country i has a comparative advantage in production of j; the greater the index, the stronger the advantage. RCA 1 ij 69 APPENDIX 1 RCA Indces In the followng we present some maor RCA ndces reported n the lterature. For addtonal varants and other RCA ndces, Memedovc (1994) and Vollrath (1991) provde more thorough revews.

More information

Elton, Gruber, Brown, and Goetzmann. Modern Portfolio Theory and Investment Analysis, 7th Edition. Solutions to Text Problems: Chapter 9

Elton, Gruber, Brown, and Goetzmann. Modern Portfolio Theory and Investment Analysis, 7th Edition. Solutions to Text Problems: Chapter 9 Elton, Gruber, Brown, and Goetzmann Modern Portfolo Theory and Investment Analyss, 7th Edton Solutons to Text Problems: Chapter 9 Chapter 9: Problem In the table below, gven that the rskless rate equals

More information

Interval Estimation for a Linear Function of. Variances of Nonnormal Distributions. that Utilize the Kurtosis

Interval Estimation for a Linear Function of. Variances of Nonnormal Distributions. that Utilize the Kurtosis Appled Mathematcal Scences, Vol. 7, 013, no. 99, 4909-4918 HIKARI Ltd, www.m-hkar.com http://dx.do.org/10.1988/ams.013.37366 Interval Estmaton for a Lnear Functon of Varances of Nonnormal Dstrbutons that

More information

Parsing beyond context-free grammar: Tree Adjoining Grammar Parsing I

Parsing beyond context-free grammar: Tree Adjoining Grammar Parsing I Parsng beyond context-free grammar: Tree donng Grammar Parsng I Laura Kallmeyer, Wolfgang Maer ommersemester 2009 duncton and substtuton (1) Tree donng Grammars (TG) Josh et al. (1975), Josh & chabes (1997):

More information

Teaching Note on Factor Model with a View --- A tutorial. This version: May 15, Prepared by Zhi Da *

Teaching Note on Factor Model with a View --- A tutorial. This version: May 15, Prepared by Zhi Da * Copyrght by Zh Da and Rav Jagannathan Teachng Note on For Model th a Ve --- A tutoral Ths verson: May 5, 2005 Prepared by Zh Da * Ths tutoral demonstrates ho to ncorporate economc ves n optmal asset allocaton

More information

Testing for Omitted Variables

Testing for Omitted Variables Testng for Omtted Varables Jeroen Weese Department of Socology Unversty of Utrecht The Netherlands emal J.weese@fss.uu.nl tel +31 30 2531922 fax+31 30 2534405 Prepared for North Amercan Stata users meetng

More information

Elements of Economic Analysis II Lecture VI: Industry Supply

Elements of Economic Analysis II Lecture VI: Industry Supply Elements of Economc Analyss II Lecture VI: Industry Supply Ka Hao Yang 10/12/2017 In the prevous lecture, we analyzed the frm s supply decson usng a set of smple graphcal analyses. In fact, the dscusson

More information

A Bootstrap Confidence Limit for Process Capability Indices

A Bootstrap Confidence Limit for Process Capability Indices A ootstrap Confdence Lmt for Process Capablty Indces YANG Janfeng School of usness, Zhengzhou Unversty, P.R.Chna, 450001 Abstract The process capablty ndces are wdely used by qualty professonals as an

More information

Centre for International Capital Markets

Centre for International Capital Markets Centre for Internatonal Captal Markets Dscusson Papers ISSN 1749-3412 Valung Amercan Style Dervatves by Least Squares Methods Maro Cerrato No 2007-13 Valung Amercan Style Dervatves by Least Squares Methods

More information

AC : THE DIAGRAMMATIC AND MATHEMATICAL APPROACH OF PROJECT TIME-COST TRADEOFFS

AC : THE DIAGRAMMATIC AND MATHEMATICAL APPROACH OF PROJECT TIME-COST TRADEOFFS AC 2008-1635: THE DIAGRAMMATIC AND MATHEMATICAL APPROACH OF PROJECT TIME-COST TRADEOFFS Kun-jung Hsu, Leader Unversty Amercan Socety for Engneerng Educaton, 2008 Page 13.1217.1 Ttle of the Paper: The Dagrammatc

More information

Project Management Project Phases the S curve

Project Management Project Phases the S curve Project lfe cycle and resource usage Phases Project Management Project Phases the S curve Eng. Gorgo Locatell RATE OF RESOURCE ES Conceptual Defnton Realzaton Release TIME Cumulated resource usage and

More information

A DUAL EXTERIOR POINT SIMPLEX TYPE ALGORITHM FOR THE MINIMUM COST NETWORK FLOW PROBLEM

A DUAL EXTERIOR POINT SIMPLEX TYPE ALGORITHM FOR THE MINIMUM COST NETWORK FLOW PROBLEM Yugoslav Journal of Operatons Research Vol 19 (2009), Number 1, 157-170 DOI:10.2298/YUJOR0901157G A DUAL EXTERIOR POINT SIMPLEX TYPE ALGORITHM FOR THE MINIMUM COST NETWORK FLOW PROBLEM George GERANIS Konstantnos

More information

Taxation and Externalities. - Much recent discussion of policy towards externalities, e.g., global warming debate/kyoto

Taxation and Externalities. - Much recent discussion of policy towards externalities, e.g., global warming debate/kyoto Taxaton and Externaltes - Much recent dscusson of polcy towards externaltes, e.g., global warmng debate/kyoto - Increasng share of tax revenue from envronmental taxaton 6 percent n OECD - Envronmental

More information

3/3/2014. CDS M Phil Econometrics. Vijayamohanan Pillai N. Truncated standard normal distribution for a = 0.5, 0, and 0.5. CDS Mphil Econometrics

3/3/2014. CDS M Phil Econometrics. Vijayamohanan Pillai N. Truncated standard normal distribution for a = 0.5, 0, and 0.5. CDS Mphil Econometrics Lmted Dependent Varable Models: Tobt an Plla N 1 CDS Mphl Econometrcs Introducton Lmted Dependent Varable Models: Truncaton and Censorng Maddala, G. 1983. Lmted Dependent and Qualtatve Varables n Econometrcs.

More information

Introduction. Why One-Pass Statistics?

Introduction. Why One-Pass Statistics? BERKELE RESEARCH GROUP Ths manuscrpt s program documentaton for three ways to calculate the mean, varance, skewness, kurtoss, covarance, correlaton, regresson parameters and other regresson statstcs. Although

More information

Mode is the value which occurs most frequency. The mode may not exist, and even if it does, it may not be unique.

Mode is the value which occurs most frequency. The mode may not exist, and even if it does, it may not be unique. 1.7.4 Mode Mode s the value whch occurs most frequency. The mode may not exst, and even f t does, t may not be unque. For ungrouped data, we smply count the largest frequency of the gven value. If all

More information

Solution of periodic review inventory model with general constrains

Solution of periodic review inventory model with general constrains Soluton of perodc revew nventory model wth general constrans Soluton of perodc revew nventory model wth general constrans Prof Dr J Benkő SZIU Gödöllő Summary Reasons for presence of nventory (stock of

More information

Chapter 5 Student Lecture Notes 5-1

Chapter 5 Student Lecture Notes 5-1 Chapter 5 Student Lecture Notes 5-1 Basc Busness Statstcs (9 th Edton) Chapter 5 Some Important Dscrete Probablty Dstrbutons 004 Prentce-Hall, Inc. Chap 5-1 Chapter Topcs The Probablty Dstrbuton of a Dscrete

More information

Blocks of Coordinates, Stoch Progr, and Markets

Blocks of Coordinates, Stoch Progr, and Markets Blocks of Coordnates, Stoch Progr, and Markets Sjur Ddrk Flåm Dep. Informatcs, Unversty of Bergen, Norway Bergamo June 2017 jur Ddrk Flåm (Dep. Informatcs, Unversty Blocks of Bergen, of Coordnates, Norway)

More information

Cumulative Step-size Adaptation on Linear Functions

Cumulative Step-size Adaptation on Linear Functions Cumulatve Step-sze Adaptaton on Lnear Functons Alexandre Chotard, Anne Auger, Nkolaus Hansen To cte ths verson: Alexandre Chotard, Anne Auger, Nkolaus Hansen. Cumulatve Step-sze Adaptaton on Lnear Functons.

More information

Hedging Greeks for a portfolio of options using linear and quadratic programming

Hedging Greeks for a portfolio of options using linear and quadratic programming MPRA Munch Personal RePEc Archve Hedgng reeks for a of otons usng lnear and quadratc rogrammng Panka Snha and Archt Johar Faculty of Management Studes, Unversty of elh, elh 5. February 200 Onlne at htt://mra.ub.un-muenchen.de/20834/

More information