Spectral Graph Theory: Lecture 19
Fast Laplacian Solvers by Sparsification
Daniel A. Spielman, November 9, 2015

Disclaimer: These notes are not necessarily an accurate representation of what happened in class. The notes written before class say what I think I should say. I sometimes edit the notes after class to make them say what I wish I had said. There may be small mistakes, so I recommend that you check any mathematically precise statement before using it in your own work. These notes were last revised on November 9, 2015.

19.1 Overview

We will see how sparsification allows us to solve systems of linear equations in Laplacian matrices and their sub-matrices in nearly linear time. By nearly-linear, I mean time O(m log^c(n κ^{-1}) log ε^{-1}) for systems with m nonzero entries, n dimensions, condition number κ, and accuracy ε. This algorithm comes from [PS14].

19.2 Today's notion of approximation

In today's lecture, I will find it convenient to define matrix approximations slightly differently from previous lectures. Today, I define A ≈_ε B to mean

    e^{-ε} A ⪯ B ⪯ e^{ε} A.

Note that this relation is symmetric in A and B, and that for ε small, e^{ε} ≈ 1 + ε. The advantage of this definition is that A ≈_α B and B ≈_β C implies A ≈_{α+β} C.

19.3 The Idea

I begin by describing the idea behind the algorithm. This idea won't quite work. But, we will see how to turn it into one that does.
We will work with matrices that look like M = L + X where L is a Laplacian and X is a nonzero, nonnegative diagonal matrix. Such matrices are called M-matrices. A symmetric M-matrix is a matrix M with nonpositive off-diagonal entries such that M1 is nonnegative and nonzero.

We have encountered M-matrices before without naming them. If G = (V, E) is a graph, S ⊂ V, and G(S) is connected, then the submatrix of L_G indexed by rows and columns in S is an M-matrix. Algorithmically, the problems of solving systems of equations in Laplacians and symmetric M-matrices are equivalent.

The sparsification results that we learned for Laplacians translate over to M-matrices. Every M-matrix M can be written in the form X + L where L is a Laplacian and X is a nonnegative diagonal matrix. If L̃ ≈_ε L, then it is easy to show (too easy for homework) that X + L̃ ≈_ε X + L.

In Lecture 7, Lemma 7.3.1, we proved that if X has at least one nonzero entry and if L is connected, then X + L is nonsingular. We write such a matrix in the form M = D - A where D is positive diagonal and A is nonnegative, and note that its being nonsingular and positive semidefinite implies

    A ≺ D.    (19.1)

Using the Perron-Frobenius theorem, one can also show

    -D ≺ A.    (19.2)

Multiplying M by D^{-1/2} on either side, we obtain

    I - D^{-1/2} A D^{-1/2}.

Define

    B = D^{-1/2} A D^{-1/2},

and note that inequalities (19.1) and (19.2) imply that all eigenvalues of B have absolute value strictly less than 1.

It suffices to figure out how to solve systems of equations in I - B. One way to do this is to exploit the power series expansion:

    (I - B)^{-1} = I + B + B^2 + B^3 + ...

However, this series might need many terms to converge. We can figure out how many. If the largest eigenvalue of B is (1 - κ) < 1, then we need at least 1/κ terms. We can write a series with fewer terms if we express it as a product instead of as a sum:

    (I - B)^{-1} = ∏_{j ≥ 0} (I + B^{2^j}).

To see why this works, look at the first few terms:

    (I + B)(I + B^2)(I + B^4) = (I + B + B^2 + B^3)(I + B^4) = (I + B + B^2 + B^3) + B^4 (I + B + B^2 + B^3).
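The sum-versus-product comparison is easy to check numerically. The following sketch (NumPy, on a small random weighted graph standing in for our M-matrix; the matrix here is illustrative, not from the lecture) confirms that t factors of the product reproduce 2^t terms of the power series:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a small B = D^{-1/2} A D^{-1/2} with spectral radius < 1,
# from a random nonnegative symmetric A and a strictly dominant D.
n = 6
A = np.abs(rng.standard_normal((n, n)))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
D = np.diag(A.sum(axis=1) + 0.5)       # the +0.5 plays the role of X, making D - A nonsingular
Dhalf = np.diag(np.diag(D) ** -0.5)
B = Dhalf @ A @ Dhalf                   # all |eigenvalues| < 1 by diagonal dominance

I = np.eye(n)

# Truncated power series: I + B + ... + B^(2^t - 1), which takes 2^t terms.
t = 6
S = np.zeros((n, n))
P = I.copy()
for _ in range(2 ** t):
    S += P
    P = P @ B

# Product form: (I + B)(I + B^2)(I + B^4)..., which takes only t factors.
Q = I.copy()
Bp = B.copy()
for _ in range(t):
    Q = Q @ (I + Bp)
    Bp = Bp @ Bp                        # repeated squaring: B, B^2, B^4, ...

print(np.allclose(S, Q))                # True: same truncation of (I - B)^{-1}
```

The point of the experiment is the cost: the loop computing S does 2^t matrix multiplications, while the loop computing Q does O(t).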
We only need O(log κ^{-1}) terms of this product to obtain a good approximation of (I - B)^{-1}.

The obstacle to quickly applying a series like this is that the matrices I + B^{2^j} are probably dense. We know how to solve this problem: we can sparsify them! I'm not saying that flippantly. We actually do know how to sparsify matrices of this form. But, simply sparsifying the matrices I + B^{2^j} does not solve our problem because approximation is not preserved by products. That is, even if A ≈_ε Â and B ≈_ε B̂, ÂB̂ could be a very poor approximation of AB. In fact, since the product ÂB̂ is not necessarily symmetric, we haven't even defined what it would mean for it to approximate AB.

19.4 A symmetric expansion

We will now derive a way of expanding (I - B)^{-1} that is amenable to approximation. We begin with an alternate derivation of the series we saw before. Note that

    (I - B)(I + B) = (I - B^2),

and so

    (I - B) = (I - B^2)(I + B)^{-1}.

Taking the inverse of both sides gives

    (I - B)^{-1} = (I + B)(I - B^2)^{-1}.

We can then apply the same expansion to (I - B^2)^{-1} to obtain

    (I - B)^{-1} = (I + B)(I + B^2)(I - B^4)^{-1}.

What we need is a symmetric expansion. We use

    (I - B)^{-1} = (1/2)[I + (I + B)(I - B^2)^{-1}(I + B)].    (19.3)

We will verify this by multiplying the right hand side by (I - B):

    (I + B)(I - B^2)^{-1}(I + B)(I - B) = (I + B)(I - B^2)^{-1}(I - B^2) = I + B;

so

    (1/2)[I + (I + B)(I - B^2)^{-1}(I + B)](I - B) = (1/2)[(I - B) + (I + B)] = I.

This expression for (I - B)^{-1} plays nicely with matrix approximations. If M_1 ≈_ε (I - B^2), then you can show

    (I - B)^{-1} ≈_ε (1/2)[I + (I + B) M_1^{-1} (I + B)].

If we can apply M_1^{-1} quickly and if B is sparse, then we can quickly approximate (I - B)^{-1}. You may now be wondering how we will construct such an M_1. The answer, in short, is recursively.
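Identity (19.3) is worth a quick numerical sanity check. A minimal NumPy sketch, on a random symmetric B scaled to have spectral radius 1/2:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random symmetric B with all |eigenvalues| < 1.
n = 5
B = rng.standard_normal((n, n))
B = (B + B.T) / 2
B /= 2 * np.max(np.abs(np.linalg.eigvalsh(B)))   # spectral radius now 1/2

I = np.eye(n)
lhs = np.linalg.inv(I - B)
rhs = 0.5 * (I + (I + B) @ np.linalg.inv(I - B @ B) @ (I + B))
print(np.allclose(lhs, rhs))                     # True
```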
19.5 D and A

Unfortunately, we are going to need to stop writing matrices in terms of I and B, and return to writing them in terms of D and A. The reason this is unfortunate is that it makes for longer expressions. The analog of (19.3) is

    (D - A)^{-1} = (1/2)[D^{-1} + (I + D^{-1}A)(D - AD^{-1}A)^{-1}(I + AD^{-1})].    (19.4)

In order to be able to work with this expression inductively, we need to check that the middle matrix is an M-matrix.

Lemma 19.5.1. If D is a diagonal matrix and A is a nonnegative matrix so that M = D - A is an M-matrix, then M_1 = D - AD^{-1}A is also an M-matrix.

Proof. As the off-diagonal entries of this matrix are symmetric and nonpositive, it suffices to prove that M_1 1 is nonnegative and nonzero. To compute the row sums, set d = D1 and a = A1, and note that d - a is nonnegative and nonzero. For M_1, we have

    (D - AD^{-1}A)1 = d - AD^{-1}a ⪰ d - A1 = d - a,

which is nonnegative and not exactly zero.

We will apply transformations like this many times during our algorithm. To keep track of progress, I say that (D, A) is an (α, β)-pair if

a. D is positive diagonal,

b. A is nonnegative (and can have diagonal entries), and

c. A ⪯ αD and -βD ⪯ A.

For our initial matrix M = D - A, we know that there is some number κ > 0 for which (D, A) is a (1 - κ, 1 - κ)-pair. At the end of our recursion we will seek a (1/4, 1/4)-pair. When we have such a pair, we can just approximate D - A by D.

Lemma 19.5.2. If M = D - A and (D, A) is a (1/4, 1/4)-pair, then M ≈_{1/3} D.
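Both identity (19.4) and Lemma 19.5.1 can be checked on a toy M-matrix. In this NumPy sketch the matrix is random and purely illustrative; note that A is allowed to keep its diagonal entries, as in the definition of an (α, β)-pair below:

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy M-matrix M = D - A: nonnegative symmetric A, D = row sums plus a positive X.
n = 5
A = np.abs(rng.standard_normal((n, n)))
A = (A + A.T) / 2                  # diagonal entries of A are allowed
D = np.diag(A.sum(axis=1) + 1.0)   # X = I, so M @ 1 = 1 > 0
M = D - A

Dinv = np.linalg.inv(D)
M1 = D - A @ Dinv @ A              # the "middle matrix" of (19.4)

# Check identity (19.4).
I = np.eye(n)
lhs = np.linalg.inv(M)
rhs = 0.5 * (Dinv + (I + Dinv @ A) @ np.linalg.inv(M1) @ (I + A @ Dinv))
print(np.allclose(lhs, rhs))       # True

# Check Lemma 19.5.1: M1 has nonpositive off-diagonals and nonnegative row sums.
off = M1 - np.diag(np.diag(M1))
print(np.all(off <= 1e-12), np.all(M1 @ np.ones(n) >= -1e-12))   # True True
```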
Proof. We have

    M = D - A ⪯ (1 + 1/4)D ⪯ e^{1/4} D,

and

    M = D - A ⪰ D - (1/4)D = (3/4)D ⪰ e^{-1/3} D.

Lemma 19.5.3. If (D, A) is an (α, α)-pair, then (D, AD^{-1}A) is an (α^2, 0)-pair.

Proof. From Lecture 14, Lemma 3.1, we know that the condition of the lemma is equivalent to the assertion that all eigenvalues of D^{-1}A have absolute value at most α, and that the conclusion is equivalent to the assertion that all eigenvalues of D^{-1}AD^{-1}A lie between 0 and α^2, which is immediate as they are the squares of the eigenvalues of D^{-1}A.

So, if we start with matrices D and A that are a (1 - κ, 1 - κ)-pair, then after applying this transformation approximately log κ^{-1} + 2 times we obtain a (1/4, 0)-pair. But, the matrices in this pair could be dense. To keep them sparse, we need to figure out how approximating D - A degrades its quality.

Lemma 19.5.4. If ε ≤ 1/3,

a. (D, A) is a (1 - κ, 0)-pair,

b. D̂ - Â ≈_ε D - A, and

c. D̂ ≈_ε D,

then (D̂, Â) is a (1 - κe^{-2ε}, 3ε)-pair.

Proof. First observe that (1 - κ)D ⪰ A implies

    D - A ⪰ κD.

Then, compute

    D̂ - Â ⪰ e^{-ε}(D - A) ⪰ e^{-ε} κ D ⪰ e^{-2ε} κ D̂,

which gives Â ⪯ (1 - κe^{-2ε}) D̂. For the other side, compute

    e^{2ε} D̂ ⪰ e^{ε} D ⪰ e^{ε}(D - A) ⪰ D̂ - Â.

For ε ≤ 1/3, 3ε ≥ e^{2ε} - 1, so

    3ε D̂ ⪰ (e^{2ε} - 1) D̂ ⪰ -Â.
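The eigenvalue-squaring in the proof of Lemma 19.5.3 can be seen directly: after normalizing by D^{-1/2} on both sides, the eigenvalues of AD^{-1}A are exactly the squares of those of A. A small NumPy sketch on an illustrative random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)

# Random nonnegative symmetric A and strictly dominant diagonal D.
n = 6
A = np.abs(rng.standard_normal((n, n)))
A = (A + A.T) / 2
D = np.diag(A.sum(axis=1) + 1.0)

# Eigenvalues of D^{-1}A equal those of the symmetric D^{-1/2} A D^{-1/2}.
Dh = np.diag(np.diag(D) ** -0.5)
ev = np.linalg.eigvalsh(Dh @ A @ Dh)
alpha = np.max(np.abs(ev))          # so (D, A) is an (alpha, alpha)-pair

# Eigenvalues for the squared matrix A D^{-1} A are the squares: in [0, alpha^2].
ev2 = np.linalg.eigvalsh(Dh @ (A @ np.linalg.inv(D) @ A) @ Dh)
print(np.allclose(np.sort(ev2), np.sort(ev ** 2)))   # True
print(alpha < 1, np.all(ev2 >= -1e-12))              # True True
```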
It remains to confirm that sparsification satisfies the requirements of this lemma. The reason this might not be obvious is that we allow A to have nonnegative diagonal elements. While this does not interfere with condition b, you might be concerned that it would interfere with condition c. It need not.

Let C be the diagonal of A, let L be the Laplacian of the graph with adjacency matrix A - C, and set X so that X + L = D - A. Let L̃ be a sparse ε-approximation of L. By computing the quadratic form in elementary unit vectors, you can check that the diagonals of L and L̃ approximate each other. If we now write L̃ = D̃ - Ã, where à has zero diagonal, and set

    D̂ = X + D̃ + C  and  Â = à + C,

you can now check that D̂ and  satisfy the requirements of Lemma 19.5.4.

You might wonder why we bother to keep diagonal elements in a matrix like A. It seems simpler to get rid of them. However, we want (D, A) to be an (α, β)-pair, and subtracting C from both of them would make β worse. This might not matter too much, as we have good control over β. But, I don't yet see a nice way to carry out a proof that exploits this.

19.6 Sketch of the construction

We begin with an M-matrix M_0 = D_0 - A_0. Since this matrix is nonsingular, there is a κ_0 > 0 so that (D_0, A_0) is a (1 - κ_0, 1 - κ_0)-pair. We now know that the matrix D_0 - A_0 D_0^{-1} A_0 is an M-matrix and that (D_0, A_0 D_0^{-1} A_0) is a ((1 - κ_0)^2, 0)-pair. Define κ_1 so that 1 - κ_1 = (1 - κ_0)^2, and note that κ_1 is approximately 2κ_0. Lemma 19.5.4 and the discussion following it tell us that there is a (1 - κ_1 e^{-2ε}, 3ε)-pair (D_1, A_1) so that

    D_1 - A_1 ≈_ε D_0 - A_0 D_0^{-1} A_0

and so that A_1 has O(n/ε^2) nonzero entries.

Continuing inductively for some number k of steps, we find (1 - κ_i, 3ε)-pairs (D_i, A_i) so that M_i = D_i - A_i has O(n/ε^2) nonzero entries, and

    M_i ≈_ε D_{i-1} - A_{i-1} D_{i-1}^{-1} A_{i-1}.

For the i such that κ_i is small, κ_{i+1} is approximately twice κ_i. So, for k = 2 + log_2(1/κ_0) and ε close to zero, we can guarantee that (D_k, A_k) is a (1/4, 1/4)-pair.
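Ignoring sparsification entirely (so every level reuses the same D, and the matrices stay dense), the recursion sketched above can be written out in a few lines. The helper names build_chain and apply_inverse are mine, not from the lecture; since each level then applies identity (19.4) exactly, the only remaining approximation in this toy version is the base case M_k ≈ D:

```python
import numpy as np

def build_chain(D, A, k):
    """Return [A_0, ..., A_k] with A_{i+1} = A_i D^{-1} A_i (dense; no sparsification)."""
    Dinv = np.linalg.inv(D)
    chain = [A]
    for _ in range(k):
        A = A @ Dinv @ A
        chain.append(A)
    return chain

def apply_inverse(D, chain, b, level=0):
    """Apply an approximation of (D - A_level)^{-1} to b via the expansion (19.4)."""
    Dinv = np.linalg.inv(D)
    if level == len(chain) - 1:
        return Dinv @ b                       # base case: approximate M_k by D
    A = chain[level]
    I = np.eye(len(b))
    # (D - A_{level+1})^{-1} applied recursively plays the role of the middle inverse.
    y = apply_inverse(D, chain, (I + A @ Dinv) @ b, level + 1)
    return 0.5 * (Dinv @ b + (I + Dinv @ A) @ y)

rng = np.random.default_rng(4)
n = 6
A0 = np.abs(rng.standard_normal((n, n)))
A0 = (A0 + A0.T) / 2
D0 = np.diag(A0.sum(axis=1) + 0.5)            # a toy nonsingular M-matrix D0 - A0

b = rng.standard_normal(n)
x = apply_inverse(D0, build_chain(D0, A0, k=12), b)
print(np.linalg.norm((D0 - A0) @ x - b) / np.linalg.norm(b))   # tiny residual
```

Because each squaring drives the top eigenvalue toward 0 doubly exponentially, k = 12 levels already make the base-case error negligible here; the real algorithm additionally sparsifies each A_i, which is where the ε-accounting of Lemma 19.5.4 comes in.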
We now see how this construction allows us to approximately solve systems of equations in D_0 - A_0, and how we must set ε for it to work. For every 0 ≤ i < k, we have

    (D_i - A_i)^{-1} = (1/2)[D_i^{-1} + (I + D_i^{-1} A_i)(D_i - A_i D_i^{-1} A_i)^{-1}(I + A_i D_i^{-1})]
                     ≈_ε (1/2)[D_i^{-1} + (I + D_i^{-1} A_i)(D_{i+1} - A_{i+1})^{-1}(I + A_i D_i^{-1})],

and

    (D_k - A_k)^{-1} ≈_{1/3} D_k^{-1}.

By substituting through each of these approximations, we obtain solutions to systems of equations in D_0 - A_0 with accuracy 1/3 + kε. So, we should set kε = 1/3, and thus ε = 1/(3(2 + log_2 κ^{-1})). The dominant cost of the resulting algorithm will be the multiplication of vectors by 2k matrices of O(n/ε^2) entries, with a total cost of O(n (log_2(1/κ))^3).

19.7 Making the construction efficient

In the above construction, I just assumed that appropriate sparsifiers exist, rather than constructing them efficiently. To construct them efficiently, we need two ideas. The first is that we need to be able to quickly approximate effective resistances so that we can use the sampling algorithm from Lecture 17. The second is to observe that we do not actually want to form the matrix AD^{-1}A before sparsifying it, as that could take too long. Instead, we express it as a sum of cliques that have succinct descriptions, and we form the sum of approximations of each of those.

19.8 Improvements

The fastest known algorithms for solving systems of equations in Laplacians run in time O(m log^{1/2} n log ε^{-1}) [CKM+14].

The algorithm I have presented here can be substantially improved by combining it with Cholesky factorization. This both gives an efficient parallel algorithm, and proves the existence of an approximate inverse for every M-matrix that has a linear number of nonzeros [LPS15].

References

[CKM+14] Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub W. Pachocki, Richard Peng, Anup B. Rao, and Shen Chen Xu. Solving SDD linear systems in nearly m log^{1/2} n time. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC '14, pages 343-352, New York, NY, USA, 2014. ACM.

[LPS15] Yin Tat Lee, Richard Peng, and Daniel A. Spielman. Sparsified Cholesky solvers for SDD linear systems. CoRR, abs/1506.08204, 2015.

[PS14] Richard Peng and Daniel A. Spielman. An efficient parallel solver for SDD linear systems. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 333-342, 2014.