Supplement to Adaptive Estimation of High Dimensional Partially Linear Model

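Fang Han, Zhao Ren, and Yuxin Zhu

May 6, 2017

Department of Statistics, University of Washington, Seattle, WA 98195, USA; e-mail: fanghan@uw.edu
Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15260, USA; email: zren@pitt.edu
Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA; e-mail: yuzhu@jhsph.edu

This supplementary material provides the technical proofs as well as some auxiliary lemmas. For almost all proof subsections in Section A2, we first restate the target theorem or lemma with more explicit dependence among all relevant constants, and then provide the details of its proof.

A1 Additional notation

We write $\mathbb{B}^p = \{x \in \mathbb{R}^p : \|x\|_2 \le 1\}$ and $\mathbb{S}^{p-1} = \{x \in \mathbb{R}^p : \|x\|_2 = 1\}$. Let $e_j \in \mathbb{R}^p$ be a vector that has 1 at the $j$-th position and 0 elsewhere.

A2 Technical proofs

A2.1 Proof of Theorem 2.1

Proof. By (2.1), we have $\|\tilde\theta^*_{h_n} - \theta^*\|_2 \le \rho_n$. So it suffices to show that

$$\|\hat\theta_{h_n} - \tilde\theta^*_{h_n}\|_2^2 \le 9\tilde s_n\lambda_n^2/\kappa_1^2$$

holds with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$ whenever $\lambda_n \le \kappa_1 r/(3\tilde s_n^{1/2})$. We split the rest of the proof into two main steps.

Step I. Denote $\hat\Delta = \hat\theta_{h_n} - \tilde\theta^*_{h_n}$. Recall the definition of the sets $\tilde S_n$ and $C_{\tilde S_n}$, and further define the function

$$F(\Delta) = \hat\Gamma_n(\tilde\theta^*_{h_n} + \Delta, h_n) - \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n) + \lambda_n\big(\|\tilde\theta^*_{h_n} + \Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1\big).$$

For the first step, we show that if $F(\Delta) > 0$ for all $\Delta \in C_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$, then $\|\hat\Delta\|_2 \le \eta$. To this end, we first show that

$$\hat\Delta \in C_{\tilde S_n}. \qquad (A2.1)$$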

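Applying the triangle inequality and some algebra, we obtain

$$\|\tilde\theta^*_{h_n} + \hat\Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1 \ge \|\hat\Delta_{\tilde S_n^c}\|_1 - \|\hat\Delta_{\tilde S_n}\|_1. \qquad (A2.2)$$

We also have, with probability at least $1 - \epsilon_{1,n}$,

$$\hat\Gamma_n(\tilde\theta^*_{h_n} + \hat\Delta, h_n) - \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n) \ge \langle \nabla\hat\Gamma_n(\tilde\theta^*_{h_n}, h_n), \hat\Delta\rangle \ge -\|\nabla\hat\Gamma_n(\tilde\theta^*_{h_n}, h_n)\|_\infty \|\hat\Delta\|_1 \ge -\frac{\lambda_n}{2}\big(\|\hat\Delta_{\tilde S_n}\|_1 + \|\hat\Delta_{\tilde S_n^c}\|_1\big), \qquad (A2.3)$$

where the first inequality is by convexity of $\hat\Gamma_n(\theta, h_n)$ in $\theta$ as assumed in Assumption 3, the second is by Hölder's inequality, and the last is by Assumption 2. Combining (A2.2) and (A2.3), and using the fact that $F(\hat\Delta) \le 0$, we have

$$0 \ge \frac{\lambda_n}{2}\big(\|\hat\Delta_{\tilde S_n^c}\|_1 - 3\|\hat\Delta_{\tilde S_n}\|_1\big),$$

thus proving (A2.1).

Next, we assume that $\|\hat\Delta\|_2 > \eta$. Then, because $\hat\Delta \in C_{\tilde S_n}$ and $C_{\tilde S_n}$ is star-shaped, there exists some $t \in (0, 1)$ such that $t\hat\Delta \in C_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$. However, by convexity of $F(\Delta)$,

$$F(t\hat\Delta) \le tF(\hat\Delta) + (1 - t)F(0) = tF(\hat\Delta) \le 0,$$

which contradicts the assumption that $F > 0$ on this set. By contradiction, we complete the proof of the first step.

Step II. For the second step, we show that under Assumptions 1-3 we have $F(\Delta) > 0$ for all $\Delta \in C_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$ for some appropriately chosen $\eta$, and then complete the proof. Take $\eta = 3\tilde s_n^{1/2}\lambda_n/\kappa_1$ and $\lambda_n \le \kappa_1 r/(3\tilde s_n^{1/2})$, so that $\eta \le r$. Combining Assumption 3 and (A2.2), for any $\Delta \in C_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$ we have, with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$,

$$
\begin{aligned}
F(\Delta) &\ge \langle \nabla\hat\Gamma_n(\tilde\theta^*_{h_n}, h_n), \Delta\rangle + \kappa_1\|\Delta\|_2^2 + \lambda_n\big(\|\tilde\theta^*_{h_n} + \Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1\big) \\
&\ge -\|\nabla\hat\Gamma_n(\tilde\theta^*_{h_n}, h_n)\|_\infty\|\Delta\|_1 + \kappa_1\|\Delta\|_2^2 + \lambda_n\big(\|\Delta_{\tilde S_n^c}\|_1 - \|\Delta_{\tilde S_n}\|_1\big) \\
&\ge -\lambda_n\|\Delta\|_1/2 + \kappa_1\|\Delta\|_2^2 + \lambda_n\big(\|\Delta_{\tilde S_n^c}\|_1 - \|\Delta_{\tilde S_n}\|_1\big) \\
&\ge \kappa_1\|\Delta\|_2^2 - 3\lambda_n\tilde s_n^{1/2}\|\Delta\|_2/2,
\end{aligned}
$$

where the first inequality is by Assumption 3, the second is by Hölder's inequality and (A2.2), the third is by Assumption 2, and the last is due to the fact that $\|\Delta_{\tilde S_n}\|_1 \le \tilde s_n^{1/2}\|\Delta_{\tilde S_n}\|_2 \le \tilde s_n^{1/2}\|\Delta\|_2$. Then we have

$$F(\Delta) \ge \kappa_1\eta^2 - 3\tilde s_n^{1/2}\lambda_n\eta/2 = 9\tilde s_n\lambda_n^2/(2\kappa_1) > 0,$$

which, using the result from Step I, implies that $\|\hat\Delta\|_2^2 \le \eta^2 = 9\tilde s_n\lambda_n^2/\kappa_1^2$. Combining this with the bound $\|\tilde\theta^*_{h_n} - \theta^*\|_2 \le \rho_n$ from (2.1), we have

$$\|\hat\theta_{h_n} - \theta^*\|_2^2 \le \frac{18\tilde s_n\lambda_n^2}{\kappa_1^2} + 2\rho_n^2$$

with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$. This completes the proof of Theorem 2.1.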
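The scaling argument at the end of Step I can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes a least-squares loss in place of the general convex loss, and the names `make_F` and `in_cone`, the simulated data and the chosen direction `delta` are all hypothetical. It verifies that a convex $F$ with $F(0) = 0$ satisfies $F(t\Delta) \le tF(\Delta)$ for $t \in [0, 1]$, and that the cone is closed under scaling by $t$.

```python
import numpy as np

def make_F(X, y, theta_ref, lam):
    """Build F(delta) = G(theta_ref + delta) - G(theta_ref)
    + lam * (||theta_ref + delta||_1 - ||theta_ref||_1),
    with G a least-squares loss (an assumed stand-in for the convex loss)."""
    def G(theta):
        return 0.5 * np.mean((y - X @ theta) ** 2)

    def F(delta):
        l1_gap = np.abs(theta_ref + delta).sum() - np.abs(theta_ref).sum()
        return G(theta_ref + delta) - G(theta_ref) + lam * l1_gap

    return F

def in_cone(delta, support):
    """Check the cone condition ||delta_{S^c}||_1 <= 3 ||delta_S||_1."""
    mask = np.zeros(delta.shape[0], dtype=bool)
    mask[support] = True
    return np.abs(delta[~mask]).sum() <= 3.0 * np.abs(delta[mask]).sum()

rng = np.random.default_rng(0)
n, p = 50, 8
X = rng.normal(size=(n, p))
theta_ref = np.zeros(p)
theta_ref[:2] = [1.0, -0.5]          # sparse reference point, support {0, 1}
y = X @ theta_ref + 0.1 * rng.normal(size=n)
F = make_F(X, y, theta_ref, lam=0.1)

# A direction inside the cone: off-support l1 mass 0.9, on-support l1 mass 3.
delta = np.array([2.0, -1.0, 0.3, 0.2, 0.1, 0.0, -0.2, 0.1])
ts = np.linspace(0.0, 1.0, 11)
gaps = np.array([t * F(delta) - F(t * delta) for t in ts])   # should be >= 0
cone_flags = [in_cone(t * delta, [0, 1]) for t in ts]        # cone is star-shaped
```

Every entry of `gaps` is nonnegative up to rounding, which is the inequality $F(t\hat\Delta) \le tF(\hat\Delta)$ used to reach the contradiction.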

A. Proo o Theorem 3.1 I the sequel with a slight abuse o otatio we use a equivalet represetatio o Assumptio 1 or writig P { U k E[U k A{logp/ 1/ or all k [p 1 ɛ to replace 3. otig that we assume p >. Hereater we also slight abuse o otatio ad do ot distiguish logp/ rom log p/. Theorem A.1 Theorem 3.1. Assume Assumptio 11 holds with γ = 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A {logp/ 1/ + 8κ xm ζh where A ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ u. or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305s. The uder Assumptios 4-1 we have β h β 88sλ Ml κ l with probability at least 1 1.54 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. See Proo o Theorem 3.. 5 3

A.3 Proo o Theorem 3. Theorem A. Theorem 3.. Assume Assumptio 11 holds with a geeral γ 0 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A {logp/ 1/ + 8κ xm ζh γ where A ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ u. or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305s. The uder Assumptios 4-1 we have β h β 88sλ Ml κ l with probability at least 1 1.54 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. 5 4

I additio to 3.1 deote 1 U 1k = i<j 1 U k = ad observe that i<j 1 Wij K Xijk ũ ij h h 1 Wij K Xijk XT h h ij βh β k L β { U 1k E[U 1k + U k E[U k + E[U k A.4 where U k is deied i 3.1. Apply Lemma A.0 o D i = X ik u i W i with coditios o lemma satisied by Assumptios 5 6 9 ad 10 ad the we have P{ U 1k E[U 1k A {logp/ 1/ 6.77 exp{ c + 1 log p A.5 or positive absolute costat c ad A as deied i A.48 ad whe > max { 16c + c + 1{logp 3 /3 3. Apply Lemma 3. o Z = X ijk XT ij β h β with coditios o lemma satisied by Assumptios 5 6 9 ad 11 ad the we have E[U k E [ X ijk XT ij β h β W = 0 M + MMK C 0 E [ X ijk XT ij β h β κ xm + MM K C 0 ζh γ. Combiig A.4-A.6 ad Assumptio 1 we have P { or ay k [p k L β A + A {logp/ 1/ + 4κ xm + MM K C 0 ζh γ 1 6.77 exp c log p p ɛ A.6 or positive absolute costat c ad whe we appropriately take bouded rom below. Assume λ 4A + A {logp/ 1/ + 8κ xm + MM K C 0 ζh γ which veriies Assumptio. We veriy Assumptio 3 by applyig Corollary 3. ad complete the proo by Theorem.1. A.4 Proo o Theorem 3.3 Theorem A.3 Theorem 3.3. Assume Assumptio 11 holds with a geeral γ [1/4 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh where A ={16 3M 1 + c 1 + 4 3C1 M 1/ K 1 1 1 + c 1 + 8C 1 + c + 8C 3 M 1 K M 1 K 1 1 1 + c 3 + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u + Cκ x η = E [ X XT W = 0. 5

Here C 1... C 4 are as deied i 3.4 C > ζ C γ 0 ad c > 0 are some absolute costats ad M = M + MM K C 0. Suppose we have { > max C ζ C γ 0 s logp 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305{s + ζ h γ / logp. The uder Assumptios 4-6 8-13 we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1 19.31 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h. We take θ h = β h such that or each j [p The uder Assumptio 11 we have β h j = { β hj i β h j > {logp/1/ ; 0 i otherwise. ρ s logp/ + ζ h γ s s + ζ h γ logp. 5 A.7 A.8 We veriy Assumptio by applyig Lemma A.4 below with A = A + A veriy Assumptio 3 by applyig Corollary 3. uder Assumptio 13 ad complete the proo by Theorem.1. 6
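The thresholded surrogate used in this proof can be illustrated with a few lines of code. This is a minimal sketch under one assumption: that the garbled display above reads "keep the $j$-th coordinate of $\beta^*_{h_n}$ if its magnitude exceeds $\{\log p/n\}^{1/2}$ and set it to 0 otherwise". The function name `hard_threshold` and the example vector are hypothetical.

```python
import numpy as np

def hard_threshold(beta, n):
    """Zero out entries of beta whose magnitude is at most sqrt(log(p) / n).

    Returns the thresholded vector and the threshold that was used.
    """
    p = beta.shape[0]
    tau = np.sqrt(np.log(p) / n)
    return np.where(np.abs(beta) > tau, beta, 0.0), tau

# Hypothetical approximately sparse vector: two large and three tiny entries.
beta_h = np.array([2.0, -0.5, 0.01, -0.02, 0.0, 0.0, 0.03, 0.0, 0.0, 0.0])
beta_tilde, tau = hard_threshold(beta_h, n=100)

sq_err = np.sum((beta_tilde - beta_h) ** 2)        # squared approximation error
n_dropped = np.count_nonzero(beta_h) - np.count_nonzero(beta_tilde)
```

Each dropped coordinate contributes at most `tau ** 2` to `sq_err`, which is the elementary fact behind trading a larger sparsity level for a smaller approximation error in the bound above.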

Lemma A.4. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. Deote η = E [ X XT W = 0. We also take λ 4A + A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh where A ad A are as speciied i A.48 ad C > ζ C γ 0 is some positive absolute costats. Suppose we have > max { C ζ C γ 0 s logp 64c + c + 1{logp 3 /3 3 or positive absolute costat c > 0. The uder Assumptios 4-6 8-1 we have P k L β h h λ or all k [p 1 13.54 exp c log p ɛ p. Proo o Lemma A.4. I additio to 3.1 deote 1 1 U 1k = K h i<j 1 U k = i<j 1 U 3k = i<j Wij h Xijk ũ ij 1 Wij K Xijk XT h h ij β β h 1 Wij K Xijk XT h h ij βh β h ad observe that k L β h h U 1k E[U 1k + U k E[U k + U k E[U k + E[U 3k A.9 where i decomposig the let had side we have utilized the act that E[ k L β h h = 0. Result o A.0 holds thus boudig U 1k E[U 1k i.e. P { U 1k E[U 1k A {logp/ 1/ 6.77 exp{ c + 1 log p. A.10 We boud the rest o the compoets o the right had side o the last display. We have β β h s logp/+ζ h γ < C or some positive absolute costat C > ζ C γ 0 whe > C ζ C γ 0 s logp. Apply Lemma A.0 o D i = X ik Xi Tβ β h W i with coditios o lemma satisied by Assumptios 5 6 9 ad that β β h < C ad we have P { U k E[U k A {logp/ 1/ 6.77 exp{ c + 1 log p A.11 or positive costats A ad c ad whe we assume > max { 64c + c + 1{logp 3 /3 3. Here A is as speciied i A.48. Apply Lemma 3.3 with coditios o lemma satisied by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad we have E[U 3k ME [ X ijk XT ij β h β h Wij = 0 + MM K h E [ X ijk XT ij β h β h Mη {logp/ 1/ + MM K C 1/ κ xh A.1 where the secod iequality is due to Cauchy-Schwarz ad Assumptio 9 Lemmas A.17 ad A.18. 7

Combiig A.9-A.1 ad Assumptio 1 we have P { or ay k [p k L β h h { A + A + A + Mη { logp 1/ + 4MM K C 1/ κ xh 1 13.54p exp{ c + 1 log p ɛ p or positive absolute costat c ad whe we appropriately take bouded rom below. Here A ad A are as speciied i A.48. Assume λ 4A +A +A+Mη {logp/ 1/ +8MM K C 1/ κ xh. This completes the proo. A.5 Proo o Theorem 3.4 Theorem A.5 Theorem 3.4. Assume h C 0 or positive costat C 0 ad that h 4MM K κ x 1. Uder Assumptios 4-6 7 8-9 ad 14 ad whe g is L α-hölder or α 1 g has bouded support whe α > 1 we have where { ζ = max 4 L α MM K + MM K Eũ / β h β ζh where L α is the Lipschitz costat or g L α = L whe α = 1. 1/ 16κ x M + MM K C 0 1/ L αmm K Proo. Reer to Proo o Theorem 3.5 whe g is L 1-Hölder takig M g = L ad M d = M a = 0 i which case Assumptio 15 is ot eeded. Note that higher-order Hölder with compact support implies L 1-Hölder. Thus we complete the proo. A.6 Proo o Theorem 3.5 Theorem A.6. Assume h C 0 or positive costat C 0 ad that h 4MM K κ x 1. Uder Assumptios 4-6 7 8-9 ad 14-15 we have where β h β ζh γ { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. Proo o Theorem 3.5. We prove the lemma i three steps. Step I. We show that L 0 βh L 0β is lower bouded or L 0 β = E [ Ỹ X T β W = 0 0. By Assumptios 8 ad 7 W we have L 0 β [ λ mi β = λ mi E X XT W = 0 W 0. 8

Thereore or some β t = β h + tβ β h t [0 1 we have L 0 β h L 0β = 1 β h β T L 0 β β β=βt β h β β h β. Step II. We show that L h β L 0 β is upper bouded. Observe that [ 1 L h β L 0 β E h K W { h X T β β E [ { X T β β W = 0 W 0 + E h K Wij {gwi gw j h [ 1 + E h K W ũ E [ ũ W = 0 h W 0 [ 1 + E h K Wij XT β β { gw i gw j. h Ad we boud each compoet o the right had side o above iequality. By Taylor s expasio we have [ 1 E h K W { h X T β β E [ { X T β β W = 0 W 0 1 w = h K v h W XT β β w v dw df XT β β v = = v W XT β β 0 v df XT β β v + W XT β β Kwv { W XT β β wh v W XT β β 0 v dw df XT β β v Kwv { W XT β β w v wh w 0v w v τwhv w h dw df XT β β v w A.13 where because W X T β β ad W X T β β are idetically distributed we have Kwv { W XT β β w v wh dw df w 0v XT β β v = Kwv { W XT β β w v + W w v XT β β wh dw df w 0v w 0 v XT β β v =0. 0 Thereore usig Assumptios 5 6 Lemmas A.15 ad A.16 ad 14 we urther have [ 1 E h K W { XT β β E [{ h XT β β W = 0 W 0 = Kwv { W w v XT β β τwhv w w h dw df XT β β v MM K E [ { X T β β h MM K κ x β β h. A.14 9

Usig a idetical argumet by Assumptios 5 6 Lemmas A.15 ad A.16 ad iite secod momet assumptio E[ũ < we have [ 1 E h K W ũ E[ũ W = 0 h 0 W MM K E[ũ h. A.15 By Assumptio 15 we have E h K Wij {gwi gw j h Mg E h K W W α + Md h E h K Wij 1I { W i W j A h Mg E h K W W α + Md h M ah where E h K W W α = Kw w α h α wh dw MM h W K h α. Thereore we have E h K Wij {gwi gw j Mg MM K h α + Md h M ah. A.16 By A.14 A.16 ad applyig Hölder s iequality we also have [ 1 E h K Wij XT h ij β β { gw i gw j E h K Wij { XT h ij β β 1/ E h K Wij {gwi gw j 1/ A.17 h MM K κ x β β h + κ x β β M 1/ M g MM K h α + Md M ah 1/ a 1 β β h γ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise ad a 1 = κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/. Combiig A.13-A.17 we have L h β L 0 β a 1 β β h γ + a h γ + a 3 β β h where a = M g MM K C α γ 0 + M d M ac 1 γ 0 + MM K Eũ C γ 0 ad a 3 = MM K κ x. Step III. We combie Step I ad Step II ad veriy Assumptio 11. Usig results rom Step I ad Step II we have β h β L 0 β h L 0β Whe h /a 3 we have = L 0 β h L hβ h + L hβ L 0 β + L h β h L hβ L 0 β h L hβ h + L hβ L 0 β a 1 β h β h γ + a h γ + a 3 β h β h. β h β 4a 1 β h β h γ + 4a h γ 10

which urther implies that This completes the proo. A.7 Proo o Corollary 3.1 { βh 8a 1/ 8a 1 β max h γ. Corollary A.1 Corollary 3.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We deote c to be some positive absolute costat c = κ l M l 64κ lm l /6 {3M κ x + M MK C 0 κ x Mκ x M = M + MM K C 0 ad C 1... C 4 as deied i 3.4 Also deote ad τ 1 = + c 1/ κ x K 1 1 BM KC a 0 + DM K τ = + c 1/ κ x {BM K M1 + C 0 C a 0 + DM τ 3 = 4M KM BC a 0 + D 1 + C 0 κ x τ 4 = { 4B MM K κ x1 + C 0 C a γ 1 0 + D 1M κ 4 x 1/ E 1/ C 1/ γ 1 0 τ 5 = 4 + cκ x{bmm K 1 + C 0 C a 0 + D M M K K 1 1 MK K γ 1 1 A ={16 3M 1 + c 1 + 4 3C1 M 1 K 1 1 1 + c 1 + 8C 1 + c + 8C 3 M 1 K M 1 K 1 1 1 + c 3 + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u + Cκ x A =4τ 1/ 3 1 + c 1/ + C 1 τ 1/ 4 1 + c 1/ + C τ 1 + c + C 3 τ 1/ 5 1 + c 3/ + C 4 τ 1 1 + c + 4M BC a 0 + D c + κ x 11

where γ 1 = mi { a 1 1/. Cosider lower boud o { > max 64c + c + 1{logp 3 /3 64c + c + 1τ τ3 1 {logp4 {logp 5/3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3 C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp. 6ep q 5 A.18 Here q B D E ad a are to be speciied i dieret cases. Suppose that Assumptios 4-6 7 8-10 ad 14 hold. 1 Assume that g is L α-hölder or α 1 ad g has bouded support whe α > 1. Also suppose A.18 holds with q = 305s. We take B = L α where L α is the Lipschitz costat or g L α = L whe = 1 D = E = 0 a = 1 ad assume λ 4A + A {logp/ 1/ + 8κ xm ζh where ζ = max The we have { 4 L α MM K + MM K Eũ / β h β 88sλ Ml κ l with probability at least 1 17.81 exp c log p exp c. 1/ 16κ x M + MM K C 0 1/ L αmm K. Assume that Assumptio 15 holds with α 0 1. Suppose that A.18 holds with q = 305s ad we take B = M g D = M d E = M a ad a = α. Further assume that 1

λ 4A + A {logp/ 1/ + 8κ xm ζh γ where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The we have β h β 88sλ Ml κ l with probability at least 1 17.81 exp c log p exp c. 3 Assume that Assumptio 15 holds with α [1/4 1. Suppose that A.18 holds with q = 305{s + ζ h γ / logp ad take B = M g D = M d E = M a ad a = α. Deote C to be some positive absolute costat C > ζ C γ 0 ad suppose C ζ C γ 0 s logp where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. Further assume λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh. The we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1 4.58 exp c log p exp c. Proo. We prove the corollary or the case whe g is Lipschitz. We veriy Assumptios 11 ad 1 ad the apply Theorem 3.1. Assumptio 11 is veriied by applyig Theorem 3.4 ad Assumptio 1 is veriied by applyig Lemma A.1. We complete the proo by Theorem 3.1. The rest o the corollary ca be proved based o similar argumets. A.8 Proo o Lemma 3.1 Lemma A.7 Lemma 3.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. Further assume λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h. Here A is as speciied i A.48 ad A as i A.53. Suppose we have > max { 64c + c + 1{logp 3 /3 64c + 3 c + 1{logp 4 {logp 5/3 3 or positive absolute costat c > 0. The uder Assumptios 5 6 ad 9 10 16 we have P k L β h λ or all k [p 1 1.04 exp c log p. 13
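The quantity controlled in Lemma 3.1 is the sup-norm of the gradient of the pairwise-difference loss at the true coefficient. The sketch below computes that gradient on simulated data. It rests on assumptions that are not confirmed by this supplement: the loss is taken to be the average over pairs $i < j$ of $h^{-1}K(\tilde W_{ij}/h)(\tilde Y_{ij} - \tilde X_{ij}^T\beta)^2$ (inferred from the U-statistics in the proofs), the kernel is a uniform kernel, and all function names and the data-generating model are hypothetical.

```python
import numpy as np

def pair_diffs(X, Y, W):
    """Pairwise differences over all pairs i < j."""
    i, j = np.triu_indices(X.shape[0], k=1)
    return X[i] - X[j], Y[i] - Y[j], W[i] - W[j]

def kernel_weights(Wd, h):
    """(1/h) K(Wd / h) for the uniform kernel K(w) = 0.5 * 1{|w| <= 1} (assumed)."""
    return 0.5 * (np.abs(Wd / h) <= 1.0) / h

def pairwise_loss(X, Y, W, beta, h):
    """Average kernel-weighted squared residual over pairs."""
    Xd, Yd, Wd = pair_diffs(X, Y, W)
    return np.mean(kernel_weights(Wd, h) * (Yd - Xd @ beta) ** 2)

def pairwise_gradient(X, Y, W, beta, h):
    """Gradient of pairwise_loss with respect to beta."""
    Xd, Yd, Wd = pair_diffs(X, Y, W)
    weighted_resid = kernel_weights(Wd, h) * (Yd - Xd @ beta)
    return -2.0 * (Xd * weighted_resid[:, None]).mean(axis=0)

rng = np.random.default_rng(1)
n, p, h = 200, 5, 0.3
X = rng.normal(size=(n, p))
W = rng.uniform(-1.0, 1.0, size=n)
beta_star = np.array([1.0, -1.0, 0.0, 0.0, 0.5])
Y = X @ beta_star + np.sin(W) + 0.5 * rng.normal(size=n)   # g(w) = sin(w), Lipschitz

grad = pairwise_gradient(X, Y, W, beta_star, h)
sup_norm = np.max(np.abs(grad))   # the lemma asks this to be at most lambda_n / 2
```

At `beta_star` the residual difference equals the noise difference plus $g(W_i) - g(W_j)$, so under this assumed loss the gradient splits into a noise term and a smoothing-bias term, each with the factor 2 coming from the squared loss.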

Proo. Deote 1 U 1k = i<j 1 U k = i<j 1 Wij K Xijk ũ ij h h 1 Wij { K Xijk gwi gw j h h ad observe that k L β h { U 1k E[U 1k + E[U 1k + U k E[U k + E[U k. A.19 Apply Lemma A.0 o D i = X i u i W i with coditios o lemma satisied by Assumptios 5 6 9 10 we have P U 1k E[U 1k A{logp/ 1/ 6.77 exp{ c + 1 log p A.0 or positive absolute costat A ad c ad whe assumig > max { 64c+ c+1{logp 3 /3 3. Here A is as speciied i A.48. Apply Lemma A.1 o D i = X i gw i W i with coditios o lemma satisied by Assumptios 5 6 9 16 we have P U k E[U k A {logp/ 1/ 5.7 exp{ c + 1 log p A.1 or positive costats A ad c ad whe assumig > max { 64c+ 3 c+1{logp 4 {logp 5 3. Here A is as speciied i A.53. By idepedece o u ad X W we have E[U 1k = 0. We also have Wij E[U k M g E K Xijk Wij h h =M g Kw xwh Wij X w x dw df ijk Xijk x =M g { Kw xwh Wij X 0 x + ijk M g M K ME [ X ijk Wij = 0 h + M g M K ME[ X ijk h M g M K Mκ x 1 + C 0 h A. Wij X w x ijk w wh dw df twhx Xijk x A.3 where the irst iequality is by Assumptio 16 the secod equality is by deiitio the third equality by Taylor s expasio at w = 0 t [0 1 the third iequality is by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad the last iequality is by Assumptio 9 Lemma A.17. Combiig A.19-A.3 we have P { or ay k [p k L β h A + A {logp/ 1/ + M g M K Mκ x 1 + C 0 h 1 1.04 exp c log p or positive absolute costat c ad whe we appropriately take bouded rom below. Thus we 14

have completed the proo by otig that λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h. A.9 Proo o Theorem 3.6 Theorem A.8 Theorem 3.6. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume that h C 0 or positive costat C 0. Further assume λ 4A+A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h where A ={16 3M 1 + c 1/ + 4 3C 1 M 1/ K 1/ 1 1 + c 1/ + 8C 1 + c + 8C 3 M 1/ K M 1/ K 1/ 1 1 + c 3/ + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u A =8MM K M g C 0 1 + C 0 κ x 1 + c 1/ + C 1 M g M 1/ M 3/ K κ1/ x 1 + C 0 1/ C 5/4 0 K 1/4 1 1 + c 1/ + C MM K M g 1 + C 0 κ x K 1 1 + c 3/ + 4C 3 MM 3/ K M 1/ g 1 + C 0 1/ C 1/ 0 κ x 1 + c + C 4 M K M g C 0 κ x K 1 1 1 + c5/ + MM K M g 1 + C 0 C 0 or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 64c + 3 c + 1{logp 4 {logp 5/3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3 C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 4 5 q 4 5 {logp 9 5 [ 10 6 6 + c 3 C 4 κ x K 1 /3 q /3 {logp 5/3 11 6 0 + 7.5cc + M κ x q{logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp. where q = 305s. The uder Assumptios 4-10 ad 16 we have 6 3q 0 + 7.5cM κ x logp 6ep q A.4 β h β 88sλ Ml κ l 15

with probability at least 1 17.81 exp c log p exp c where c = κ l M l 64κ lm l /6 {3M κ x + M M KC 0κ x Mκ x. Proo o Theorem 3.6. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by applyig Lemma 3.1 ad veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. A.10 Proo o Theorem 3.7 Theorem A.9 Theorem 3.7. For q [p suppose that { 48 6MK κ > max xq 384 K 1 p{logp 1/ 6M κ xq /3 144κ 4 x tp K1 p logp [ 768 3 + c 1/ C 1 M 1/ K 1/ 1 t K M 1/ κ x 4/3 q 4/3 {logp 1/3 [ 960 + 7.5c + cc M κ 1/ x q 1/ logp t 3 [ 96c + C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 t [ 384 6 + c 3 C 4 κ x /3q /3 {logp 5/3 K 1 t 7680 + 7.5cc + M κ x q{logp 1q t 0 + 7.5cM κ xt logp 1 {3M κ x + M MK C 0 κ x Mκ x 6ep t q log 16t q 16 K1 M MK κ x logp t or positive absolute costat t ad c > 1. Uder Assumptios 5 6 ad 9 we have T E T q t 4/5q 4 5 {logp 9 5 A.5 with probability at least 1 5.77 exp c log p exp c where c = t 4t/[ 8 {3M κ x + M M K C 0 κ x Mκ x. Proo. We deote 1 X h = h 1/ Σ h = E K h K 1/ Wij h XT ij W h X XT. p to be a p matrix 16

Ad we aim to show that with high probability 1 v T Xh T X h v v T Σ h v θ v or all v R p v 0 q simultaeously holds or some θ > 0 uder coditios o Theorem 3.7. We split the proo ito three steps. Step I. For set J [p cosider E J S p 1 where E J = spa { e j : j J. Costruct ɛ-et Π J such that Π J E J S p 1 ad Π J 1 + ɛ 1 q. The existece o Π J ca be guarateed by Lemma 3 o Rudelso ad Zhou 013. Deie Π = J =q Π J the or 0 < ɛ < 1 to be determied later we have 3 q p 3ep q { 6ep Π = exp q log. ɛ q qɛ q For ay v E J S p 1 let Πv be the closest poit i ɛ-et Π J. The we have v Πv E J S p 1 ad v Πv ɛ. v Πv Step II. Deote D i = W i X i V i or i [ ad D = W X V to be a i.i.d copy. We upper boud { 1 P g v D i D j µ v θ max v Π or some θ > 0 where g v D i D j = 1 Wij K h h X ijv T ad µ v = E[g v D i D j. Also deote v D i = E [ g v D i D j Di. Observe that 1 g v D i D j µ v i<j 1 { { gv D i D j v D i v D j + µ v + v D i µ v. i<j i<j We boud two compoets o the right had side o iequality above separately ad the combie the result. Step II.1. We boud 1 P i=1 i=1 { v D i µ v t A.6 or t > 0 to be determied ad or each v E J S p 1. Apply Lemma 3.3 with coditios o lemma satisied by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad we have v D i 1 D i MM K h D i A.7 where 1 D i = E [ X T ij v Wij = 0 D i W W i ad D i = E [ X T ij v X i. Also we have µ v µ 1 MM K h µ A.8 where µ 1 = E[ X T ij v W ij = 0 W 0 ad µ = E[ D i = E[ X T ij v. Ad we boud A.6 as 17

below. We have 1 { P v D i µ v t i=1 =P { e a i=1 vd i µ v e at e at E [ { e a i=1 vd i µ v e at E [ e a [ i=1 { 1 D i µ 1 +MM K h { D i µ e MM K h µ a e at E [ e a i=1 { 1D i µ 1 1/ [ E e MM K C 0 a i=1 { D i µ 1/ e 4κ xmm K h a e at E [ e Ma i1 E[ XT ij v W ij =0D i E[ X ij T v W ij =0 1/ [ E e a κ x i=1 W W i E[ W W i 1/ E [ e MM KC 0 a i=1 { D i µ 1/ e 4κ x MM Kh a [ e at E [ e am i=1 E [ e MM KC 0 a i=1 { X i X i T v E[ X ij T v W ij =0 W i = W i {Xi X i T v µ 1/ e 4κ x MM Kh a e at e M κ 4 x a e M κ 4 x a e M M K C 0 κ4 x a e 4MM Kκ x ha 1/ e a κ 4 x M 1/ or 0 < a 4Mκ x 1 where the irst iequality is by Markov s the secod is a applicatio o A.7 ad A.8 the third is by Cauchy-Schwarz ad the result that µ κ x Assumptio 9 Lemma A.17 ad Lemma A.18. The ourth iequality is by otig that W 0 = E[ W W i ad applyig the ollowig iequality V 1 V E[V 1 E[V V 1 E[V 1 V + E[V 1 V E[V where V 1 = E[ X ij Tv W ij = 0 D i E[V 1 κ x by Assumptio 9 Lemma A.17 ad Lemma A.18 ad V = W W i [0 M. For the ith iequality the secod compoet i product is bouded due to Jese s iequality where X i W i i = 1... are idepedet copies o X i W i ; the third is bouded because W W i [0 M ad E[ X ij Tv W ij = 0 κ x by Assumptio 9 Lemma A.17 ad Lemma A.18. The sixth iequality is agai a applicatio o Assumptio 9 Lemma A.17 ad Lemma A.18. Take a = 1 t a 1 1 ad h t 4a 1 where a 1 = M κ 4 x + M MK C 0 κ4 x + M κ 4 x Mκ x ad a = 4MM K κ x. The we urther have 1 { { t t P v D i µ v t exp. 8a 1 i=1 By the same argumet we have 1 { P v D i µ v t i=1 i=1 { t t exp. 8a 1 We take t = θ/4 ad have 1 { θ { θ 4θ P v D i µ v exp. A.9 4 18a 1 18

Step II.. Observe that 1 { 1 { gv D i D j v D i v D j + µ v s max ϕ kl D i D j kl i<j i<j where ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl E K h h h Wij [ E K Xijk XijlDj 1 + E h h We the boud i<j ϕ kld i D j or each k l [p. h Xijk Xijl D i h K Wij h Xijk Xijl. Apply trucatio X ik E[X ik τ / or each i [ k [p ad τ = 6+c 1 κ x {logp 1 or positive absolute costat c. Deie evets A i = { X ik E[X ik τ k [p A [ = { X ik E[X ik τ i [ k [p. Cosider trucated U-statistic i<j ϕ kld i D j where ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl 1IA i A j E K Xijk Xijl D i 1IA i h h h h Wij [ E K Xijk Xijl 1 Wij D j 1IA j + E K Xijk Xijl. h h h h First we boud E[ϕkl D i D j. We have E[ϕ kl D i D j [ 1 [ { Wij 1 = E K Xijk Xijl 1IA c i A c Wij h h j E E K Xijk XijlDi 1IA c i h h A.30 [ 1 [ { Wij E K Xijk Xijl 1IA c i A c E 1 Wij h h j + E K Xijk Xijl D i 1IA c. i h h We have [ 1 Wij E K Xijk Xijl 1IA c i A c 1 h h j MK E[ h X X ijk ijl 1/ PA c i A c j 1/ M K 1 h E[ X 4 ijk 1/4 E[ X 4 ijl 1/4 PA c i A c j 1/ M K 1 h 1κ 4 x 1/ p 1 3 p 3 1/ 6M K κ x K 1 p{logp 1/ θ 4q A.31 where the irst ad secod iequalities are by Cauchy-Schwarz the third is by subgaussiaity o X i X j the ourth is by choice o h ad the last holds true whe we have 48 6M K κ xq K 1 θ{logp 1/ p. 19

We also have [ { 1 Wij [ { E E K Xijk Xijl D i 1IA c 1 Wij 1/ i E E K Xijk Xijl D i PA c h h h h i 1/ { 4M + M K C 0 κ 4 x 1/ θ 48q 1 3/ p A.3 where the irst ieuqlity is by Cauchy-Schwarz the secod is by A.51 ad subgaussiaity o X i Assumptio 9 ad the last holds true whe we have { 96 6M + MMK C 0 κ xq /3. θp Combiig A.30 A.31 ad A.3 we have E[ϕ kl D i D j θ 1q A.33 whe we appropriately choose bouded rom below. Next we boud i<j ϕ kld i D j by applyig Lemma 3.4. We boud costats i Lemma 3.4 as ollows. For boudig B g we have B g 4M K τ h 1 boudig B we have E [ ϕ kl D i D j D j E K h + E K h E K h Wij h X ijk Xijl 1IA i A j [ { 1 D j + E E Wij h X ijk Xijl D j 1IA j + E h X ijk Xijl 1IA i A j D j + E Wij + E K h Wij h X ijk Xijl. {4 6 + cm K κ x K 1 1 { logp1/. For Wij h X ijk Xijl D i 1IA i K h Wij K h h X ijk Xijl h K Wij h X ijk Xijl D j 1IA j A.34 Apply Lemma 3.3 o ϕ = 1 with M 1 = M ad M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl 1IA i A j D i A.35 τtm + MM K C 0 = 6c + M + MM K C 0 κ x logp. Apply Lemma 3.3 o ϕ = X ijk Xijl with M1 = M ad M = M K as give by Assumptios 6 0

Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl D j 1IA j M E [ X ijk Xijl Dj W ij = 0 1IA j + MM K C 0 E [ X ijk Xijl Dj 1IAj ME[ X ijk D j W ij = 0 1/ E[ X ijl D j W ij = 0 1/ 1IA j + MM K C 0 E[ X ijk D j 1/ E[ X ijl D j 1/ 1IA j 1.5c + 4 M + MM K C 0 κ x logp where the secod iequality is by Cauchy Schwarz ad the last is due to E[ X ijk Dj 1IA j = { E[X ik E[X ik + X ik E[X jk 1IA j κ x + τ /4 1.5c + 4κ x logp ad based o a idetical argumet E[ X ijk D j W ij = 0 1IA j 1.5c + 4κ x logp A.36 or ay k [p. Apply Lemma 3. o Z = X ijk Xijl ad with M 1 = M M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl M + MM K C 0 κ A.37 x Combiig A.34-A.37 we have B 0 + 7.5c M + MM K C 0 κ x logp. For boudig E [ E { ϕ kl D i D j Dj we observe that ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl 1IA i A j E K Xijk Xijl 1IA i A j D i h h h h Wij E K Xijk Xijl 1IA i A j Wij D j + E K Xijk Xijl 1IA i A j h h h h Wij + E K Xijk Xijl 1IA i A j Wij D i E K Xijk Xijl D i 1IA i h h h h Wij + E K Xijk Xijl 1IA i A j Wij D j E K Xijk Xijl D j 1IA j h h h h [ Wij 1 Wij + E K Xijk Xijl E K Xijk Xijl 1IA i A j h h h h which urther implies that E [ ϕ kl D i D j D j [ 1 [ Wij E K Xijk Xijl 1IA c E 1 Wij h h j + K Xijk Xijl 1IA c i D j h h Wij + E K Xijk Xijl 1IA c i A c h h j 1

Thereore we have E [ E { ϕ kl D i D j D j 3 { E[ h Xijk Xijl 1IA c j + E[ X ijk Xijl 1IA c i + E[ X ijk Xijl 1IA c i A c j 3 { E[ K1 logp X4 ijk 1/ E[ X ijl 4 1/ PA c j + E[ X ijk 4 1/ E[ X ijl 4 PAc i A c j 3 K1 logp 1 1κ4 x 3 p + 1κ4 x 3 p 1 where the irst iequality is due to the act that K [0 1 ad by Jese s iequality the secod is by Cauchy-Schwarz the third by subgaussiaity o X i X j ad X ij ad last holds true whe we have 144κ 4 x K1 p logp. For boudig σ apply Lemma 3. o Z = X ijk X ijl with M 1 = M ad M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have σ 16M [ K 1 Wij E K X h h h ijk X ijl 16M K { [ M E X h ijk X ijl Wij = 0 + MM K C 0 E [ X ijk X ijl 16M { K ME[ h X 4 ijk Wij = 0 1/ E[ X 4 ijl Wij = 0 1/ + MM K C 0 E[ X ijk 4 1/ E[ X ijl 4 1/ 19M KM + MM K C 0 κ 4 { x 1/ K 1 logp where the third iequality is by Cauchy-Schwarz ad the last is by subgaussiaity o X ad choice o h. For boudig B we have B = sup E [ ϕ kl D i D j D j D j 4M K h 1 + 4 sup E D j + 4 sup E D j Wij sup E K X D j h h ijk X ijl 1IA i A j D j [ { 1 E K h [ { 1 E K h Wij + 4E K Xijk Xijl h h Wij Xijk Xijl D i 1IAi h Wij Xijk Xijl D j 1IAj h 4M KM τ 4 + 19M h κ4 x + 8M κ x {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x { logp 3/

where M = M + MM K C 0. We take t = θ 1q u = + c log p ad require that { 48 6MK κ > max xq 96 K 1 p{logp 1/ 6M κ xq /3 144κ 4 x θp K1 p logp 9 3 + c 1/ C 1 M 1/ K 1/ 1 θ K M 1/ κ x 4/3 q 4/3 {logp 1/3 [ 40 + 7.5c + cc M κ 1/ x q 1/ logp θ 3 [ 4c + C 3 {144 + c M K M κ 4 xk 1 θ 1 + 19M κ4 x + 8M κ 4 x 1/ [ 96 6 + c 3 C 4 κ x /3q /3 {logp 5/3 K 1 θ 190 + 7.5cc + M κ x q{logp 1q θ 0 + 7.5cM κ xθ logp 4 5 q 3 {logp 9 5 A.38 or some positive absolute costat c ad C 1... C 4 as deied i 3.4. The by Lemma 3.4 we have 1 P ϕ kl D i D j E[ϕ kl D i D j 5θ 1q i<j exp{ + c log p +.77 exp{ + c log p Combied with A.30 the last display urther implies that 1 P ϕ kl D i D j θ q i<j 1 P ϕ kl D i D j θ q A [ + PA c [ i<j 1 P ϕ kl D i D j E[ϕ kl D i D j i<j 5θ + PA c [ 1m exp{ + c log p +.77 exp{ + c log p + p exp{ + c logp 5.77 exp{ 1 + c log p or positive absolute costat c. 3
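The union bound in the next step runs over the net $\Pi$ built in Step I, so its cost is governed by the cardinality of $\Pi$. The sketch below checks, on a few hypothetical values of $(p, q, \epsilon)$, the counting bound as it is read here from the garbled display in Step I: $(3/\epsilon)^q\binom{p}{q} \le \{3ep/(q\epsilon)\}^q$. Everything is done on the log scale to avoid overflow; the function names are illustrative only.

```python
from math import comb, e, log

def log_net_size(p, q, eps):
    """log of (3 / eps)^q * C(p, q): one net per support of size q."""
    return q * log(3.0 / eps) + log(comb(p, q))

def log_net_bound(p, q, eps):
    """log of the closed-form bound (3 e p / (q eps))^q."""
    return q * log(3.0 * e * p / (q * eps))

# Hypothetical (p, q, eps) triples, including the edge cases q = p and q = 1.
cases = [(1000, 10, 0.5), (50, 5, 0.25), (20, 20, 0.5), (10**4, 1, 0.1)]
slack = [log_net_bound(*c) - log_net_size(*c) for c in cases]
```

The slack is nonnegative in every case because $\binom{p}{q} \le (ep/q)^q$ for $1 \le q \le p$; this is why the sample-size requirement carries a term of order $q\log(ep/q)$.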

Step II.3 Combiig results o Step II.1 Step II. ad Step I whe we have A.38 ad that { 56{3M κ x + M MK > max C 0 κ x Mκ x 3ep θ q log 4096K 1 M MK κ x logp 4θ qɛ θ we have P max v Π { 1 i<j g v D i D j µ v θ 5.77 exp{ c + 1 log p + exp c where c = θ 4θ/[56{3M κ x + M MK C 0 κ x Mκ x. Step III. Deote 1/ Γ = X h Σ1/ h. From Step II. we have that with probability at least 1 5.77 exp{ c + 1 log p exp c simultaeously or all v 0 Π which urther implies that Γv 0 θ Γv 0 θ 1/. The we obtai bouds o etire E J S p 1 by approximatio. For ay v E J S p 1 or some J = q deote v 0 = Πv. We have Deie Γ EJ which urther implies that = sup y EJ S p 1 Take ɛ = 1/ the we have Γv ΓΠv + Γ{v Πv. Γy. The by A.39 we have Γ EJ θ 1/ + ɛ Γ EJ Γ E J We take θ = 4θ. This completes the proo. A.11 Proo o Lemma 3.4 θ 1 ɛ. Γ E J 4θ. A.39 Proo. Deote µ = E [ gz 1 Z z = z µ gz i Z j = gz i Z j Z i Z j + µ ad D g = i<j gz i Z j. Also deote g = B g = B σ = E [ gz 1 Z ad B = sup E [ gz z z D = sup {E [ gz i Z j a i Z i b j Z j : E [ i<j i= a i Z i 1 E [ 1 j=1 b j Z j 1. 4
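The projection and the degenerate kernel just defined can be illustrated on a toy example where every expectation is an exact finite sum. The sketch assumes a three-point distribution and the kernel $g(a, b) = (a - b)^2/2$, neither of which comes from the paper. It is written with $f(z) = E[g(z, Z)]$ and $\tilde g(z_i, z_j) = g(z_i, z_j) - f(z_i) - f(z_j) + \mu$, and it normalizes the degenerate part by the number of pairs, which may differ from the paper's normalization of $D_n(g)$.

```python
import numpy as np

# Assumed toy distribution of Z on three points.
support = np.array([0.0, 1.0, 3.0])
probs = np.array([0.5, 0.3, 0.2])

# Kernel matrix g(z_a, z_b) = (z_a - z_b)^2 / 2, whose mean is Var(Z).
G = 0.5 * (support[:, None] - support[None, :]) ** 2
f = G @ probs                       # f(z) = E[g(z, Z)] for each support point
mu = probs @ f                      # mu = E[g(Z_1, Z_2)]
G_tilde = G - f[:, None] - f[None, :] + mu   # degenerate kernel

# Draw a sample and form the three pieces of the decomposition.
rng = np.random.default_rng(2)
n = 40
idx = rng.choice(support.size, size=n, p=probs)
i, j = np.triu_indices(n, k=1)
U = G[idx[i], idx[j]].mean()                  # U-statistic
linear = 2.0 / n * np.sum(f[idx] - mu)        # projection (sum of i.i.d. terms)
degenerate = G_tilde[idx[i], idx[j]].mean()   # degenerate remainder
```

Here `G_tilde @ probs` is the zero vector, which is the degeneracy property, and `U - mu` equals `linear + degenerate` exactly. The proof treats the first piece with Bernstein's inequality and the second with a concentration bound for degenerate U-statistics.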

Hoedig decompositio gives us U g E[U g = 1 Z i + D g where D g is a degeerate U-statistic o bouded kerel. By Berstei iequality we have P Z i t t /8 1 exp 1 i=1 E [ Zi + B t/6 1 t / A.40 exp 8E [ Zi + B t/ whe 3. By Theorem 3.4 i Houdré ad Reyaud-Bouret 003 or ay u > 0 we have P D g C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u /4 C 5 e u i=1 A.41 where positive absolute costats C 1... C 5 are as deied i 3.4. Combiig A.40 ad A.41 we have P U g E[U g t + C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u P Z i i=1 t 1 t / exp 8E [ X + B + C 5 e u. t/ + P D g C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u /4 A.4 It is easy to see that B g B g + 3B 4B g B B ad E [ Z E [ Z. It remais to boud σ B ad D. By some algebra we have which implies that ad that Meawhile we have E [ gx 1 X X E [ gx1 X X σ = E [ gx 1 X = E [ E { gx 1 X X E [ E { gx 1 X X = E [ gx 1 X = σ B sup X E [ gx 1 X X sup X E [ gx 1 X X = B. E [ gx i X j X j 4B. 5

By Hölder s iequality ad combiig with the last display we have E [ gx i X j a i X i b j X j =E [ b j X j E { gx i X j a i X i X j Thereore we urther have E [ b j X j E { gx i X j Xj 1/E { gxi X j a i X i Xj 1/ 4B 1/ E [ b j X j E { gx i X j a i X i X j 1/ 4B 1/ E [ b j X j 1/ E [ gxi X j a i X i 1/ =4B 1/ E [ b j X j 1/ E [ ai X i E { gx i X j X i 1/ 4B E [ a i X i 1/ E [ bj X j 1/. D 4B 4B 4B. i 1 { [ E ai X i 1/ [ E bj X j 1/ i= j=1 i 1 i= j=1 1{ [ E ai X i + E [ b j X j Combiig these upper bouds o costats with A.4 we complete the proo. A.1 Proo o Corollary 3. Corollary A. Corollary 3.. Suppose Assumptios 4-6 ad 8-9 are satisied. 1 Assume Assumptio 7 holds ad that A.5 is satisied with q = 305s ad t = /16. The we have P δ L h κ lm l 4 or all { R p : S c 1 3 S 1 1 5.77 exp c log p exp c where c > 1 is a absolute costat ad c = κ l M l 64κ lm l /6 {3M κ x+m M K C 0 κ x Mκ x. Assume Assumptio 13 holds ad that A.5 holds with q = 305{s+ζ h γ / logp ad t = /16. The we have P δ L h κ lm l 4 or all C S 1 5.77 exp c log p exp c where C S = { v R p : v J c 1 3 v J 1 or some J [p ad J s + ζ h γ / logp c > 1 is a absolute costat ad c = κ l M l 64κ lm l /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. 1 Deote C S = { v R p : v S c 1 3 v S 1. By Lemma 13 i Rudelso ad Zhou 013 C S S p 1 cov J d E J S p 1 where cov meas covex hull o a set E J = spa { e j : 6

j J ad d = 305s. Deote Γ = For ay v C S S p 1 we have Σ h = E W h X XT K h 1 { 1 Wij K Xij XT h i<j h ij Σ h [ Σ 0 = E X XT W = 0 0. W v T Γv 4 max v T Γv v cov J d E J S p 1 = 4 max v J d E J S p 1 v T Γv = 4 Γ d where the secod lie is because maximum o v T Γv occurs at extreme poits o set cov J d E J S p 1. Apply Theorem 3.7 with q = d = 305s ad t = /16 whe A.5 is satisied we have v T Γv κ lm l A.43 4 holds simultaeously or all v C S S p 1 with probability at least 1 5.77 exp c log p exp c where c > 1 is some absolute costat ad c = κ l M l 64κ lm l /[65536{M κ x + M MK C 0 κ x + M κ x Mκ x. A.43 urther implies that δ L v h v T Σ h v /4 where v T Σ h v v T Σ 0 v MM K E [ X T v h v MM K κ x v h v / = /. A.44 Thereore δ L v h /4 holds simultaeously or all v C S S p 1 with probability at least 1 5.77 exp c log p exp c. By liearity o δ L v h this completes the proo or 1. Usig a idetical argumet as used i 1 replacig C S by set { v R p : v J c 1 3 v J 1 or some J [p ad J s + ζ h γ / logp ad usig d = 305{s + ζ h γ / logp istead we complete the proo or. A.13 Proo o Lemma 4.1 Lemma A.10 Lemma 4.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We urther assume that u satisies Assumptio 17 ad take c ad c < 3ɛ/4 + 1/ to be positive absolute costats. We take ξ = 1 + c / + ɛ ad suppose we have { [{16c > max + 3 c + 1C0M u /+ɛ κ 1/3 ξ x 1 log p /3 ξ {logp 5/3 4ξ 7

The uder Assumptios 5 6 9 ad 17 we have P { max Uk E[U k C{logp/ 1/ 4.77 exp c log p + exp c log k [p where C = C 1 M 1/ K M 1/ Mu 1/+ɛ κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c + 8M Mu 1/+ɛ κ x c 1/ + C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K1 1 Here M = M + MM K C 0 ad C 1... C 4 are as deied i 3.4. Proo. We apply trucatio o X ijk ad ũ i at levels τ ad θ / respectively ad irst ocus o U-statistic 1 1 Wij Ũ k = K Xijk ũ ij 1IA kij B i B j h h where we deote evets We also deote evets A kij = { X ijk τ Bi = { u i E[u θ /. A k[ = { X ijk τ i < j [ B [ = { u i E[u θ / i [. Deote gd i D j = 1 Wij K Xijk ũ ij 1IA kij B i B j ad D i = E [ gd i D j D i. h h We complete the proo i two steps. Step I. We boud B g B E [ D σ ad B as i Lemma 3.4 ad apply Lemma 3.4. For boudig B g we have B g M K τ θ /h. For boudig B apply Lemma 3.3 o ϕ = 1 with lemma coditios satisied by 5 ad 6 ad we have [ E 1 W1 W B τ θ K W 1 M τ θ h h where M = M + MM K C 0. For boudig σ we have σ = E [ gd 1 D M [ K 1 Wij E K Xijk ũ ij h h h M K [ E X h ijk ũ ij Wij = 0 M + MM K C 0 E [ X ijk ũ ij M K M M /+ɛ u κ x/h where the irst iequality is due to K [0 1 the secod iequality is by applyig Lemma 3. o Z = X ijk ũ ij with lemma assumptios satisied by Assumptios 5 ad 6 ad the last iequality is by Assumptios 9 10 ad idepedece o X ijk ad ũ ij. For boudig E [ D apply Lemma 3.3 o ϕ = X ijk ũ ij 1IA kij B i B j with lemma assumptios satisied by Assumptios 5 ad 6 ad we have D 1 D MM K C 0 D 8

where 1 D = E [ X1k ũ 1 1IA k1 B 1 B W 1 = W D W1 W D = E [ X 1k ũ 1 1IA k1 B 1 B D. We have by Assumptios 9 10 ad idepedece o X ijk ad ũ ij E [ 1 D E [ X 1k ũ 1 W 1 = W M MMu /+ɛ κ x This urther implies that E [ D E[ X 1kũ 1 M /+ɛ u κ x. E[D E[ 1 D + M M KC 0E[ D 4M M /+ɛ u κ x. For boudig B we have B = sup E [ gd 1 D D D M [ K 1 W1 W sup E K X 1k X k u 1 u 1IA k1 B 1 B D h D h h τ M K M θ. h We take or some positive absolute costat c > 1 t = 8M M 1/+ɛ u κ x c 1/ {logp/ 1/ τ = max { c 1/ {logp 1/ θ = α 0 < α < 3/4 c u = c log p ad we have that { [{16c > max 3 c + 1C0M u /+ɛ κ 1/3 α x 1 log p /3 α {logp 5/3 4α. The by Lemma 3.4 we have { 1 P Ũk E[Ũk A{logp/ 1/ exp c logp +.77 exp c log p where with C 1... C 4 deied i 3.4 A = C 1 M 1/ K M 1/ M 1/+ɛ 4.77 exp c log p u κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c +C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K 1 1 + 8M M 1/+ɛ u κ x c 1/. 9
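The truncated U-statistic from the proof of Lemma 4.1 can be sketched numerically as follows. This is only an illustration under stated assumptions: the Gaussian kernel, the function name `truncated_u_statistic`, and the argument `u_mean` (standing in for E[u]) are hypothetical choices that do not come from the paper. The arguments `tau` and `theta` play the roles of the truncation levels τ_n and θ_n.

```python
import numpy as np


def gaussian_kernel(w):
    """Standard normal density, used here as an assumed choice of kernel K."""
    return np.exp(-0.5 * w ** 2) / np.sqrt(2.0 * np.pi)


def truncated_u_statistic(X, u, W, h, tau, theta, u_mean=0.0):
    """Average over pairs i < j of (1/h) K(W_ij/h) X_ijk u_ij 1(A_kij, B_i, B_j).

    X: (n, p) covariates, u: (n,) noise, W: (n,) nonparametric covariate.
    A_kij = {|X_ik - X_jk| <= tau}, B_i = {|u_i - u_mean| <= theta / 2}.
    Returns a length-p vector, one entry per coordinate k.
    """
    i, j = np.triu_indices(len(W), k=1)          # all pairs with i < j
    Xd = X[i] - X[j]                             # pairwise differences, shape (N, p)
    ud = u[i] - u[j]                             # shape (N,)
    weights = gaussian_kernel((W[i] - W[j]) / h) / h
    keep_u = np.abs(u - u_mean) <= theta / 2.0   # event B_i for each observation
    event_b = keep_u[i] & keep_u[j]              # B_i and B_j for each pair
    event_a = np.abs(Xd) <= tau                  # A_kij for each pair and coordinate
    terms = weights[:, None] * Xd * ud[:, None] * (event_a & event_b[:, None])
    return terms.mean(axis=0)
```

Setting `tau` and `theta` to `np.inf` removes the truncation and returns the untruncated statistic U_k, which is a convenient way to check the vectorised computation against a direct double loop.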

Step II. We have E[Ũk = 0 ad thus we have P { max Uk E[U k A{logp/ 1/ k [p P { max Uk E[U k A{logp/ 1/ B [ + PB c [ k [p p { P Uk > A{logp/ 1/ A k[ B [ + PA c k[ + PB[ c k=1 p { P Ũ > A{logp/ 1/ A k[ B [ + PA c k[ + PB[ c k=1 4.77 exp c log p + log p + E[ ũ +ɛ α+ɛ 4.77 exp c log p + log p + exp c log. The last iequality holds i we take c + 1/ + ɛ < 3/4 ad we take α = c + 1/ + ɛ. This completes the proo. A.14 Proo o Corollary 4.1 Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We urther assume that u satisies Assumptio 17 ad take c ad c < 3ɛ/4 + 1/ to be positive absolute costats. We take ξ = 1 + c / + ɛ ad suppose we have { [{16c > max + 3 c + 1C0M u /+ɛ κ 1/3 ξ x 1 log p /3 ξ {logp 5/3 4ξ 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 5 {logp 9 5 [ 10 6 6 + c 3 C 4 κ x K 1 /3 q /3 {logp 5/3 11 6 0 + 7.5cc + M κ x q{logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp 6 3q 0 + 7.5cM κ x logp 6ep q A.45 30

where $q$ is to be determined in specific cases. Denote $\bar M = M + M M_K C_0$, and let $C_1, \dots, C_4$ be as defined in (3.4). Also let $c$ be some positive absolute constant, and let the constants $A$ and $c''$ be given by

A = C 1 M 1/ K M 1/ M 1/+ɛ u κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c+ C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K 1 1 + 8M M 1/+ɛ u κ x c 1/

c =κ l M l 64κ lm l /6 {3M κ x + M M KC 0κ x Mκ x.

Theorem A.11 (Corollary 4.1(1)). Assume λ 4A + A {logp/ 1/ + 8κ xm ζh. Further assume (A.45) holds with q = 305s. Then under Assumptions 4-9, 11, 12 and 17, we have
\[ \|\hat\beta_h - \beta^*\|_2^2 \le \frac{288\, s\, \lambda_n^2}{M_\ell^2 \kappa_\ell^2} \]
with probability at least $1 - 10.54\exp(-c \log p) - \exp(-c' \log n) - \exp(-c'' n) - \epsilon_{n,p}$.

Proof. See the proof of Theorem A.12.

Theorem A.12 (Corollary 4.1(2)). Assume that λ 4A + A{logp/ 1/ + 8κ xm ζh γ. Further assume (A.45) holds with q = 305s. Then under Assumptions 4-9, 11, 12 and 17, we have
\[ \|\hat\beta_h - \beta^*\|_2^2 \le \frac{288\, s\, \lambda_n^2}{M_\ell^2 \kappa_\ell^2} \]
with probability at least $1 - 10.54\exp(-c \log p) - \exp(-c' \log n) - \exp(-c'' n) - \epsilon_{n,p}$.

Proof. We adopt the framework described in Section 2 with $\theta^* = \beta^*$, $\Gamma_0(\theta) = L_0(\beta)$, $\hat\Gamma_n(\theta, h) = \hat L_n(\beta, h)$, $\Gamma_h(\theta) = E \hat L_n(\beta, h)$, and take $\theta_h^* = \beta^*$, which yields $\tilde s_n = s$ and $\rho_n = 0$. We verify Assumption 2 by using results (A.24), (A.26) and applying Lemma 4.1. We verify Assumption 3 by applying Corollary 3.2. We complete the proof by Theorem 2.1.

Theorem A.13 (Corollary 4.1(3)). Denote C to be some positive absolute constant C > ζ C γ 0 and suppose C ζ C γ 0 s logp. Assume that λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh. Further assume that (A.45) holds with q = 305{s + ζ h γ / logp. Then under Assumptions 4-6, 8-9, 11-13 and 17, we have

β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ

with probability at least $1 - 17.31\exp(-c \log p) - \exp(-c' \log n) - \exp(-c'' n) - \epsilon_{n,p}$.

Proof. We adopt the framework described in Section 2 with $\theta^* = \beta^*$, $\Gamma_0(\theta) = L_0(\beta)$, $\hat\Gamma_n(\theta, h) = \hat L_n(\beta, h)$, $\Gamma_h(\theta) = E \hat L_n(\beta, h)$, and take $\theta_h^* = \beta^*$, which yields $\tilde s_n = s$ and $\rho_n = 0$. We verify Assumption 2 by using results (A.24), (A.26) and applying Lemma 4.1. We verify Assumption 3 by applying Corollary 3.2. We complete the proof by Theorem 2.1.
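The estimator whose error is bounded in the results above can be illustrated with a small numerical sketch. It assumes, as a labelled assumption and not as a statement of the paper's implementation, that the penalised loss is a kernel-weighted least-squares criterion on pairwise differences with an l1 penalty. The Gaussian kernel, the proximal-gradient (ISTA) solver, the function names, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_differences(a):
    """Return a[i] - a[j] for all pairs i < j."""
    i, j = np.triu_indices(len(a), k=1)
    return a[i] - a[j]


def gaussian_kernel(w):
    """Standard normal density, an assumed choice of kernel."""
    return np.exp(-0.5 * w ** 2) / np.sqrt(2.0 * np.pi)


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def pairwise_lasso(X, Y, W, h, lam, n_iter=500):
    """Minimise mean_{i<j} K_h(W_ij) (Y_ij - X_ij' b)^2 + lam * ||b||_1 by ISTA."""
    Xd, Yd, Wd = (pairwise_differences(a) for a in (X, Y, W))
    weights = gaussian_kernel(Wd / h) / h
    n_pairs = len(Wd)
    S = (Xd * weights[:, None]).T @ Xd / n_pairs   # weighted Gram matrix
    c = (Xd * weights[:, None]).T @ Yd / n_pairs   # weighted cross moment
    step = 1.0 / (2.0 * np.linalg.eigvalsh(S)[-1])  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * (S @ beta - c)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


# Simulated partially linear model Y = X beta + g(W) + u with a sparse beta.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
W = rng.uniform(size=n)
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
Y = X @ beta_true + np.sin(2.0 * np.pi * W) + 0.1 * rng.normal(size=n)
beta_hat = pairwise_lasso(X, Y, W, h=0.1, lam=0.05)
```

Differencing observations whose W values are close approximately cancels the nonparametric component g(W), which is why no estimate of g is needed in this sketch. The bandwidth `h` and penalty `lam` here are arbitrary illustrative values, not the rates required by the theorems.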

Corollary A.3 Corollary 4.14. Deote ad τ 1 = + c 1/ κ x K 1 1 BM KC a 0 + DM K τ = + c 1/ κ x {BM K M1 + C 0 C a 0 + DM τ 3 = 4M KM BC a 0 + D 1 + C 0 κ x τ 4 = { 4B MM K κ x1 + C 0 C a γ 1 0 + D 1M κ 4 x 1/ E 1/ C 1/ γ 1 0 τ 5 = 4 + cκ x{bmm K 1 + C 0 C a 0 + D M M K K 1 1 MK K γ 1 1 A =4τ 1/ 3 1 + c 1/ + C 1 τ 1/ 4 1 + c 1/ + C τ 1 + c + C 3 τ 1/ 5 1 + c 3/ + C 4 τ 1 1 + c + 4M BC a 0 + D c + κ x where γ 1 = mi { a 1 1/. Cosider lower boud o { > max 64c + c + 1τ τ3 1 {logp4 {logp 5/3. A.46 Here B D E ad a are to be speciied i dieret cases. 1 Assume that g is L α-hölder or α 1 ad g has bouded support whe α > 1. Suppose A.45 holds with q = 305s ad that A.46 holds with B = L α where L α is the Lipschitz costat or g L α = L whe = 1 D = E = 0 a = 1. Further assume that λ 4A + A {logp/ 1/ + 8κ xm ζh where { L ζ = max 4 α MM K + MM K Eũ / 1/ 16κ x M + MM K C 0 1/ L αmm K. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. Assume that Assumptio 15 holds with α 0 1. Suppose that A.45 holds with q = 305s ad that A.46 holds with B = M g D = M d E = M a ad a = α. Assume λ 4A + A {logp/ 1/ + 8κ xm ζh γ where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. 3 Assume that Assumptio 15 holds with α [1/4 1. Suppose that A.45 holds with 3

q = 305{s + ζ h γ / logp ad that A.46 holds with B = M g D = M d E = M a ad a = α. Further assume λ 4A + A + Mη {logp/ 1/ + 8MM K Cκ xh where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1.58 exp c log p exp c log exp c. Proo. The result ollows directly rom Corollary 4.11-3. Theorem A.14 Corollary 4.15. Assume that A.45 holds with q = 305s. Assume urther that > 64c + c + 1{logp 4 ad λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h where A =8MM K M g C 0 1 + C 0 κ x 1 + c 1/ + C 1 M g M 1/ M 3/ K κ1/ x 1 + C 0 1/ C 5/4 0 K 1/4 1 1 + c 1/ + C MM K M g 1 + C 0 κ x K 1 1 + c 3/ + 4C 3 MM 3/ K M 1/ g 1 + C 0 1/ C 1/ 0 κ x 1 + c + C 4 M K M g C 0 κ x K 1 1 1 + c5/ + MM K M g 1 + C 0 C 0 The uder Assumptios 4-9 16 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by usig results A.19 A.1 A. A.3 ad applyig Lemma 4.1. We veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. A.15 Supportig lemmas Lemma A.15. Assumptio 5 implies that or ay 0 < a < 3 ad 0 < b < 1 we have + w a Kw dw M K ad sup w b Kw M K. w R Proo o Lemma A.15. For ay 0 < a < 3 we have + { w a + a/3 Kw dw w 3 a/3 Kw dw M K M K 33

where the irst iequality is by Hölder s iequality the secod is by Assumptio 5 ad that a > 0 ad the last is by the act that 0 < a < 3 ad the choice o M K 1. For ay 0 < b < 1 ad ay w R we have w b Kw = { w Kw b Kw 1 b M b KM 1 b K = M K where the irst iequality is by Assumptio 5 ad that 0 < b < 1. Thereore we have obtaied that sup w R w b Kw M K. This completes the proo. Lemma A.16. Assumptio 6 implies that or ay X-measurable uctio ψ : R p R m mappig to a m-dimesioal real space we have { W ψ w z X w W ψ w z X W w w M. A.47 w W sup wz Proo o Lemma A.16. For a uctio F we write df x/dx = F x+ F x where F x+ ad F x are right ad let limits respectively whe F x is discotiuous at x. We irst show that sup wx { W Xw x w W Xw x M. We have FW1 X F W w = 1 =x +xw + w df X 1 x dx x =x +x df W X =x w df X x X=x dfx1. x dx x =x +x df X x By domiated covergece theorem we have W1 W Xw X x = 1 w + w x + x df X 1 x dx x =x +x df W X =x w df X x dfx1 M x dx x =x +x df X x ad W Xw x W1 X 1 w +wx +x df X1 x w dx = x =x +x df W X =x w df X x w dfx1 M. x dx x =x +x df X x Based o the same argumet we have w = F W F W w df X=x Xx which by domiated covergece theorem implies that w = W W Xw x df Xx M ad W w W = Xw x df Xx M w w Also or ay X-measurable uctio ψ we have v 1I{ψx vf W w df X=x Xx v=z F W ψ w =. X=z v 1I{ψx v df Xx v=z 34

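By the dominated convergence theorem we have
\[ f_{W \mid \psi(X)}(w \mid z) = \frac{\frac{\partial}{\partial v} \int 1\{\psi(x) \le v\}\, f_{W \mid X}(w \mid x)\, dF_X(x) \big|_{v=z}}{\frac{\partial}{\partial v} \int 1\{\psi(x) \le v\}\, dF_X(x) \big|_{v=z}} \le M \]
and
\[ \left| \frac{\partial f_{W \mid \psi(X)}(w \mid z)}{\partial w} \right| = \left| \frac{\frac{\partial}{\partial v} \int 1\{\psi(x) \le v\}\, \frac{\partial f_{W \mid X}(w \mid x)}{\partial w}\, dF_X(x) \big|_{v=z}}{\frac{\partial}{\partial v} \int 1\{\psi(x) \le v\}\, dF_X(x) \big|_{v=z}} \right| \le M. \]
Therefore Assumption 6 implies (A.47).

Lemma A.17. Assumption 9 implies that, conditional on $\tilde W = 0$ and unconditionally, $\tilde X^T v$ is mean-zero subgaussian with parameter at most $2\kappa_x^2 \|v\|_2^2$ for any $v \in R^p$. Assumption 10 implies that $\tilde u$ is mean-zero subgaussian with parameter at most $2\kappa_u^2$.

Proof of Lemma A.17. Observe that $\tilde X^T v$ and $-\tilde X^T v$ are identically distributed, and thus we have $E[\tilde X^T v] = 0$. The moment generating function of $\tilde X^T v$ satisfies
\[ E\big[e^{t \tilde X^T v}\big] = E\big[e^{t(X_1^T v - E[X_1^T v])}\big]\, E\big[e^{-t(X_2^T v - E[X_2^T v])}\big] \le e^{t^2 \kappa_x^2 \|v\|_2^2}, \]
where the first step holds because $X_1$ and $X_2$ are i.i.d., and the second is an application of Assumption 9. Therefore $\tilde X^T v$ is mean-zero subgaussian with parameter at most $2\kappa_x^2 \|v\|_2^2$.

Observe that, conditional on $\tilde W = 0$, $\tilde X^T v$ and $-\tilde X^T v$ are identically distributed, and thus we have $E[\tilde X^T v \mid \tilde W = 0] = 0$. The moment generating function of $\tilde X^T v$ conditional on $\tilde W = 0$ satisfies
\[ E\big[e^{t \tilde X^T v} \mid \tilde W = 0\big] = E\big[ E\{e^{t \tilde X^T v} \mid W_1 = W_2, W_2\} \big] = E\big[ E\{e^{t(X_1^T v - E[X_1^T v \mid W_1 = W_2])} \mid W_1 = W_2, W_2\}\, E\{e^{-t(X_2^T v - E[X_2^T v \mid W_2])} \mid W_2\} \big] \le e^{t^2 \kappa_x^2 \|v\|_2^2}, \]
where the second step holds because $(X_1, W_1)$ and $(X_2, W_2)$ are i.i.d., and the third is an application of Assumption 9. Therefore, conditional on $\tilde W = 0$, $\tilde X^T v$ is mean-zero subgaussian with parameter at most $2\kappa_x^2 \|v\|_2^2$. Applying the same argument to $\tilde u$ completes the proof.

In our proofs we used the following results from Vershynin (2012).

Lemma A.18. For a mean-zero subgaussian random variable $V$ with parameter at most $\kappa_v^2$, we have $E[V^2] \le \kappa_v^2$, $E[V^4] \le 3\kappa_v^4$, $P(V^2 - E[V^2] \le v) \ge 1 - \exp\{-v/(2\kappa_v^2)\}$ for any $v \ge 2\kappa_v^2$, and $E[e^{sV^2 - sE[V^2]}] \le e^{s^2 \kappa_v^4}$ for $|s| \le \kappa_v^{-2}$.

Lemma A.19. Let $Z$ be some subgaussian random variable with parameter at most $\kappa_z^2$. Suppose $\kappa_z^2 \le a/4$ for some $a > 0$. Then we have
\[ \int_a^\infty z\, dF_{Z^2}(z) \le (a + 4\kappa_z^2) \exp\{-a/(4\kappa_z^2)\}. \]

Proof of Lemma A.19. We have $F_{Z^2}(z) \ge P(Z^2 - E[Z^2] \le z/2) \ge 1 - \exp\{-z/(4\kappa_z^2)\}$ for any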

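$z \ge a \ge 4\kappa_z^2$ (Lemma A.18). By integration by parts we have
\[ \int_a^\infty z\, dF_{Z^2}(z) = -\int_a^\infty z\, d\{1 - F_{Z^2}(z)\} = -z\{1 - F_{Z^2}(z)\}\Big|_a^\infty + \int_a^\infty \{1 - F_{Z^2}(z)\}\, dz \le a \exp\{-a/(4\kappa_z^2)\} + \int_a^\infty \exp\{-z/(4\kappa_z^2)\}\, dz = (a + 4\kappa_z^2) \exp\{-a/(4\kappa_z^2)\}. \]
This completes the proof.

Proof of Lemma 3.2. By Taylor's expansion, for some $t_{wh} \in [0, 1]$ we have
\[ E\Big[\frac{1}{h} K\Big(\frac{W}{h}\Big) Z\Big] = \iint K(w)\, z\, f_{W \mid Z}(wh \mid z)\, dw\, dF_Z(z) = \iint K(w)\, z \Big\{ f_{W \mid Z}(0 \mid z) + \frac{\partial f_{W \mid Z}(w' \mid z)}{\partial w'}\Big|_{w' = t_{wh} wh}\, wh \Big\}\, dw\, dF_Z(z), \]
which implies that
\[ \Big| E\Big[\frac{1}{h} K\Big(\frac{W}{h}\Big) Z\Big] - E[Z \mid W = 0]\, f_W(0) \Big| \le M_1 M_2\, E[|Z|]\, h. \]
This completes the proof.

Proof of Lemma 3.3. By Taylor's expansion, for some $t_{wh} \in [0, 1]$ we have
\[ E\Big[\frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) \varphi(Z_1, Z_2) \,\Big|\, W_2, Z_2\Big] = \iint \frac{1}{h} K\Big(\frac{w - W_2}{h}\Big) \varphi(z, Z_2)\, f_{W_1 \mid Z_1}(w \mid z)\, dw\, dF_{Z_1}(z) \]
\[ = \iint K(w)\, \varphi(z, Z_2)\, f_{W_1 \mid Z_1}(W_2 + wh \mid z)\, dw\, dF_{Z_1}(z) = \iint K(w)\, \varphi(z, Z_2) \Big\{ f_{W_1 \mid Z_1}(W_2 \mid z) + \frac{\partial f_{W_1 \mid Z_1}(w' \mid z)}{\partial w'}\Big|_{w' = W_2 + t_{wh} wh}\, wh \Big\}\, dw\, dF_{Z_1}(z), \]
which implies that
\[ \Big| E\Big[\frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) \varphi(Z_1, Z_2) \,\Big|\, W_2, Z_2\Big] - E\big[\varphi(Z_1, Z_2) \mid W_2, Z_2, W_1 = W_2\big]\, f_{W_1}(W_2) \Big| \le M_1 M_2\, E\big[|\varphi(Z_1, Z_2)| \mid Z_2\big]\, h. \]
This completes the proof.

Lemma A.20. Let $D_i = (X_i, V_i, W_i)$ be i.i.d. for $i = 1, \dots, n$, and let $K(\cdot)$ be a positive kernel function such that $\int K(w)\, dw = 1$ and $\max\{\int_{-\infty}^{+\infty} |w| K(w)\, dw,\ \sup_{w \in R} K(w)\} \le M_K$ for some positive absolute constant $M_K$. Assume that, conditional on $W_i = w$ for any $w$ in the range of $W_i$ and unconditionally, $X_i$ and $V_i$ are subgaussian with parameters at most $\kappa_x^2$ and $\kappa_v^2$ respectively, for positive absolute constants $\kappa_x$ and $\kappa_v$. Assume that there exists a positive absolute constant $M$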

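such that
\[ \max\Big\{ \Big| \frac{\partial f_{W \mid (X,V)}(w \mid x, v)}{\partial w} \Big|,\ f_{W \mid (X,V)}(w \mid x, v) \Big\} \le M \]
for any $w, x, v \in R$ such that the densities are defined. Take $h \ge K_1 \{\log(np)/n\}^{1/2}$ for a positive absolute constant $K_1$, and assume that $h \le C_0$ for a positive constant $C_0$. Suppose n > max { 64c+ c + 1{logp 3 /3 3 for a positive absolute constant $c$. Consider the U-statistic
\[ U_n = \binom{n}{2}^{-1} \sum_{i<j} \frac{1}{h} K\Big(\frac{W_i - W_j}{h}\Big) (X_i - X_j)(V_i - V_j). \]
Then we have
\[ P\Big( |U_n - E[U_n]| \ge C \Big\{\frac{\log(np)}{n}\Big\}^{1/2} \Big) \le 6.77 \exp\{-(c + 1) \log p\}, \]
where

C ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ v (A.48)

with $C_1, \dots, C_4$ as defined in (3.4) and $\bar M = M + M M_K C_0$.

Proof of Lemma A.20. Denote $Z_{ij} = (X_i - X_j)(V_i - V_j)$. We apply truncation to $X_i - X_j$ at level $C_x \sqrt{\log(np)}$ and to $V_i - V_j$ at level $C_v \sqrt{\log(np)}$ for some positive absolute constants $C_x$ and $C_v$. Denote
\[ A_{[n]} = \big\{ |X_i - X_j| \le C_x \sqrt{\log(np)},\ |V_i - V_j| \le C_v \sqrt{\log(np)},\ \forall\, i, j \in [n],\ i < j \big\} \]
and first focus on the U-statistic
\[ \tilde U_n = \binom{n}{2}^{-1} \sum_{i<j} \frac{1}{h} K\Big(\frac{W_i - W_j}{h}\Big) Z_{ij}\, 1\big\{ |X_i - X_j| \le C_x \sqrt{\log(np)},\ |V_i - V_j| \le C_v \sqrt{\log(np)} \big\}. \]
Denote
\[ g(D_i, D_j) = \frac{1}{h} K\Big(\frac{W_i - W_j}{h}\Big) Z_{ij}\, 1\big\{ |X_i - X_j| \le C_x \sqrt{\log(np)},\ |V_i - V_j| \le C_v \sqrt{\log(np)} \big\} \]
and $f(D_i) = E[g(D_i, D_j) \mid D_i]$. Assume $h \ge K_1 \{\log(np)/n\}^{1/2}$ for some positive absolute constant $K_1$. Denote $\tilde X = X_1 - X_2$, $\tilde V = V_1 - V_2$ and $\tilde W = W_1 - W_2$. Note that by the argument of Lemma A.16 we have all the necessary smoothness conditions on the densities. Denote $C = C_x C_v$, and note that $|X_i - X_j| \le C_x \sqrt{\log(np)}$ and $|V_i - V_j| \le C_v \sqrt{\log(np)}$ imply that $|Z_{ij}| \le C \log(np)$.

Step I. We bound $B_g$, $B$, $E[f^2]$, $\sigma^2$ and $B_f$ as in Lemma 3.4 and apply Lemma 3.4. We have
\[ B_g \le C M_K \log(np)/h \le (C M_K / K_1) \{n \log(np)\}^{1/2}. \]
For $B_f$, apply Lemma 3.3 with $\varphi = 1$, $M_1 = M$ and $M_2 = M_K$, and we have
\[ B_f \le C \log(np)\, E\Big[\frac{1}{h} K\Big(\frac{W_i - W_j}{h}\Big) \,\Big|\, W_j\Big] \le C \log(np) \{ f_W(W_j) + M M_K C_0 \} \le C \bar M \log(np), \]
where $\bar M = M + M_K M C_0$ and the last inequality uses the fact that $f_W(W_j) \in [0, M]$.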

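For bounding $E[f^2]$, apply Lemma 3.3 with $\varphi = Z_{ij}\, 1\{ |X_i - X_j| \le C_x \sqrt{\log(np)},\ |V_i - V_j| \le C_v \sqrt{\log(np)} \}$, $M_1 = M$ and $M_2 = M_K$, and then we have
\[ |f(D_2) - f_1(D_2)| \le M_K M f_2(D_2)\, h, \]
where
\[ f_1(D_2) = E\big[ Z_{12}\, 1(|Z_{12}| \le C \log(np)) \mid W_1 = W_2, D_2 \big]\, f_{W_1}(W_2), \qquad f_2(D_2) = E\big[ |Z_{12}|\, 1(|Z_{12}| \le C \log(np)) \mid D_2 \big]. \]
Therefore we have
\[ E[f^2] = E\big[\{f - f_1 + f_1\}^2\big] \le 2 M_K^2 M^2 C_0^2\, E[f_2^2] + 2 E[f_1^2], \quad (A.49) \]
and meanwhile
\[ E[f_1^2] \le M^2 E[Z_{12}^2] \le M^2 E[\tilde X^4]^{1/2} E[\tilde V^4]^{1/2} \le 12 M^2 \kappa_x^2 \kappa_v^2 \quad \text{and} \quad E[f_2^2] \le E[Z_{12}^2] \le E[\tilde X^4]^{1/2} E[\tilde V^4]^{1/2} \le 12 \kappa_x^2 \kappa_v^2, \quad (A.50) \]
where the first inequalities are by Jensen's inequality, the second are by the Cauchy-Schwarz inequality, and the third are due to the fact that $E[\tilde X^4] \le 12\kappa_x^4$ and $E[\tilde V^4] \le 12\kappa_v^4$ (Lemma A.18). Combining (A.49) and (A.50) we have
\[ E[f^2] \le (M_K^2 M^2 C_0^2 + M^2)\, 24 \kappa_x^2 \kappa_v^2 < 24 \bar M^2 \kappa_x^2 \kappa_v^2. \quad (A.51) \]

For bounding $\sigma^2$, apply Lemma 3.2 with $Z = Z_{ij}^2$, $M_1 = M$ and $M_2 = M_K$, and then we have
\[ E[g(D_i, D_j)^2] \le \frac{M_K}{h} E\Big[\frac{1}{h} K\Big(\frac{W_i - W_j}{h}\Big) Z_{ij}^2\Big] \le \frac{M_K}{h} \big\{ E[Z_{ij}^2 \mid W_i = W_j]\, M + M M_K C_0\, E[Z_{ij}^2] \big\} \]
\[ \le \frac{M_K}{h} \big\{ E[\tilde X^4 \mid \tilde W = 0]^{1/2} E[\tilde V^4 \mid \tilde W = 0]^{1/2} M + M M_K C_0\, E[\tilde X^4]^{1/2} E[\tilde V^4]^{1/2} \big\} \le \frac{M_K}{h} \big\{ 12 \kappa_x^2 \kappa_v^2 M + 12 \kappa_x^2 \kappa_v^2 M M_K C_0 \big\} \le \frac{12 \kappa_x^2 \kappa_v^2 M_K \bar M}{K_1} \Big\{\frac{n}{\log(np)}\Big\}^{1/2}, \]
where the third inequality is by the Cauchy-Schwarz inequality and the fourth is due to the subgaussianity of $\tilde X$ and $\tilde V$, both conditional on $\tilde W = 0$ and unconditionally.

For bounding $B^2$, we have
\[ B^2 = n \sup_{D_2} E\big[ g(D_1, D_2)^2 \mid D_2 \big] \le \frac{n M_K}{h} \sup_{D_2} E\Big[ \frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) Z_{12}^2\, 1\{ |Z_{12}| \le C \log(np) \} \,\Big|\, D_2 \Big] \le \frac{n M_K}{h} C^2 \log^2(np) \sup_{D_2} E\Big[ \frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) \,\Big|\, D_2 \Big] \le \frac{C^2 \bar M M_K}{K_1} \{n \log(np)\}^{3/2}. \]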