Supplement to Adaptive Estimation of High Dimensional Partially Linear Model


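Fang Han, Zhao Ren, and Yuxin Zhu

May 6, 2017

Fang Han: Department of Statistics, University of Washington, Seattle, WA 98195, USA; e-mail: fanghan@uw.edu. Zhao Ren: Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15260, USA; e-mail: zren@pitt.edu. Yuxin Zhu: Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA; e-mail: yuzhu@jhsph.edu.

This supplementary material provides the technical proofs as well as some auxiliary lemmas. For almost all proof subsections in Section A2, we first restate the target theorem or lemma with more explicit dependence among all relevant constants, and then provide the details of its proof.

A1 Additional notation

We write $\mathbb{B}^p = \{x \in \mathbb{R}^p : \|x\|_2 \le 1\}$ and $\mathbb{S}^{p-1} = \{x \in \mathbb{R}^p : \|x\|_2 = 1\}$. Let $e_j \in \mathbb{R}^p$ be a vector that has 1 at the $j$-th position and 0 elsewhere.

A2 Technical proofs

A2.1 Proof of Theorem 2.1

Proof. By (2.1), we have $\|\tilde\theta^*_{h_n} - \theta^*\|_2 \le \rho_n$. So it suffices to show that

$$\|\hat\theta_{h_n} - \tilde\theta^*_{h_n}\|_2^2 \le 9 \tilde s_n \lambda_n^2 / \kappa_1^2$$

holds with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$, whenever $\lambda_n \le \kappa_1 r / (3 \tilde s_n^{1/2})$. We split the rest of the proof into two main steps.

Step I. Denote $\hat\Delta = \hat\theta_{h_n} - \tilde\theta^*_{h_n}$. Recall the definition of the sets $\tilde S_n$ and $\mathcal{C}_{\tilde S_n}$, and further define the function

$$F(\Delta) = \hat\Gamma_n(\tilde\theta^*_{h_n} + \Delta, h_n) - \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n) + \lambda_n \big( \|\tilde\theta^*_{h_n} + \Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1 \big).$$

For the first step, we show that if $F(\Delta) > 0$ for all $\Delta \in \mathcal{C}_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$, then $\|\hat\Delta\|_2 \le \eta$. To this end, we first show that

$$\hat\Delta \in \mathcal{C}_{\tilde S_n}. \tag{A2.1}$$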

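Applying the triangle inequality and some algebra, we obtain

$$\|\tilde\theta^*_{h_n} + \hat\Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1 \ge \|\hat\Delta_{\tilde S_n^c}\|_1 - \|\hat\Delta_{\tilde S_n}\|_1. \tag{A2.2}$$

We also have, with probability at least $1 - \epsilon_{1,n}$,

$$\begin{aligned}
\hat\Gamma_n(\tilde\theta^*_{h_n} + \hat\Delta, h_n) - \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n)
&\ge \langle \nabla \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n), \hat\Delta \rangle \\
&\ge -\|\nabla \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n)\|_\infty \|\hat\Delta\|_1 \\
&\ge -\frac{\lambda_n}{2} \big( \|\hat\Delta_{\tilde S_n}\|_1 + \|\hat\Delta_{\tilde S_n^c}\|_1 \big),
\end{aligned} \tag{A2.3}$$

where the first inequality is by convexity of $\hat\Gamma_n(\theta, h_n)$ in $\theta$ as assumed in Assumption 3, the second is by Hölder's inequality, and the last is by Assumption 2. Combining (A2.2) and (A2.3), and using the fact that $F(\hat\Delta) \le 0$, we have

$$0 \ge \frac{\lambda_n}{2} \big( \|\hat\Delta_{\tilde S_n^c}\|_1 - 3 \|\hat\Delta_{\tilde S_n}\|_1 \big),$$

thus proving (A2.1).

Next, we assume that $\|\hat\Delta\|_2 > \eta$. Then, because $\hat\Delta \in \mathcal{C}_{\tilde S_n}$ and $\mathcal{C}_{\tilde S_n}$ is star-shaped, there exists some $t \in (0, 1)$ such that $t\hat\Delta \in \mathcal{C}_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$. However, by convexity of $F$,

$$F(t\hat\Delta) \le t F(\hat\Delta) + (1 - t) F(0) = t F(\hat\Delta) \le 0.$$

By contradiction, we complete the proof of the first step.

Step II. For the second step, we show that under Assumptions 1-3 we have $F(\Delta) > 0$ for all $\Delta \in \mathcal{C}_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$ for some appropriately chosen $\eta$, and then complete the proof. Combining Assumption 3 and (A2.2), for any $\Delta \in \mathcal{C}_{\tilde S_n} \cap \{\Delta \in \mathbb{R}^p : \|\Delta\|_2 = \eta\}$, where we take $\eta = 3 \tilde s_n^{1/2} \lambda_n / \kappa_1$ and $\lambda_n \le \kappa_1 r / (3 \tilde s_n^{1/2})$ so that $\eta \le r$, we have that with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$,

$$\begin{aligned}
F(\Delta)
&\ge \langle \nabla \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n), \Delta \rangle + \kappa_1 \|\Delta\|_2^2 + \lambda_n \big( \|\tilde\theta^*_{h_n} + \Delta\|_1 - \|\tilde\theta^*_{h_n}\|_1 \big) \\
&\ge -\|\nabla \hat\Gamma_n(\tilde\theta^*_{h_n}, h_n)\|_\infty \|\Delta\|_1 + \kappa_1 \|\Delta\|_2^2 + \lambda_n \big( \|\Delta_{\tilde S_n^c}\|_1 - \|\Delta_{\tilde S_n}\|_1 \big) \\
&\ge -\lambda_n \|\Delta\|_1 / 2 + \kappa_1 \|\Delta\|_2^2 + \lambda_n \big( \|\Delta_{\tilde S_n^c}\|_1 - \|\Delta_{\tilde S_n}\|_1 \big) \\
&\ge \kappa_1 \|\Delta\|_2^2 - 3 \lambda_n \tilde s_n^{1/2} \|\Delta\|_2 / 2,
\end{aligned}$$

where the first inequality is by Assumption 3, the second is by Hölder's inequality and (A2.2), the third is by Assumption 2, and the last is due to the fact that $\|\Delta_{\tilde S_n}\|_1 \le \tilde s_n^{1/2} \|\Delta_{\tilde S_n}\|_2 \le \tilde s_n^{1/2} \|\Delta\|_2$. Then we have

$$F(\Delta) \ge \kappa_1 \eta^2 - 3 \tilde s_n^{1/2} \lambda_n \eta / 2 = 9 \tilde s_n \lambda_n^2 / (2 \kappa_1) > 0,$$

which, using the result from Step I, implies that $\|\hat\Delta\|_2^2 \le \eta^2 = 9 \tilde s_n \lambda_n^2 / \kappa_1^2$. Combining this with the bound $\|\tilde\theta^*_{h_n} - \theta^*\|_2 \le \rho_n$ from (2.1), we have

$$\|\hat\theta_{h_n} - \theta^*\|_2^2 \le \frac{18 \tilde s_n \lambda_n^2}{\kappa_1^2} + 2 \rho_n^2$$

with probability at least $1 - \epsilon_{1,n} - \epsilon_{2,n}$. This completes the proof of Theorem 2.1.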
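The arithmetic behind the constants in Step II can be checked numerically. The sketch below is only an illustration: the function name and the sample values are hypothetical and do not come from the supplement. It takes the radius eta = 3 s^{1/2} lambda / kappa_1, evaluates the lower bound kappa_1 eta^2 - (3/2) lambda s^{1/2} eta on F, and forms the final bound 2 eta^2 + 2 rho^2, which follows from the elementary inequality ||a + b||^2 <= 2||a||^2 + 2||b||^2.

```python
import math


def theorem21_constants(kappa1, lam, s, rho):
    """Evaluate the quantities appearing in Step II of the proof.

    All arguments are illustrative scalars (assumed positive):
    kappa1 -- restricted strong convexity constant
    lam    -- tuning parameter lambda
    s      -- sparsity level of the surrogate parameter
    rho    -- approximation error of the surrogate parameter
    """
    sqrt_s = math.sqrt(s)
    # Radius of the sphere on which F is shown to be positive.
    eta = 3.0 * sqrt_s * lam / kappa1
    # Lower bound on F over the cone intersected with that sphere.
    f_lower = kappa1 * eta ** 2 - 1.5 * lam * sqrt_s * eta
    # Final bound: 2 * eta^2 + 2 * rho^2 = 18 s lam^2 / kappa1^2 + 2 rho^2.
    final_bound = 2.0 * eta ** 2 + 2.0 * rho ** 2
    return eta, f_lower, final_bound


eta, f_lower, final_bound = theorem21_constants(kappa1=0.5, lam=0.1, s=4, rho=0.05)
# f_lower should agree with 9 s lam^2 / (2 kappa1), which is strictly positive.
print(eta, f_lower, final_bound)
```

With these sample values the lower bound on F coincides with 9 s lambda^2 / (2 kappa_1), and the final bound coincides with 18 s lambda^2 / kappa_1^2 + 2 rho^2, as in the statement of the theorem.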

A. Proo o Theorem 3.1 I the sequel with a slight abuse o otatio we use a equivalet represetatio o Assumptio 1 or writig P { U k E[U k A{logp/ 1/ or all k [p 1 ɛ to replace 3. otig that we assume p >. Hereater we also slight abuse o otatio ad do ot distiguish logp/ rom log p/. Theorem A.1 Theorem 3.1. Assume Assumptio 11 holds with γ = 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A {logp/ 1/ + 8κ xm ζh where A ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ u. or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305s. The uder Assumptios 4-1 we have β h β 88sλ Ml κ l with probability at least 1 1.54 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. See Proo o Theorem 3.. 5 3

A.3 Proo o Theorem 3. Theorem A. Theorem 3.. Assume Assumptio 11 holds with a geeral γ 0 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A {logp/ 1/ + 8κ xm ζh γ where A ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ u. or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305s. The uder Assumptios 4-1 we have β h β 88sλ Ml κ l with probability at least 1 1.54 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. 5 4

I additio to 3.1 deote 1 U 1k = i<j 1 U k = ad observe that i<j 1 Wij K Xijk ũ ij h h 1 Wij K Xijk XT h h ij βh β k L β { U 1k E[U 1k + U k E[U k + E[U k A.4 where U k is deied i 3.1. Apply Lemma A.0 o D i = X ik u i W i with coditios o lemma satisied by Assumptios 5 6 9 ad 10 ad the we have P{ U 1k E[U 1k A {logp/ 1/ 6.77 exp{ c + 1 log p A.5 or positive absolute costat c ad A as deied i A.48 ad whe > max { 16c + c + 1{logp 3 /3 3. Apply Lemma 3. o Z = X ijk XT ij β h β with coditios o lemma satisied by Assumptios 5 6 9 ad 11 ad the we have E[U k E [ X ijk XT ij β h β W = 0 M + MMK C 0 E [ X ijk XT ij β h β κ xm + MM K C 0 ζh γ. Combiig A.4-A.6 ad Assumptio 1 we have P { or ay k [p k L β A + A {logp/ 1/ + 4κ xm + MM K C 0 ζh γ 1 6.77 exp c log p p ɛ A.6 or positive absolute costat c ad whe we appropriately take bouded rom below. Assume λ 4A + A {logp/ 1/ + 8κ xm + MM K C 0 ζh γ which veriies Assumptio. We veriy Assumptio 3 by applyig Corollary 3. ad complete the proo by Theorem.1. A.4 Proo o Theorem 3.3 Theorem A.3 Theorem 3.3. Assume Assumptio 11 holds with a geeral γ [1/4 1. Further assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We also take λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh where A ={16 3M 1 + c 1 + 4 3C1 M 1/ K 1 1 1 + c 1 + 8C 1 + c + 8C 3 M 1 K M 1 K 1 1 1 + c 3 + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u + Cκ x η = E [ X XT W = 0. 5

Here C 1... C 4 are as deied i 3.4 C > ζ C γ 0 ad c > 0 are some absolute costats ad M = M + MM K C 0. Suppose we have { > max C ζ C γ 0 s logp 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M MK C 0 κ x Mκ x 6ep 16 q log q 4 K1 M MK κ x logp where q = 305{s + ζ h γ / logp. The uder Assumptios 4-6 8-13 we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1 19.31 exp c log p exp c ɛ p where c = κ l M l 64 /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h. We take θ h = β h such that or each j [p The uder Assumptio 11 we have β h j = { β hj i β h j > {logp/1/ ; 0 i otherwise. ρ s logp/ + ζ h γ s s + ζ h γ logp. 5 A.7 A.8 We veriy Assumptio by applyig Lemma A.4 below with A = A + A veriy Assumptio 3 by applyig Corollary 3. uder Assumptio 13 ad complete the proo by Theorem.1. 6
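The surrogate parameter used in the proof of Theorem 3.3 is a hard-thresholded version of the kernel-smoothed target: coordinate j is kept when its magnitude exceeds {log p / n}^{1/2} and is set to zero otherwise. A minimal sketch of that rule follows; the function names and the example vector are hypothetical and serve only to illustrate the thresholding step.

```python
import math


def hard_threshold(beta, n, p):
    """Keep beta_j if |beta_j| > sqrt(log(p) / n); otherwise set it to 0."""
    tau = math.sqrt(math.log(p) / n)
    return [b if abs(b) > tau else 0.0 for b in beta]


def support_size(beta):
    """Number of nonzero coordinates, i.e. the sparsity of the vector."""
    return sum(1 for b in beta if b != 0.0)


# Illustrative values only: n = 100 observations, p = 4 coordinates.
beta_h = [0.5, 0.01, -0.3, 0.0]
beta_tilde = hard_threshold(beta_h, n=100, p=4)
print(beta_tilde, support_size(beta_tilde))
```

The thresholded vector trades a small approximation error (the discarded coordinates are each at most {log p / n}^{1/2} in magnitude) for a smaller support, which is how the proof obtains its bounds on the approximation error and the sparsity of the surrogate.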

Lemma A.4. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. Deote η = E [ X XT W = 0. We also take λ 4A + A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh where A ad A are as speciied i A.48 ad C > ζ C γ 0 is some positive absolute costats. Suppose we have > max { C ζ C γ 0 s logp 64c + c + 1{logp 3 /3 3 or positive absolute costat c > 0. The uder Assumptios 4-6 8-1 we have P k L β h h λ or all k [p 1 13.54 exp c log p ɛ p. Proo o Lemma A.4. I additio to 3.1 deote 1 1 U 1k = K h i<j 1 U k = i<j 1 U 3k = i<j Wij h Xijk ũ ij 1 Wij K Xijk XT h h ij β β h 1 Wij K Xijk XT h h ij βh β h ad observe that k L β h h U 1k E[U 1k + U k E[U k + U k E[U k + E[U 3k A.9 where i decomposig the let had side we have utilized the act that E[ k L β h h = 0. Result o A.0 holds thus boudig U 1k E[U 1k i.e. P { U 1k E[U 1k A {logp/ 1/ 6.77 exp{ c + 1 log p. A.10 We boud the rest o the compoets o the right had side o the last display. We have β β h s logp/+ζ h γ < C or some positive absolute costat C > ζ C γ 0 whe > C ζ C γ 0 s logp. Apply Lemma A.0 o D i = X ik Xi Tβ β h W i with coditios o lemma satisied by Assumptios 5 6 9 ad that β β h < C ad we have P { U k E[U k A {logp/ 1/ 6.77 exp{ c + 1 log p A.11 or positive costats A ad c ad whe we assume > max { 64c + c + 1{logp 3 /3 3. Here A is as speciied i A.48. Apply Lemma 3.3 with coditios o lemma satisied by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad we have E[U 3k ME [ X ijk XT ij β h β h Wij = 0 + MM K h E [ X ijk XT ij β h β h Mη {logp/ 1/ + MM K C 1/ κ xh A.1 where the secod iequality is due to Cauchy-Schwarz ad Assumptio 9 Lemmas A.17 ad A.18. 7

Combiig A.9-A.1 ad Assumptio 1 we have P { or ay k [p k L β h h { A + A + A + Mη { logp 1/ + 4MM K C 1/ κ xh 1 13.54p exp{ c + 1 log p ɛ p or positive absolute costat c ad whe we appropriately take bouded rom below. Here A ad A are as speciied i A.48. Assume λ 4A +A +A+Mη {logp/ 1/ +8MM K C 1/ κ xh. This completes the proo. A.5 Proo o Theorem 3.4 Theorem A.5 Theorem 3.4. Assume h C 0 or positive costat C 0 ad that h 4MM K κ x 1. Uder Assumptios 4-6 7 8-9 ad 14 ad whe g is L α-hölder or α 1 g has bouded support whe α > 1 we have where { ζ = max 4 L α MM K + MM K Eũ / β h β ζh where L α is the Lipschitz costat or g L α = L whe α = 1. 1/ 16κ x M + MM K C 0 1/ L αmm K Proo. Reer to Proo o Theorem 3.5 whe g is L 1-Hölder takig M g = L ad M d = M a = 0 i which case Assumptio 15 is ot eeded. Note that higher-order Hölder with compact support implies L 1-Hölder. Thus we complete the proo. A.6 Proo o Theorem 3.5 Theorem A.6. Assume h C 0 or positive costat C 0 ad that h 4MM K κ x 1. Uder Assumptios 4-6 7 8-9 ad 14-15 we have where β h β ζh γ { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. Proo o Theorem 3.5. We prove the lemma i three steps. Step I. We show that L 0 βh L 0β is lower bouded or L 0 β = E [ Ỹ X T β W = 0 0. By Assumptios 8 ad 7 W we have L 0 β [ λ mi β = λ mi E X XT W = 0 W 0. 8

Thereore or some β t = β h + tβ β h t [0 1 we have L 0 β h L 0β = 1 β h β T L 0 β β β=βt β h β β h β. Step II. We show that L h β L 0 β is upper bouded. Observe that [ 1 L h β L 0 β E h K W { h X T β β E [ { X T β β W = 0 W 0 + E h K Wij {gwi gw j h [ 1 + E h K W ũ E [ ũ W = 0 h W 0 [ 1 + E h K Wij XT β β { gw i gw j. h Ad we boud each compoet o the right had side o above iequality. By Taylor s expasio we have [ 1 E h K W { h X T β β E [ { X T β β W = 0 W 0 1 w = h K v h W XT β β w v dw df XT β β v = = v W XT β β 0 v df XT β β v + W XT β β Kwv { W XT β β wh v W XT β β 0 v dw df XT β β v Kwv { W XT β β w v wh w 0v w v τwhv w h dw df XT β β v w A.13 where because W X T β β ad W X T β β are idetically distributed we have Kwv { W XT β β w v wh dw df w 0v XT β β v = Kwv { W XT β β w v + W w v XT β β wh dw df w 0v w 0 v XT β β v =0. 0 Thereore usig Assumptios 5 6 Lemmas A.15 ad A.16 ad 14 we urther have [ 1 E h K W { XT β β E [{ h XT β β W = 0 W 0 = Kwv { W w v XT β β τwhv w w h dw df XT β β v MM K E [ { X T β β h MM K κ x β β h. A.14 9

Usig a idetical argumet by Assumptios 5 6 Lemmas A.15 ad A.16 ad iite secod momet assumptio E[ũ < we have [ 1 E h K W ũ E[ũ W = 0 h 0 W MM K E[ũ h. A.15 By Assumptio 15 we have E h K Wij {gwi gw j h Mg E h K W W α + Md h E h K Wij 1I { W i W j A h Mg E h K W W α + Md h M ah where E h K W W α = Kw w α h α wh dw MM h W K h α. Thereore we have E h K Wij {gwi gw j Mg MM K h α + Md h M ah. A.16 By A.14 A.16 ad applyig Hölder s iequality we also have [ 1 E h K Wij XT h ij β β { gw i gw j E h K Wij { XT h ij β β 1/ E h K Wij {gwi gw j 1/ A.17 h MM K κ x β β h + κ x β β M 1/ M g MM K h α + Md M ah 1/ a 1 β β h γ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise ad a 1 = κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/. Combiig A.13-A.17 we have L h β L 0 β a 1 β β h γ + a h γ + a 3 β β h where a = M g MM K C α γ 0 + M d M ac 1 γ 0 + MM K Eũ C γ 0 ad a 3 = MM K κ x. Step III. We combie Step I ad Step II ad veriy Assumptio 11. Usig results rom Step I ad Step II we have β h β L 0 β h L 0β Whe h /a 3 we have = L 0 β h L hβ h + L hβ L 0 β + L h β h L hβ L 0 β h L hβ h + L hβ L 0 β a 1 β h β h γ + a h γ + a 3 β h β h. β h β 4a 1 β h β h γ + 4a h γ 10

which urther implies that This completes the proo. A.7 Proo o Corollary 3.1 { βh 8a 1/ 8a 1 β max h γ. Corollary A.1 Corollary 3.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We deote c to be some positive absolute costat c = κ l M l 64κ lm l /6 {3M κ x + M MK C 0 κ x Mκ x M = M + MM K C 0 ad C 1... C 4 as deied i 3.4 Also deote ad τ 1 = + c 1/ κ x K 1 1 BM KC a 0 + DM K τ = + c 1/ κ x {BM K M1 + C 0 C a 0 + DM τ 3 = 4M KM BC a 0 + D 1 + C 0 κ x τ 4 = { 4B MM K κ x1 + C 0 C a γ 1 0 + D 1M κ 4 x 1/ E 1/ C 1/ γ 1 0 τ 5 = 4 + cκ x{bmm K 1 + C 0 C a 0 + D M M K K 1 1 MK K γ 1 1 A ={16 3M 1 + c 1 + 4 3C1 M 1 K 1 1 1 + c 1 + 8C 1 + c + 8C 3 M 1 K M 1 K 1 1 1 + c 3 + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u + Cκ x A =4τ 1/ 3 1 + c 1/ + C 1 τ 1/ 4 1 + c 1/ + C τ 1 + c + C 3 τ 1/ 5 1 + c 3/ + C 4 τ 1 1 + c + 4M BC a 0 + D c + κ x 11

where γ 1 = mi { a 1 1/. Cosider lower boud o { > max 64c + c + 1{logp 3 /3 64c + c + 1τ τ3 1 {logp4 {logp 5/3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3 C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 4 5 q 4 9 5 {logp [ 10 6 6 + c 3 C 4 κ /3 x q /3 {logp 5/3 K 1 11 6 0 + 7.5cc + M κ x q{logp 6 3q 0 + 7.5cM κ x logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp. 6ep q 5 A.18 Here q B D E ad a are to be speciied i dieret cases. Suppose that Assumptios 4-6 7 8-10 ad 14 hold. 1 Assume that g is L α-hölder or α 1 ad g has bouded support whe α > 1. Also suppose A.18 holds with q = 305s. We take B = L α where L α is the Lipschitz costat or g L α = L whe = 1 D = E = 0 a = 1 ad assume λ 4A + A {logp/ 1/ + 8κ xm ζh where ζ = max The we have { 4 L α MM K + MM K Eũ / β h β 88sλ Ml κ l with probability at least 1 17.81 exp c log p exp c. 1/ 16κ x M + MM K C 0 1/ L αmm K. Assume that Assumptio 15 holds with α 0 1. Suppose that A.18 holds with q = 305s ad we take B = M g D = M d E = M a ad a = α. Further assume that 1

λ 4A + A {logp/ 1/ + 8κ xm ζh γ where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The we have β h β 88sλ Ml κ l with probability at least 1 17.81 exp c log p exp c. 3 Assume that Assumptio 15 holds with α [1/4 1. Suppose that A.18 holds with q = 305{s + ζ h γ / logp ad take B = M g D = M d E = M a ad a = α. Deote C to be some positive absolute costat C > ζ C γ 0 ad suppose C ζ C γ 0 s logp where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ where γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. Further assume λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh. The we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1 4.58 exp c log p exp c. Proo. We prove the corollary or the case whe g is Lipschitz. We veriy Assumptios 11 ad 1 ad the apply Theorem 3.1. Assumptio 11 is veriied by applyig Theorem 3.4 ad Assumptio 1 is veriied by applyig Lemma A.1. We complete the proo by Theorem 3.1. The rest o the corollary ca be proved based o similar argumets. A.8 Proo o Lemma 3.1 Lemma A.7 Lemma 3.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. Further assume λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h. Here A is as speciied i A.48 ad A as i A.53. Suppose we have > max { 64c + c + 1{logp 3 /3 64c + 3 c + 1{logp 4 {logp 5/3 3 or positive absolute costat c > 0. The uder Assumptios 5 6 ad 9 10 16 we have P k L β h λ or all k [p 1 1.04 exp c log p. 13
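Lemma 3.1 controls the sup-norm of the gradient of the pairwise-difference loss at the true parameter. For intuition, the sketch below computes such a loss and its gradient, assuming the loss is the kernel-weighted average over pairs i < j of (1/h) K((W_i - W_j)/h) ((Y_i - Y_j) - (X_i - X_j)^T beta)^2, which is the form suggested by the U-statistics U_1k and U_2k in the proof. The triangular kernel, the function names, and the toy data are assumptions made purely for illustration and are not taken from the paper.

```python
from itertools import combinations


def triangular_kernel(u):
    """Illustrative kernel with values in [0, 1] and support [-1, 1]."""
    return max(0.0, 1.0 - abs(u))


def pairwise_loss_and_grad(X, Y, W, beta, h, kernel=triangular_kernel):
    """Kernel-weighted pairwise-difference squared loss and its gradient."""
    n, p = len(X), len(beta)
    n_pairs = n * (n - 1) / 2
    loss = 0.0
    grad = [0.0] * p
    for i, j in combinations(range(n), 2):
        weight = kernel((W[i] - W[j]) / h) / h
        if weight == 0.0:
            continue  # pairs with distant W do not contribute
        x_diff = [X[i][k] - X[j][k] for k in range(p)]
        resid = (Y[i] - Y[j]) - sum(x_diff[k] * beta[k] for k in range(p))
        loss += weight * resid ** 2
        for k in range(p):
            # Derivative of weight * resid^2 with respect to beta_k.
            grad[k] -= 2.0 * weight * x_diff[k] * resid
    return loss / n_pairs, [g / n_pairs for g in grad]


def sup_norm(v):
    """Largest absolute coordinate of v."""
    return max(abs(x) for x in v)


# Toy data without noise or nonparametric component, so the gradient
# at the true coefficient vector vanishes exactly.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
W = [0.0, 0.1, 0.2, 0.3]
beta_true = [1.0, -2.0]
Y = [sum(x * b for x, b in zip(row, beta_true)) for row in X]
loss, grad = pairwise_loss_and_grad(X, Y, W, beta_true, h=0.5)
print(loss, sup_norm(grad))
```

In the setting of the lemma, noise and the nonparametric component make this gradient nonzero at the true parameter, and the lemma's bound is what allows the tuning parameter to dominate its sup-norm with high probability.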

Proo. Deote 1 U 1k = i<j 1 U k = i<j 1 Wij K Xijk ũ ij h h 1 Wij { K Xijk gwi gw j h h ad observe that k L β h { U 1k E[U 1k + E[U 1k + U k E[U k + E[U k. A.19 Apply Lemma A.0 o D i = X i u i W i with coditios o lemma satisied by Assumptios 5 6 9 10 we have P U 1k E[U 1k A{logp/ 1/ 6.77 exp{ c + 1 log p A.0 or positive absolute costat A ad c ad whe assumig > max { 64c+ c+1{logp 3 /3 3. Here A is as speciied i A.48. Apply Lemma A.1 o D i = X i gw i W i with coditios o lemma satisied by Assumptios 5 6 9 16 we have P U k E[U k A {logp/ 1/ 5.7 exp{ c + 1 log p A.1 or positive costats A ad c ad whe assumig > max { 64c+ 3 c+1{logp 4 {logp 5 3. Here A is as speciied i A.53. By idepedece o u ad X W we have E[U 1k = 0. We also have Wij E[U k M g E K Xijk Wij h h =M g Kw xwh Wij X w x dw df ijk Xijk x =M g { Kw xwh Wij X 0 x + ijk M g M K ME [ X ijk Wij = 0 h + M g M K ME[ X ijk h M g M K Mκ x 1 + C 0 h A. Wij X w x ijk w wh dw df twhx Xijk x A.3 where the irst iequality is by Assumptio 16 the secod equality is by deiitio the third equality by Taylor s expasio at w = 0 t [0 1 the third iequality is by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad the last iequality is by Assumptio 9 Lemma A.17. Combiig A.19-A.3 we have P { or ay k [p k L β h A + A {logp/ 1/ + M g M K Mκ x 1 + C 0 h 1 1.04 exp c log p or positive absolute costat c ad whe we appropriately take bouded rom below. Thus we 14

have completed the proo by otig that λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h. A.9 Proo o Theorem 3.6 Theorem A.8 Theorem 3.6. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume that h C 0 or positive costat C 0. Further assume λ 4A+A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h where A ={16 3M 1 + c 1/ + 4 3C 1 M 1/ K 1/ 1 1 + c 1/ + 8C 1 + c + 8C 3 M 1/ K M 1/ K 1/ 1 1 + c 3/ + 8C 4 M K K 1 1 1 + c + 8M c + κ x κ u A =8MM K M g C 0 1 + C 0 κ x 1 + c 1/ + C 1 M g M 1/ M 3/ K κ1/ x 1 + C 0 1/ C 5/4 0 K 1/4 1 1 + c 1/ + C MM K M g 1 + C 0 κ x K 1 1 + c 3/ + 4C 3 MM 3/ K M 1/ g 1 + C 0 1/ C 1/ 0 κ x 1 + c + C 4 M K M g C 0 κ x K 1 1 1 + c5/ + MM K M g 1 + C 0 C 0 or positive absolute costat c M = M + MM K C 0 ad C 1... C 4 as deied i 3.4. Suppose we have { > max 64c + c + 1{logp 3 /3 64c + 3 c + 1{logp 4 {logp 5/3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3 C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 4 5 q 4 5 {logp 9 5 [ 10 6 6 + c 3 C 4 κ x K 1 /3 q /3 {logp 5/3 11 6 0 + 7.5cc + M κ x q{logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp. where q = 305s. The uder Assumptios 4-10 ad 16 we have 6 3q 0 + 7.5cM κ x logp 6ep q A.4 β h β 88sλ Ml κ l 15

with probability at least 1 17.81 exp c log p exp c where c = κ l M l 64κ lm l /6 {3M κ x + M M KC 0κ x Mκ x. Proo o Theorem 3.6. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by applyig Lemma 3.1 ad veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. A.10 Proo o Theorem 3.7 Theorem A.9 Theorem 3.7. For q [p suppose that { 48 6MK κ > max xq 384 K 1 p{logp 1/ 6M κ xq /3 144κ 4 x tp K1 p logp [ 768 3 + c 1/ C 1 M 1/ K 1/ 1 t K M 1/ κ x 4/3 q 4/3 {logp 1/3 [ 960 + 7.5c + cc M κ 1/ x q 1/ logp t 3 [ 96c + C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1 t [ 384 6 + c 3 C 4 κ x /3q /3 {logp 5/3 K 1 t 7680 + 7.5cc + M κ x q{logp 1q t 0 + 7.5cM κ xt logp 1 {3M κ x + M MK C 0 κ x Mκ x 6ep t q log 16t q 16 K1 M MK κ x logp t or positive absolute costat t ad c > 1. Uder Assumptios 5 6 ad 9 we have T E T q t 4/5q 4 5 {logp 9 5 A.5 with probability at least 1 5.77 exp c log p exp c where c = t 4t/[ 8 {3M κ x + M M K C 0 κ x Mκ x. Proo. We deote 1 X h = h 1/ Σ h = E K h K 1/ Wij h XT ij W h X XT. p to be a p matrix 16

Ad we aim to show that with high probability 1 v T Xh T X h v v T Σ h v θ v or all v R p v 0 q simultaeously holds or some θ > 0 uder coditios o Theorem 3.7. We split the proo ito three steps. Step I. For set J [p cosider E J S p 1 where E J = spa { e j : j J. Costruct ɛ-et Π J such that Π J E J S p 1 ad Π J 1 + ɛ 1 q. The existece o Π J ca be guarateed by Lemma 3 o Rudelso ad Zhou 013. Deie Π = J =q Π J the or 0 < ɛ < 1 to be determied later we have 3 q p 3ep q { 6ep Π = exp q log. ɛ q qɛ q For ay v E J S p 1 let Πv be the closest poit i ɛ-et Π J. The we have v Πv E J S p 1 ad v Πv ɛ. v Πv Step II. Deote D i = W i X i V i or i [ ad D = W X V to be a i.i.d copy. We upper boud { 1 P g v D i D j µ v θ max v Π or some θ > 0 where g v D i D j = 1 Wij K h h X ijv T ad µ v = E[g v D i D j. Also deote v D i = E [ g v D i D j Di. Observe that 1 g v D i D j µ v i<j 1 { { gv D i D j v D i v D j + µ v + v D i µ v. i<j i<j We boud two compoets o the right had side o iequality above separately ad the combie the result. Step II.1. We boud 1 P i=1 i=1 { v D i µ v t A.6 or t > 0 to be determied ad or each v E J S p 1. Apply Lemma 3.3 with coditios o lemma satisied by Assumptios 5 Lemma A.15 ad 6 Lemma A.16 ad we have v D i 1 D i MM K h D i A.7 where 1 D i = E [ X T ij v Wij = 0 D i W W i ad D i = E [ X T ij v X i. Also we have µ v µ 1 MM K h µ A.8 where µ 1 = E[ X T ij v W ij = 0 W 0 ad µ = E[ D i = E[ X T ij v. Ad we boud A.6 as 17

below. We have 1 { P v D i µ v t i=1 =P { e a i=1 vd i µ v e at e at E [ { e a i=1 vd i µ v e at E [ e a [ i=1 { 1 D i µ 1 +MM K h { D i µ e MM K h µ a e at E [ e a i=1 { 1D i µ 1 1/ [ E e MM K C 0 a i=1 { D i µ 1/ e 4κ xmm K h a e at E [ e Ma i1 E[ XT ij v W ij =0D i E[ X ij T v W ij =0 1/ [ E e a κ x i=1 W W i E[ W W i 1/ E [ e MM KC 0 a i=1 { D i µ 1/ e 4κ x MM Kh a [ e at E [ e am i=1 E [ e MM KC 0 a i=1 { X i X i T v E[ X ij T v W ij =0 W i = W i {Xi X i T v µ 1/ e 4κ x MM Kh a e at e M κ 4 x a e M κ 4 x a e M M K C 0 κ4 x a e 4MM Kκ x ha 1/ e a κ 4 x M 1/ or 0 < a 4Mκ x 1 where the irst iequality is by Markov s the secod is a applicatio o A.7 ad A.8 the third is by Cauchy-Schwarz ad the result that µ κ x Assumptio 9 Lemma A.17 ad Lemma A.18. The ourth iequality is by otig that W 0 = E[ W W i ad applyig the ollowig iequality V 1 V E[V 1 E[V V 1 E[V 1 V + E[V 1 V E[V where V 1 = E[ X ij Tv W ij = 0 D i E[V 1 κ x by Assumptio 9 Lemma A.17 ad Lemma A.18 ad V = W W i [0 M. For the ith iequality the secod compoet i product is bouded due to Jese s iequality where X i W i i = 1... are idepedet copies o X i W i ; the third is bouded because W W i [0 M ad E[ X ij Tv W ij = 0 κ x by Assumptio 9 Lemma A.17 ad Lemma A.18. The sixth iequality is agai a applicatio o Assumptio 9 Lemma A.17 ad Lemma A.18. Take a = 1 t a 1 1 ad h t 4a 1 where a 1 = M κ 4 x + M MK C 0 κ4 x + M κ 4 x Mκ x ad a = 4MM K κ x. The we urther have 1 { { t t P v D i µ v t exp. 8a 1 i=1 By the same argumet we have 1 { P v D i µ v t i=1 i=1 { t t exp. 8a 1 We take t = θ/4 ad have 1 { θ { θ 4θ P v D i µ v exp. A.9 4 18a 1 18

Step II.. Observe that 1 { 1 { gv D i D j v D i v D j + µ v s max ϕ kl D i D j kl i<j i<j where ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl E K h h h Wij [ E K Xijk XijlDj 1 + E h h We the boud i<j ϕ kld i D j or each k l [p. h Xijk Xijl D i h K Wij h Xijk Xijl. Apply trucatio X ik E[X ik τ / or each i [ k [p ad τ = 6+c 1 κ x {logp 1 or positive absolute costat c. Deie evets A i = { X ik E[X ik τ k [p A [ = { X ik E[X ik τ i [ k [p. Cosider trucated U-statistic i<j ϕ kld i D j where ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl 1IA i A j E K Xijk Xijl D i 1IA i h h h h Wij [ E K Xijk Xijl 1 Wij D j 1IA j + E K Xijk Xijl. h h h h First we boud E[ϕkl D i D j. We have E[ϕ kl D i D j [ 1 [ { Wij 1 = E K Xijk Xijl 1IA c i A c Wij h h j E E K Xijk XijlDi 1IA c i h h A.30 [ 1 [ { Wij E K Xijk Xijl 1IA c i A c E 1 Wij h h j + E K Xijk Xijl D i 1IA c. i h h We have [ 1 Wij E K Xijk Xijl 1IA c i A c 1 h h j MK E[ h X X ijk ijl 1/ PA c i A c j 1/ M K 1 h E[ X 4 ijk 1/4 E[ X 4 ijl 1/4 PA c i A c j 1/ M K 1 h 1κ 4 x 1/ p 1 3 p 3 1/ 6M K κ x K 1 p{logp 1/ θ 4q A.31 where the irst ad secod iequalities are by Cauchy-Schwarz the third is by subgaussiaity o X i X j the ourth is by choice o h ad the last holds true whe we have 48 6M K κ xq K 1 θ{logp 1/ p. 19

We also have [ { 1 Wij [ { E E K Xijk Xijl D i 1IA c 1 Wij 1/ i E E K Xijk Xijl D i PA c h h h h i 1/ { 4M + M K C 0 κ 4 x 1/ θ 48q 1 3/ p A.3 where the irst ieuqlity is by Cauchy-Schwarz the secod is by A.51 ad subgaussiaity o X i Assumptio 9 ad the last holds true whe we have { 96 6M + MMK C 0 κ xq /3. θp Combiig A.30 A.31 ad A.3 we have E[ϕ kl D i D j θ 1q A.33 whe we appropriately choose bouded rom below. Next we boud i<j ϕ kld i D j by applyig Lemma 3.4. We boud costats i Lemma 3.4 as ollows. For boudig B g we have B g 4M K τ h 1 boudig B we have E [ ϕ kl D i D j D j E K h + E K h E K h Wij h X ijk Xijl 1IA i A j [ { 1 D j + E E Wij h X ijk Xijl D j 1IA j + E h X ijk Xijl 1IA i A j D j + E Wij + E K h Wij h X ijk Xijl. {4 6 + cm K κ x K 1 1 { logp1/. For Wij h X ijk Xijl D i 1IA i K h Wij K h h X ijk Xijl h K Wij h X ijk Xijl D j 1IA j A.34 Apply Lemma 3.3 o ϕ = 1 with M 1 = M ad M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl 1IA i A j D i A.35 τtm + MM K C 0 = 6c + M + MM K C 0 κ x logp. Apply Lemma 3.3 o ϕ = X ijk Xijl with M1 = M ad M = M K as give by Assumptios 6 0

Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl D j 1IA j M E [ X ijk Xijl Dj W ij = 0 1IA j + MM K C 0 E [ X ijk Xijl Dj 1IAj ME[ X ijk D j W ij = 0 1/ E[ X ijl D j W ij = 0 1/ 1IA j + MM K C 0 E[ X ijk D j 1/ E[ X ijl D j 1/ 1IA j 1.5c + 4 M + MM K C 0 κ x logp where the secod iequality is by Cauchy Schwarz ad the last is due to E[ X ijk Dj 1IA j = { E[X ik E[X ik + X ik E[X jk 1IA j κ x + τ /4 1.5c + 4κ x logp ad based o a idetical argumet E[ X ijk D j W ij = 0 1IA j 1.5c + 4κ x logp A.36 or ay k [p. Apply Lemma 3. o Z = X ijk Xijl ad with M 1 = M M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have Wij E K h h X ijk Xijl M + MM K C 0 κ A.37 x Combiig A.34-A.37 we have B 0 + 7.5c M + MM K C 0 κ x logp. For boudig E [ E { ϕ kl D i D j Dj we observe that ϕ kl D i D j = 1 [ Wij 1 Wij K Xijk Xijl 1IA i A j E K Xijk Xijl 1IA i A j D i h h h h Wij E K Xijk Xijl 1IA i A j Wij D j + E K Xijk Xijl 1IA i A j h h h h Wij + E K Xijk Xijl 1IA i A j Wij D i E K Xijk Xijl D i 1IA i h h h h Wij + E K Xijk Xijl 1IA i A j Wij D j E K Xijk Xijl D j 1IA j h h h h [ Wij 1 Wij + E K Xijk Xijl E K Xijk Xijl 1IA i A j h h h h which urther implies that E [ ϕ kl D i D j D j [ 1 [ Wij E K Xijk Xijl 1IA c E 1 Wij h h j + K Xijk Xijl 1IA c i D j h h Wij + E K Xijk Xijl 1IA c i A c h h j 1

Thereore we have E [ E { ϕ kl D i D j D j 3 { E[ h Xijk Xijl 1IA c j + E[ X ijk Xijl 1IA c i + E[ X ijk Xijl 1IA c i A c j 3 { E[ K1 logp X4 ijk 1/ E[ X ijl 4 1/ PA c j + E[ X ijk 4 1/ E[ X ijl 4 PAc i A c j 3 K1 logp 1 1κ4 x 3 p + 1κ4 x 3 p 1 where the irst iequality is due to the act that K [0 1 ad by Jese s iequality the secod is by Cauchy-Schwarz the third by subgaussiaity o X i X j ad X ij ad last holds true whe we have 144κ 4 x K1 p logp. For boudig σ apply Lemma 3. o Z = X ijk X ijl with M 1 = M ad M = M K as give by Assumptios 6 Lemma A.16 ad 5 Lemma A.15 we have σ 16M [ K 1 Wij E K X h h h ijk X ijl 16M K { [ M E X h ijk X ijl Wij = 0 + MM K C 0 E [ X ijk X ijl 16M { K ME[ h X 4 ijk Wij = 0 1/ E[ X 4 ijl Wij = 0 1/ + MM K C 0 E[ X ijk 4 1/ E[ X ijl 4 1/ 19M KM + MM K C 0 κ 4 { x 1/ K 1 logp where the third iequality is by Cauchy-Schwarz ad the last is by subgaussiaity o X ad choice o h. For boudig B we have B = sup E [ ϕ kl D i D j D j D j 4M K h 1 + 4 sup E D j + 4 sup E D j Wij sup E K X D j h h ijk X ijl 1IA i A j D j [ { 1 E K h [ { 1 E K h Wij + 4E K Xijk Xijl h h Wij Xijk Xijl D i 1IAi h Wij Xijk Xijl D j 1IAj h 4M KM τ 4 + 19M h κ4 x + 8M κ x {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x { logp 3/

where M = M + MM K C 0. We take t = θ 1q u = + c log p ad require that { 48 6MK κ > max xq 96 K 1 p{logp 1/ 6M κ xq /3 144κ 4 x θp K1 p logp 9 3 + c 1/ C 1 M 1/ K 1/ 1 θ K M 1/ κ x 4/3 q 4/3 {logp 1/3 [ 40 + 7.5c + cc M κ 1/ x q 1/ logp θ 3 [ 4c + C 3 {144 + c M K M κ 4 xk 1 θ 1 + 19M κ4 x + 8M κ 4 x 1/ [ 96 6 + c 3 C 4 κ x /3q /3 {logp 5/3 K 1 θ 190 + 7.5cc + M κ x q{logp 1q θ 0 + 7.5cM κ xθ logp 4 5 q 3 {logp 9 5 A.38 or some positive absolute costat c ad C 1... C 4 as deied i 3.4. The by Lemma 3.4 we have 1 P ϕ kl D i D j E[ϕ kl D i D j 5θ 1q i<j exp{ + c log p +.77 exp{ + c log p Combied with A.30 the last display urther implies that 1 P ϕ kl D i D j θ q i<j 1 P ϕ kl D i D j θ q A [ + PA c [ i<j 1 P ϕ kl D i D j E[ϕ kl D i D j i<j 5θ + PA c [ 1m exp{ + c log p +.77 exp{ + c log p + p exp{ + c logp 5.77 exp{ 1 + c log p or positive absolute costat c. 3

Step II.3 Combiig results o Step II.1 Step II. ad Step I whe we have A.38 ad that { 56{3M κ x + M MK > max C 0 κ x Mκ x 3ep θ q log 4096K 1 M MK κ x logp 4θ qɛ θ we have P max v Π { 1 i<j g v D i D j µ v θ 5.77 exp{ c + 1 log p + exp c where c = θ 4θ/[56{3M κ x + M MK C 0 κ x Mκ x. Step III. Deote 1/ Γ = X h Σ1/ h. From Step II. we have that with probability at least 1 5.77 exp{ c + 1 log p exp c simultaeously or all v 0 Π which urther implies that Γv 0 θ Γv 0 θ 1/. The we obtai bouds o etire E J S p 1 by approximatio. For ay v E J S p 1 or some J = q deote v 0 = Πv. We have Deie Γ EJ which urther implies that = sup y EJ S p 1 Take ɛ = 1/ the we have Γv ΓΠv + Γ{v Πv. Γy. The by A.39 we have Γ EJ θ 1/ + ɛ Γ EJ Γ E J We take θ = 4θ. This completes the proo. A.11 Proo o Lemma 3.4 θ 1 ɛ. Γ E J 4θ. A.39 Proo. Deote µ = E [ gz 1 Z z = z µ gz i Z j = gz i Z j Z i Z j + µ ad D g = i<j gz i Z j. Also deote g = B g = B σ = E [ gz 1 Z ad B = sup E [ gz z z D = sup {E [ gz i Z j a i Z i b j Z j : E [ i<j i= a i Z i 1 E [ 1 j=1 b j Z j 1. 4

Hoedig decompositio gives us U g E[U g = 1 Z i + D g where D g is a degeerate U-statistic o bouded kerel. By Berstei iequality we have P Z i t t /8 1 exp 1 i=1 E [ Zi + B t/6 1 t / A.40 exp 8E [ Zi + B t/ whe 3. By Theorem 3.4 i Houdré ad Reyaud-Bouret 003 or ay u > 0 we have P D g C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u /4 C 5 e u i=1 A.41 where positive absolute costats C 1... C 5 are as deied i 3.4. Combiig A.40 ad A.41 we have P U g E[U g t + C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u P Z i i=1 t 1 t / exp 8E [ X + B + C 5 e u. t/ + P D g C 1 σu 1/ + C Du/4 + C3 Bu 3/ + C 4 Bg u /4 A.4 It is easy to see that B g B g + 3B 4B g B B ad E [ Z E [ Z. It remais to boud σ B ad D. By some algebra we have which implies that ad that Meawhile we have E [ gx 1 X X E [ gx1 X X σ = E [ gx 1 X = E [ E { gx 1 X X E [ E { gx 1 X X = E [ gx 1 X = σ B sup X E [ gx 1 X X sup X E [ gx 1 X X = B. E [ gx i X j X j 4B. 5
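The Hoeffding decomposition invoked at the start of this proof can be checked numerically on a toy kernel. The following is a minimal sketch, not part of the original argument. It assumes the normalized U-statistic $U_n(g) = \binom{n}{2}^{-1}\sum_{i<j} g(Z_i, Z_j)$ and uses the illustrative kernel $g(x, y) = (x - y)^2/2$ with standard normal data (our choice, not the paper's), for which $\mu = 1$, $\bar f(z) = (z^2 - 1)/2$ and $\bar g(x, y) = -xy$. The function name `hoeffding_check` is hypothetical.

```python
import itertools
import random


def hoeffding_check(z):
    """Return (U_n(g) - mu, linear part + degenerate part) for g(x, y) = (x - y)^2 / 2.

    Assumes Z ~ N(0, 1), so that mu = 1, f_bar(z) = (z^2 - 1) / 2 and
    g_bar(x, y) = g(x, y) - f(x) - f(y) + mu = -x * y (illustrative choices).
    """
    n = len(z)
    pairs = list(itertools.combinations(range(n), 2))
    n_pairs = len(pairs)

    # Normalized U-statistic with kernel g.
    u_stat = sum(0.5 * (z[i] - z[j]) ** 2 for i, j in pairs) / n_pairs
    mu = 1.0

    # Linear (projection) term: (2 / n) * sum_i f_bar(Z_i).
    linear = (2.0 / n) * sum(0.5 * (zi * zi - 1.0) for zi in z)

    # Degenerate term: normalized sum of g_bar over pairs i < j.
    degenerate = sum(-z[i] * z[j] for i, j in pairs) / n_pairs

    return u_stat - mu, linear + degenerate


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(50)]
lhs, rhs = hoeffding_check(sample)
```

For this kernel the two sides agree up to floating-point error for any input, since the decomposition is an algebraic identity once $\mu$ and $f$ are fixed. Only the linear term is handled by Bernstein's inequality; the degenerate term is the one controlled by the Houdré and Reynaud-Bouret bound.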

By Hölder s iequality ad combiig with the last display we have E [ gx i X j a i X i b j X j =E [ b j X j E { gx i X j a i X i X j Thereore we urther have E [ b j X j E { gx i X j Xj 1/E { gxi X j a i X i Xj 1/ 4B 1/ E [ b j X j E { gx i X j a i X i X j 1/ 4B 1/ E [ b j X j 1/ E [ gxi X j a i X i 1/ =4B 1/ E [ b j X j 1/ E [ ai X i E { gx i X j X i 1/ 4B E [ a i X i 1/ E [ bj X j 1/. D 4B 4B 4B. i 1 { [ E ai X i 1/ [ E bj X j 1/ i= j=1 i 1 i= j=1 1{ [ E ai X i + E [ b j X j Combiig these upper bouds o costats with A.4 we complete the proo. A.1 Proo o Corollary 3. Corollary A. Corollary 3.. Suppose Assumptios 4-6 ad 8-9 are satisied. 1 Assume Assumptio 7 holds ad that A.5 is satisied with q = 305s ad t = /16. The we have P δ L h κ lm l 4 or all { R p : S c 1 3 S 1 1 5.77 exp c log p exp c where c > 1 is a absolute costat ad c = κ l M l 64κ lm l /6 {3M κ x+m M K C 0 κ x Mκ x. Assume Assumptio 13 holds ad that A.5 holds with q = 305{s+ζ h γ / logp ad t = /16. The we have P δ L h κ lm l 4 or all C S 1 5.77 exp c log p exp c where C S = { v R p : v J c 1 3 v J 1 or some J [p ad J s + ζ h γ / logp c > 1 is a absolute costat ad c = κ l M l 64κ lm l /6 {3M κ x + M MK C 0 κ x Mκ x. Proo. 1 Deote C S = { v R p : v S c 1 3 v S 1. By Lemma 13 i Rudelso ad Zhou 013 C S S p 1 cov J d E J S p 1 where cov meas covex hull o a set E J = spa { e j : 6

j J ad d = 305s. Deote Γ = For ay v C S S p 1 we have Σ h = E W h X XT K h 1 { 1 Wij K Xij XT h i<j h ij Σ h [ Σ 0 = E X XT W = 0 0. W v T Γv 4 max v T Γv v cov J d E J S p 1 = 4 max v J d E J S p 1 v T Γv = 4 Γ d where the secod lie is because maximum o v T Γv occurs at extreme poits o set cov J d E J S p 1. Apply Theorem 3.7 with q = d = 305s ad t = /16 whe A.5 is satisied we have v T Γv κ lm l A.43 4 holds simultaeously or all v C S S p 1 with probability at least 1 5.77 exp c log p exp c where c > 1 is some absolute costat ad c = κ l M l 64κ lm l /[65536{M κ x + M MK C 0 κ x + M κ x Mκ x. A.43 urther implies that δ L v h v T Σ h v /4 where v T Σ h v v T Σ 0 v MM K E [ X T v h v MM K κ x v h v / = /. A.44 Thereore δ L v h /4 holds simultaeously or all v C S S p 1 with probability at least 1 5.77 exp c log p exp c. By liearity o δ L v h this completes the proo or 1. Usig a idetical argumet as used i 1 replacig C S by set { v R p : v J c 1 3 v J 1 or some J [p ad J s + ζ h γ / logp ad usig d = 305{s + ζ h γ / logp istead we complete the proo or. A.13 Proo o Lemma 4.1 Lemma A.10 Lemma 4.1. Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We urther assume that u satisies Assumptio 17 ad take c ad c < 3ɛ/4 + 1/ to be positive absolute costats. We take ξ = 1 + c / + ɛ ad suppose we have { [{16c > max + 3 c + 1C0M u /+ɛ κ 1/3 ξ x 1 log p /3 ξ {logp 5/3 4ξ 7

The uder Assumptios 5 6 9 ad 17 we have P { max Uk E[U k C{logp/ 1/ 4.77 exp c log p + exp c log k [p where C = C 1 M 1/ K M 1/ Mu 1/+ɛ κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c + 8M Mu 1/+ɛ κ x c 1/ + C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K1 1 Here M = M + MM K C 0 ad C 1... C 4 are as deied i 3.4. Proo. We apply trucatio o X ijk ad ũ i at levels τ ad θ / respectively ad irst ocus o U-statistic 1 1 Wij Ũ k = K Xijk ũ ij 1IA kij B i B j h h where we deote evets We also deote evets A kij = { X ijk τ Bi = { u i E[u θ /. A k[ = { X ijk τ i < j [ B [ = { u i E[u θ / i [. Deote gd i D j = 1 Wij K Xijk ũ ij 1IA kij B i B j ad D i = E [ gd i D j D i. h h We complete the proo i two steps. Step I. We boud B g B E [ D σ ad B as i Lemma 3.4 ad apply Lemma 3.4. For boudig B g we have B g M K τ θ /h. For boudig B apply Lemma 3.3 o ϕ = 1 with lemma coditios satisied by 5 ad 6 ad we have [ E 1 W1 W B τ θ K W 1 M τ θ h h where M = M + MM K C 0. For boudig σ we have σ = E [ gd 1 D M [ K 1 Wij E K Xijk ũ ij h h h M K [ E X h ijk ũ ij Wij = 0 M + MM K C 0 E [ X ijk ũ ij M K M M /+ɛ u κ x/h where the irst iequality is due to K [0 1 the secod iequality is by applyig Lemma 3. o Z = X ijk ũ ij with lemma assumptios satisied by Assumptios 5 ad 6 ad the last iequality is by Assumptios 9 10 ad idepedece o X ijk ad ũ ij. For boudig E [ D apply Lemma 3.3 o ϕ = X ijk ũ ij 1IA kij B i B j with lemma assumptios satisied by Assumptios 5 ad 6 ad we have D 1 D MM K C 0 D 8

where 1 D = E [ X1k ũ 1 1IA k1 B 1 B W 1 = W D W1 W D = E [ X 1k ũ 1 1IA k1 B 1 B D. We have by Assumptios 9 10 ad idepedece o X ijk ad ũ ij E [ 1 D E [ X 1k ũ 1 W 1 = W M MMu /+ɛ κ x This urther implies that E [ D E[ X 1kũ 1 M /+ɛ u κ x. E[D E[ 1 D + M M KC 0E[ D 4M M /+ɛ u κ x. For boudig B we have B = sup E [ gd 1 D D D M [ K 1 W1 W sup E K X 1k X k u 1 u 1IA k1 B 1 B D h D h h τ M K M θ. h We take or some positive absolute costat c > 1 t = 8M M 1/+ɛ u κ x c 1/ {logp/ 1/ τ = max { c 1/ {logp 1/ θ = α 0 < α < 3/4 c u = c log p ad we have that { [{16c > max 3 c + 1C0M u /+ɛ κ 1/3 α x 1 log p /3 α {logp 5/3 4α. The by Lemma 3.4 we have { 1 P Ũk E[Ũk A{logp/ 1/ exp c logp +.77 exp c log p where with C 1... C 4 deied i 3.4 A = C 1 M 1/ K M 1/ M 1/+ɛ 4.77 exp c log p u κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c +C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K 1 1 + 8M M 1/+ɛ u κ x c 1/. 9

Step II. We have E[Ũk = 0 ad thus we have P { max Uk E[U k A{logp/ 1/ k [p P { max Uk E[U k A{logp/ 1/ B [ + PB c [ k [p p { P Uk > A{logp/ 1/ A k[ B [ + PA c k[ + PB[ c k=1 p { P Ũ > A{logp/ 1/ A k[ B [ + PA c k[ + PB[ c k=1 4.77 exp c log p + log p + E[ ũ +ɛ α+ɛ 4.77 exp c log p + log p + exp c log. The last iequality holds i we take c + 1/ + ɛ < 3/4 ad we take α = c + 1/ + ɛ. This completes the proo. A.14 Proo o Corollary 4.1 Assume h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume h C 0 or positive costat C 0. We urther assume that u satisies Assumptio 17 ad take c ad c < 3ɛ/4 + 1/ to be positive absolute costats. We take ξ = 1 + c / + ɛ ad suppose we have { [{16c > max + 3 c + 1C0M u /+ɛ κ 1/3 ξ x 1 log p /3 ξ {logp 5/3 4ξ 64c + c + 1{logp 3 /3 3 48 6M K κ xq 10 K 1 p{logp 1/ 6 6M κ xq /3 144κ 4 x p K1 p logp [ 11 6 3 + c 1/ C 1 M 1/ K 1/ K M 1/ κ x 4/3 q 4/3 {logp 1/3 1 [ 8 6 0 + 7.5c + cc M κ 1/ x q 1/ logp [ 8 6c + 3/ C 3 {144 + c M K M κ 4 xk1 1 + 19M κ4 x + 8M κ 4 x 1/ 4 5 q 4 5 {logp 9 5 [ 10 6 6 + c 3 C 4 κ x K 1 /3 q /3 {logp 5/3 11 6 0 + 7.5cc + M κ x q{logp 0 {3M κ x + M M K C 0 κ x Mκ x 16 q log 4 K1 M MK κ x logp 6 3q 0 + 7.5cM κ x logp 6ep q A.45 30

where q is to be determied i speciic cases. Deote M = M + MM K C 0 ad C 1... C 4 are as deied i 3.4. Also deote c to be some positive absolute costat ad A = C 1 M 1/ K M 1/ M 1/+ɛ u κ x c 1/ K 1/ 1 + C M c + 1/ 1/ c+ C 3 M 1/ K M 1/ c + 1/ c 3/ K 1/ 1 + C 4 M K c + 1/ c K 1 1 + 8M M 1/+ɛ u κ x c 1/ c =κ l M l 64κ lm l /6 {3M κ x + M M KC 0κ x Mκ x. Theorem A.11 Corollary 4.11. Assume λ 4A + A {logp/ 1/ + 8κ xm ζh. Further assume A.45 holds with q = 305s. The uder Assumptios 4-9 11 1 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 10.54 exp c log p exp c log exp c ɛ p. Proo. See Proo o Theorem A.1. Theorem A.1. [Corollary 4.1 Assume that λ 4A + A{logp/ 1/ + 8κ xm ζh γ. Further assume A.45 holds with q = 305s. The uder Assumptios 4-9 11 1 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 10.54 exp c log p exp c log exp c ɛ p. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by usig results A.4 A.6 ad applyig Lemma 4.1. We veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. Theorem A.13 Corollary 4.13. Deote C to be some positive absolute costat C > ζ C γ 0 ad suppose C ζ C γ 0 s logp. Assume that λ 4A + A + Mη {logp/ 1/ + 8MM K C 1/ κ xh. Further assume that A.45 holds with q = 305{s + ζ h γ / logp. The uder Assumptios 4-6 8-9 11-13 ad 17 we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1 17.31 exp c log p exp c log exp c ɛ p. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by usig results A.4 A.6 ad applyig Lemma 4.1. We veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. 31

Corollary A.3 Corollary 4.14. Deote ad τ 1 = + c 1/ κ x K 1 1 BM KC a 0 + DM K τ = + c 1/ κ x {BM K M1 + C 0 C a 0 + DM τ 3 = 4M KM BC a 0 + D 1 + C 0 κ x τ 4 = { 4B MM K κ x1 + C 0 C a γ 1 0 + D 1M κ 4 x 1/ E 1/ C 1/ γ 1 0 τ 5 = 4 + cκ x{bmm K 1 + C 0 C a 0 + D M M K K 1 1 MK K γ 1 1 A =4τ 1/ 3 1 + c 1/ + C 1 τ 1/ 4 1 + c 1/ + C τ 1 + c + C 3 τ 1/ 5 1 + c 3/ + C 4 τ 1 1 + c + 4M BC a 0 + D c + κ x where γ 1 = mi { a 1 1/. Cosider lower boud o { > max 64c + c + 1τ τ3 1 {logp4 {logp 5/3. A.46 Here B D E ad a are to be speciied i dieret cases. 1 Assume that g is L α-hölder or α 1 ad g has bouded support whe α > 1. Suppose A.45 holds with q = 305s ad that A.46 holds with B = L α where L α is the Lipschitz costat or g L α = L whe = 1 D = E = 0 a = 1. Further assume that λ 4A + A {logp/ 1/ + 8κ xm ζh where { L ζ = max 4 α MM K + MM K Eũ / 1/ 16κ x M + MM K C 0 1/ L αmm K. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. Assume that Assumptio 15 holds with α 0 1. Suppose that A.45 holds with q = 305s ad that A.46 holds with B = M g D = M d E = M a ad a = α. Assume λ 4A + A {logp/ 1/ + 8κ xm ζh γ where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. 3 Assume that Assumptio 15 holds with α [1/4 1. Suppose that A.45 holds with 3

q = 305{s + ζ h γ / logp ad that A.46 holds with B = M g D = M d E = M a ad a = α. Further assume λ 4A + A + Mη {logp/ 1/ + 8MM K Cκ xh where { M g MM K C α γ 0 + Md ζ = max 4 M ac 1 γ 0 + MM K Eũ C γ 0 / 1/ 16κ x M + MM K C0 1/ Mg MM K C α γ 0 + Md M ac 1 γ 0 1/ γ = α i M d M a = 0 ad γ = mi { α 1/ i otherwise. The uder Assumptios 4-6 7 8-9 14 ad 17 we have β h β 88sλ s logp { 88λ Ml + + κ l Ml κ l logp + ζ h γ with probability at least 1.58 exp c log p exp c log exp c. Proo. The result ollows directly rom Corollary 4.11-3. Theorem A.14 Corollary 4.15. Assume that A.45 holds with q = 305s. Assume urther that > 64c + c + 1{logp 4 ad λ 4A + A {logp/ 1/ + 4 M g M K Mκ x 1 + C 0 h where A =8MM K M g C 0 1 + C 0 κ x 1 + c 1/ + C 1 M g M 1/ M 3/ K κ1/ x 1 + C 0 1/ C 5/4 0 K 1/4 1 1 + c 1/ + C MM K M g 1 + C 0 κ x K 1 1 + c 3/ + 4C 3 MM 3/ K M 1/ g 1 + C 0 1/ C 1/ 0 κ x 1 + c + C 4 M K M g C 0 κ x K 1 1 1 + c5/ + MM K M g 1 + C 0 C 0 The uder Assumptios 4-9 16 ad 17 we have β h β 88sλ Ml κ l with probability at least 1 15.81 exp c log p exp c log exp c. Proo. We adopt the ramework as described i Sectio or θ = β Γ 0 θ = L 0 β Γ θ h = L β h Γ h θ = E L β h ad take θ h = β which yields s s ad ρ = 0. We veriy Assumptio by usig results A.19 A.1 A. A.3 ad applyig Lemma 4.1. We veriy Assumptio 3 by applyig Corollary 3.. We complete the proo by Theorem.1. A.15 Supportig lemmas Lemma A.15. Assumptio 5 implies that or ay 0 < a < 3 ad 0 < b < 1 we have + w a Kw dw M K ad sup w b Kw M K. w R Proo o Lemma A.15. For ay 0 < a < 3 we have + { w a + a/3 Kw dw w 3 a/3 Kw dw M K M K 33

where the irst iequality is by Hölder s iequality the secod is by Assumptio 5 ad that a > 0 ad the last is by the act that 0 < a < 3 ad the choice o M K 1. For ay 0 < b < 1 ad ay w R we have w b Kw = { w Kw b Kw 1 b M b KM 1 b K = M K where the irst iequality is by Assumptio 5 ad that 0 < b < 1. Thereore we have obtaied that sup w R w b Kw M K. This completes the proo. Lemma A.16. Assumptio 6 implies that or ay X-measurable uctio ψ : R p R m mappig to a m-dimesioal real space we have { W ψ w z X w W ψ w z X W w w M. A.47 w W sup wz Proo o Lemma A.16. For a uctio F we write df x/dx = F x+ F x where F x+ ad F x are right ad let limits respectively whe F x is discotiuous at x. We irst show that sup wx { W Xw x w W Xw x M. We have FW1 X F W w = 1 =x +xw + w df X 1 x dx x =x +x df W X =x w df X x X=x dfx1. x dx x =x +x df X x By domiated covergece theorem we have W1 W Xw X x = 1 w + w x + x df X 1 x dx x =x +x df W X =x w df X x dfx1 M x dx x =x +x df X x ad W Xw x W1 X 1 w +wx +x df X1 x w dx = x =x +x df W X =x w df X x w dfx1 M. x dx x =x +x df X x Based o the same argumet we have w = F W F W w df X=x Xx which by domiated covergece theorem implies that w = W W Xw x df Xx M ad W w W = Xw x df Xx M w w Also or ay X-measurable uctio ψ we have v 1I{ψx vf W w df X=x Xx v=z F W ψ w =. X=z v 1I{ψx v df Xx v=z 34

By domiated covergece theorem we have v 1I{ψx v W W ψ w z = Xw x df Xx v=z M X v 1I{ψx v df Xx v=z ad W ψ w z X v 1I{ψx v W Xwx w df Xx v=z = M. w 1I{ψx v df Xx v=z Thereore Assumptio 6 implies A.47 v Lemma A.17. Assumptio 9 implies coditioal o W = 0 ad ucoditioally X v is meazero subgaussia with parameter at most κ x v or ay v Rp. Assumptio 10 implies that ũ is mea-zero subgaussia with parameter at most κ u. Proo o Lemma A.17. Observe that X T v ad X T v are idetically distributed ad thus we have E[ X T v = 0. We have that the momet geeratig uctio o X T v is E [ e t X T v = E [ e t X 1 Tv E[XT 1 v [ E e t X Tv+E[XT v e t κ x v where the irst iequality is because X 1 ad X are i.i.d. ad the secod is a applicatio o Assumptio 9. Thereore XT v is mea-zero subgaussia with parameter at most κ x v. Observe that coditioal o W = 0 XT v ad X T v are idetically distributed ad thus we have E[ X T v W = 0. We have that the momet geeratig uctio o X T v coditioal o W = 0 is E [ e t X T v [ W = 0 = E E e t X T v W1 = W W = E [ E { e t X1 Tv E[XT 1 v W 1=W W1 { = W E e t X Tv+E[XT W v W e t κ x v where the secod iequality is because X 1 W 1 ad X W are i.i.d. ad the third is a applicatio o Assumptio 9. Thereore coditioal o W = 0 XT v is mea-zero subgaussia with parameter at most κ x v. Apply the same argumet o u we complete the proo. I our proos we used the ollowig results rom Vershyi 01. Lemma A.18. For mea-zero subgaussia radom variable V with parameter at most κ v we have E[V κ v E[V 4 3κ 4 v PV E[V v 1 exp{ v/κ v or ay v κ v ad that E[e sv se[v e s κ 4 v or s κ v 1. Lemma A.19. Let Z be some subgaussia radom variable with parameter at most κ z. Suppose κ z a/4 or some a > 0. The we have a z df Z z a + 4κ z exp{ a/4κ z. Proo o Lemma A.19. We have F Z z PZ E[Z z/ 1 exp{ z/4κ z or ay 35

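$z \ge a \ge 4\kappa_z^2$ (Lemma A.18). By integration by parts, we have

$$\int_a^\infty z \, dF_{Z^2}(z) = -\int_a^\infty z \, d\{1 - F_{Z^2}(z)\} = -z\{1 - F_{Z^2}(z)\}\Big|_a^\infty + \int_a^\infty \{1 - F_{Z^2}(z)\} \, dz \le a \exp\{-a/(4\kappa_z^2)\} + \int_a^\infty \exp\{-z/(4\kappa_z^2)\} \, dz = (a + 4\kappa_z^2) \exp\{-a/(4\kappa_z^2)\}.$$

This completes the proof.

Proof of Lemma 3.2. By Taylor's expansion, for some $t_{wh} \in [0, 1]$ we have

$$E\Big[\frac{1}{h} K\Big(\frac{W}{h}\Big) Z\Big] = \iint K(w) \, z \, f_{W|Z}(wh, z) \, dw \, dF_Z(z) = \iint K(w) \, z \, \Big\{ f_{W|Z}(0, z) + \frac{\partial f_{W|Z}(w, z)}{\partial w}\Big|_{(t_{wh} wh, \, z)} wh \Big\} \, dw \, dF_Z(z),$$

which implies that

$$\Big| E\Big[\frac{1}{h} K\Big(\frac{W}{h}\Big) Z\Big] - E[Z \mid W = 0] \, f_W(0) \Big| \le M_1 M_2 E[|Z|] \, h.$$

This completes the proof.

Proof of Lemma 3.3. By Taylor's expansion, for some $t_{wh} \in [0, 1]$ we have

$$E\Big[\frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) \varphi(Z_1, Z_2) \,\Big|\, W_2, Z_2\Big] = \iint \frac{1}{h} K\Big(\frac{w - W_2}{h}\Big) \varphi(z, Z_2) \, f_{W_1|Z_1}(w, z) \, dw \, dF_{Z_1}(z)$$
$$= \iint K(w) \, \varphi(z, Z_2) \, f_{W_1|Z_1}(W_2 + wh, z) \, dw \, dF_{Z_1}(z)$$
$$= \iint K(w) \, \varphi(z, Z_2) \, \Big\{ f_{W_1|Z_1}(W_2, z) + \frac{\partial f_{W_1|Z_1}(w, z)}{\partial w}\Big|_{(W_2 + t_{wh} wh, \, z)} wh \Big\} \, dw \, dF_{Z_1}(z),$$

which implies that

$$\Big| E\Big[\frac{1}{h} K\Big(\frac{W_1 - W_2}{h}\Big) \varphi(Z_1, Z_2) \,\Big|\, W_2, Z_2\Big] - E[\varphi(Z_1, Z_2) \mid W_2, Z_2, W_1 = W_2] \, f_{W_1}(W_2) \Big| \le M_1 M_2 E[|\varphi(Z_1, Z_2)| \mid Z_2] \, h.$$

This completes the proof.

Lemma A.20. Let $D_i = (X_i, V_i, W_i)$ be i.i.d. for $i = 1, \ldots, n$, and let $K(\cdot)$ be a positive kernel function such that $\int K(w) \, dw = 1$ and $\max\{ \int_{-\infty}^{+\infty} |w| K(w) \, dw, \ \sup_{w \in \mathbb{R}} K(w) \} \le M_K$ for a positive absolute constant $M_K$. Assume that, conditional on $W_i = w$ for any $w$ in the range of $W_i$ and unconditionally, $X_i$ and $V_i$ are subgaussian with parameters at most $\kappa_x$ and $\kappa_v$, respectively, for positive absolute constants $\kappa_x$ and $\kappa_v$. Assume that there exists a positive absolute constant $M$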

such that { W XV w x v max w W XV w x v M or ay w x v R such that the desities are deied. Take h K 1 {logp/ 1/ or positive absolute costat K 1 ad assume that h C 0 or positive costat C 0. Suppose > max { 64c+ c + 1{logp 3 /3 3 or positive absolute costat c. Cosider U-statistic U = i<j { 1 h K Wi W j X i X j V i V j h. The we have where { 1 P U E[U { logp 1/ C 6.77 exp{ c + 1 log p C ={16 31 + c 1/ M + 4 3C 1 1 + c 1/ M 1/ K 1/ 1 + 8C 1 + c + 8C 3 1 + c 3/ M 1/ K M 1/ K 1/ 1 + 8C 4 1 + c M K K 1 1 + 8M c + κ x κ v with C 1... C 4 as deied i 3.4 ad M = M + MM K C 0. A.48 Proo o Lemma A.0. Deote Z ij = X i X j V i V j. We apply trucatio to X i X j at level Cx logp ad to V i V j at level Cy logp or some positive absolute costats C x ad C v. Deote A [ = { X i X j Cx logp V i V j C v logp i j [ i < j ad irst ocus o U-statistic Ũ = i<j h K Wi W j Z ij 1I{X i X j Cx logp V i V j C v logp. h Deote gd i D j = 1 Wi W j K Z ij 1I{X i X j Cx logp V i V j C v logp h h ad D i = E [ gd i D j D i. Assume h K 1 {logp/ 1/ or some positive absolute costat K 1. Deote X = X 1 X Ṽ = V 1 V ad W = W 1 W. Note that by argumet o Lemma A.16 we have all the ecessary smooth coditios o desities. Deote C = C x C v ad ote that X i X j Cx logp V i V j C v logp implies that Z ij C logp. Step I. We boud B g B E [ D σ ad B as i Lemma 3.4 ad apply Lemma 3.4. We have B g CM K logp/h CM K /K 1 { logp 1/. For B apply Lemma 3.3 o ϕ = 1 ad with M 1 = M M = M K ad we have Wi W j B C logp E K W j h h C logp{ W W j + MM K C 0 CM logp where M = M + M K MC 0 ad the last iequality used the act that W W j [0 M. 37
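The bound on $B_f$ above rests on the kernel localization estimate of Lemma 3.3 with $\varphi = 1$: the smoothed density $E[h^{-1} K((W_1 - W_2)/h) \mid W_2]$ differs from $f_W(W_2)$ by at most $M_1 M_2 h$, where $M_1$ bounds the derivative of the density and $M_2$ bounds $\int |w| K(w) \, dw$. The sketch below checks this in a case with a closed form. It assumes, purely for illustration, that $W$ is standard normal and $K$ is the Gaussian kernel, so the smoothed density is the $N(0, 1 + h^2)$ density. These distributional choices and all function names are ours, not the paper's.

```python
import math


def normal_pdf(x, var=1.0):
    """Density of N(0, var) evaluated at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def smoothed_density(w0, h):
    """E[h^{-1} K((W - w0) / h)] for W ~ N(0, 1) and Gaussian K.

    The convolution of two centered Gaussians is Gaussian, so this equals
    the N(0, 1 + h^2) density at w0 (illustrative closed form).
    """
    return normal_pdf(w0, 1.0 + h * h)


# M1 = sup_w |f_W'(w)| for the standard normal density, attained at |w| = 1.
M1 = 1.0 / math.sqrt(2.0 * math.pi * math.e)
# M2 = integral of |w| K(w) dw for the Gaussian kernel.
M2 = math.sqrt(2.0 / math.pi)


def bias_within_bound(w0, h):
    """Check |E[h^{-1} K((W - w0) / h)] - f_W(w0)| <= M1 * M2 * h."""
    bias = abs(smoothed_density(w0, h) - normal_pdf(w0))
    return bias <= M1 * M2 * h


checks = [bias_within_bound(w0, h)
          for w0 in (-2.0, -0.5, 0.0, 1.0, 2.0)
          for h in (0.05, 0.1, 0.5, 1.0)]
```

In this symmetric-kernel example the actual bias is of order $h^2$, so the first-order bound $M_1 M_2 h$ is conservative. The lemma only needs a first-order Taylor expansion and makes no use of kernel symmetry.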

For boudig E [ D apply Lemma 3.3 o ϕ = Z ij 1I { X i X j C x logp V i V j C v logp ad with M 1 = M M = M K ad the we have where Thereore we have ad meawhile D 1 D M K M D h 1 D E [ Z 1 1I Z 1 C logp W 1 = W D W1 W D E [ Z 1 1I Z 1 C logp D. E [ D = E [ {D 1 D + 1 D M KM C 0E [ D + E [ 1 D E [ 1 D M E[Z 1 M E[ X 4 1/ E[Ṽ 4 1/ 1M κ xκ v ad E [ D E[Z 1 E[ X 4 1/ E[Ṽ 4 1/ 1κ xκ v. A.49 A.50 where the irst iequalities are by Jese s iequality the secod are by Cauchy-Schwarz iequality ad the third are due to the act that E[ X 4 1κ x E[Ṽ 4 1κ v Lemma A.18. Combiig A.49 ad A.50 we have E [ D M k M C 0 + M 4κ xκ v < 4M κ xκ v. A.51 For boudig σ apply Lemma 3. o Z = Zij ad with M 1 = M M = M K ad the we have E [ gd i D j M [ K 1 Wi W j E K Zij h h h M K { E[Z h ij W i = W j M + MM K C 0 E[Zij M K h {E[ X 4 W = 0 1/ E[Ṽ 4 W = 0 1/ M + MM K C 0 E[ X 4 1/ E[Ṽ 4 1/ M K {1κ h xκ vm + 1κ xκ vmm K C 0 1κ xκ vm K M 1/ K 1 logp where the third iequality is by Cauchy-Schwarz iequality ad the ourth is due to subgaussiaity o X ad Ṽ both coditioal o W = 0 ad ucoditioally. For boudig B we have B = sup E [ gd 1 D D D M K h sup E D K h M K h C logp E h K C M M K K 1 { logp 3/ W1 W Z1 1I { Z 1 C logp D h W1 W h D 38
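The truncated U-statistic analysed in this proof can be computed directly from data. The following is a hedged sketch rather than the authors' implementation. It assumes scalar $X_i$, $V_i$, $W_i$, a Gaussian kernel (any positive kernel integrating to one could be substituted), normalization by $\binom{n}{2}$, and generic truncation thresholds `tx` and `tv` standing in for the levels used in the proof. All names are hypothetical.

```python
import math
from itertools import combinations


def gaussian_kernel(w):
    """Standard Gaussian kernel; an illustrative choice of positive kernel."""
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)


def truncated_u_statistic(x, v, w, h, tx=math.inf, tv=math.inf,
                          kernel=gaussian_kernel):
    """Kernel-weighted pairwise-difference U-statistic with truncation.

    Averages h^{-1} K((W_i - W_j) / h) (X_i - X_j) (V_i - V_j) over pairs i < j,
    keeping only pairs with |X_i - X_j| <= tx and |V_i - V_j| <= tv.
    With tx = tv = inf this is the untruncated statistic.
    """
    n = len(x)
    total = 0.0
    for i, j in combinations(range(n), 2):
        dx = x[i] - x[j]
        dv = v[i] - v[j]
        if abs(dx) <= tx and abs(dv) <= tv:
            total += kernel((w[i] - w[j]) / h) / h * dx * dv
    return total / math.comb(n, 2)


# Tiny worked example: three observations sharing the same W.
x = [0.0, 1.0, 2.0]
v = [0.0, 1.0, 2.0]
w = [0.0, 0.0, 0.0]
u_trunc = truncated_u_statistic(x, v, w, h=1.0, tx=1.5, tv=1.5)
u_full = truncated_u_statistic(x, v, w, h=1.0)
```

In the example the pair (0, 2) has $|X_0 - X_2| = 2 > 1.5$ and is dropped by the truncation, so `u_trunc` averages two unit products while `u_full` also includes the product 4. The double loop costs $O(n^2)$ kernel evaluations, which is adequate for a sketch but would need vectorization for large $n$.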