Estimation of integrated volatility of volatility with applications to goodness-of-fit testing

Bernoulli 21(4), 2015, 2393–2418
DOI: 10.3150/14-BEJ648
arXiv:1206.5761v2 [math.ST] 29 Sep 2015

MATHIAS VETTER
Fakultät für Mathematik, Ruhr-Universität Bochum, 44780 Bochum, Germany. E-mail: mathias.vetter@rub.de

In this paper, we are concerned with nonparametric inference on the volatility of volatility process in stochastic volatility models. We construct several estimators for its integrated version in a high-frequency setting, all based on increments of spot volatility estimators. Some of those are positive by construction, others are bias corrected in order to attain the optimal rate $n^{-1/4}$. Associated central limit theorems are proven which can be widely used in practice, as they are the key to essentially all tools in model validation for stochastic volatility models. As an illustration we give a brief idea on a goodness-of-fit test in order to check for a certain parametric form of volatility of volatility.

Keywords: central limit theorem; goodness-of-fit testing; high-frequency observations; model validation; stable convergence; stochastic volatility model

1. Introduction

Nowadays, stochastic volatility models are standard tools in the continuous-time modelling of financial time series. Typically, the underlying (log) price process is assumed to follow a diffusion process of the form

$$X_t = X_0 + \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dW_s, \tag{1.1}$$

where $\mu$ and $\sigma$ can be quite general stochastic processes themselves. A classical case is where the volatility $\sigma_s^2 = \sigma^2(s, X_s)$ is a function of time and state, a situation referred to as a local volatility model. It has turned out in empirical finance that such models do not fit the data very well, as some stylised facts such as the leverage effect or volatility clustering cannot be explained using local volatility only. Stochastic volatility models, however, are able to reproduce such features, as they bear an additional source of randomness.
This is an electronic reprint of the original article published by the ISI/BS in Bernoulli, 2015, Vol. 21, No. 4, 2393–2418. This reprint differs from the original in pagination and typographic detail. 1350-7265 © 2015 ISI/BS

In these models, the volatility process is a diffusion process itself, and we focus on a rather general situation, namely

$$\sigma_t^2 = \sigma_0^2 + \int_0^t \nu_s\,ds + \int_0^t \beta_s\,dW_s + \int_0^t \eta_s\,dW'_s, \tag{1.2}$$

where $\nu$, $\beta$ and $\eta$ again are suitable stochastic processes and $W'$ is another Brownian motion, independent of $W$. This model obviously includes the widely used special case of a volatility with only one driving Brownian motion, which is $d\sigma_t^2 = \nu_t\,dt + \tau_t\,dV_t$, where $V$ and $W$ are jointly Brownian with some correlation $\rho$.

Stochastic volatility models are typically parametric ones, and probably the prime example among those is the Heston model of [14], given by

$$X_t = X_0 + \int_0^t \Big(\beta - \frac{\sigma_s^2}{2}\Big)\,ds + \int_0^t \sigma_s\,dW_s, \qquad \sigma_t^2 = \sigma_0^2 + \kappa \int_0^t (\alpha - \sigma_s^2)\,ds + \xi \int_0^t \sigma_s\,dV_s,$$

for some parameters $\beta$, $\kappa$, $\alpha$ and $\xi$, and with $\mathrm{Corr}(W,V) = \rho$. Here, the volatility process follows a Cox–Ingersoll–Ross model, that is, it is mean-reverting with mean $\alpha$ and speed $\kappa$, and both diffusion coefficients are proportional with parameter $\xi$. Particularly the latter property appears to be rather typical for stochastic volatility models, and in this sense the Heston model can be regarded as prototypic. Popular alternatives come, for example, from the more general (but again parametric) class of (one-factor) CEV models, where the diffusion coefficient $\tau$ of $\sigma^2$ becomes a general power function of $\sigma$, whereas the drift part of the volatility remains in principle the same. See [21] for a survey.

For this reason, statistical inference for stochastic volatility models has mostly focused on parametric methods, and usually the authors provide tools for a specific class of models. However, one is faced with two severe problems: First, it is in most cases impossible to assess the distribution of $X$ (or its increments), which makes standard maximum likelihood theory unavailable. Second, the volatility process $\sigma^2$ is not observable, and many statistical concepts have in common that they propose to reproduce the unknown volatility process from observed option prices, typically by using proxies based on implied volatility. A survey on early estimation methods in this context can be found in [8].
One remarkable exception where only stock price data is used is the paper of [7], who construct a GMM estimator for the parameters of the Heston model from increments of realised variance. But also in a general setting with no specific model in mind, the focus has been on parametric approaches. An early approach to parameter estimation when $\sigma^2$ is ergodic is the work of [12], optimal rates are discussed in [15] and [13], and a maximum likelihood approach based on proxies for the volatility can be found in [1]. Even nonparametric concepts have been used to identify parameters of a stochastic volatility model; see, for example, [3] or [25].

Genuine nonparametric inference for stochastic volatility models has typically focused on function estimation. Both [24] and [9] discuss techniques for the estimation of $f$ and $g$, when the volatility process satisfies $d\sigma_t^2 = f(\sigma_t^2)\,dt + g(\sigma_t^2)\,dV_t$. In the more general model-free context of (1.2), only [4] and [28] have discussed estimation of functionals of volatility of volatility. While the latter focus on estimation of a kind of leverage effect which involves the volatility of volatility process(es), the work of [4] provides a consistent estimator for integrated volatility of volatility $\int_0^t \tau_s^2\,ds$ in the one-factor case. Their approach is inspired by the asymptotic behaviour of realised variance, which states that the sum of squared increments of $\sigma^2$ converges in probability to the quantity of interest. Since $\sigma^2$ is not observable, the authors use spot volatility estimators instead.

We will pursue their approach and discuss in detail the asymptotic behaviour of several estimators for integrated volatility of volatility, all based on increments of spot volatility estimators, thus using observations of $X$ only. It turns out that in order to attain the optimal rate of convergence in this context, it is necessary to conduct a certain bias correction which destroys positivity of the estimator, a feature which is well known from the related problem of volatility estimation under microstructure noise. Several stable central limit theorems are provided, and by defining appropriate estimators for the asymptotic (conditional) variance we obtain feasible versions as well.

The latter results are of theoretical interest on the one hand, but are extremely important from an applied point of view as well, as they make model validation for stochastic volatility models possible. Given the tremendous number of such models with entirely different qualitative behaviours, there is a lack of techniques that help in deciding whether a certain model fits the data appropriately or not. As a first approach to model validation in this framework, we give a brief idea on how to do goodness-of-fit testing, but our method is by no means limited to it.
Related procedures can be used to test, for example, whether a Brownian component or jumps are present in the volatility process and what in general the structure of the jump part is. Such problems have been solved for the price process $X$ in recent years (see [18] for an overview), and in principle the methods are all based on the estimation of plain integrated volatility $\int_0^t \sigma_s^2\,ds$ and further quantities, such as truncated versions or bipower variation. Using our main results, these concepts can be translated to the stochastic volatility case by using estimators for integrated volatility of volatility instead, but usually with the slower rate of convergence $n^{-1/4}$.

The paper is organised as follows: In Section 2, we introduce our estimators and state the central limit theorems, whereas Section 3 is on goodness-of-fit testing in stochastic volatility models. Some Monte Carlo results can be found in Section 4, followed by some concluding remarks in Section 5. An overview of some proofs plus a couple of details can be found in the Appendix, whereas large parts of them have been relegated to a supplementary article [26].

2. Main results

Let us start with some conditions on the processes involved. All of these are rather mild and covered by a variety of (stochastic) volatility models used. The only major restriction is that we will assume most processes to be continuous for a while and only discuss briefly later what possible adjustments in order to handle jumps in price and volatility could look like.

Assumption 2.1. Suppose that the process $X$ is given by (1.1), where $W$ is a standard Brownian motion and the drift process $\mu$ is left continuous. We assume further that the volatility process $\sigma^2$ is a continuous Itô semimartingale itself, having the representation (1.2). $\nu$ is assumed to be left continuous as well, whereas $\beta$ satisfies the regularity condition

$$\beta_s^2 = \beta_0^2 + \int_0^s \omega_u\,du + \int_0^s \vartheta_u^{(1)}\,dW_u + \int_0^s \vartheta_u^{(2)}\,dW'_u, \tag{2.1}$$

where $\omega$ is locally bounded and each $\vartheta^{(l)}$ is left continuous, $l = 1, 2$. A similar condition is assumed to hold for $\eta$ as well. Finally, all processes are defined on the same probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$, and all coefficients are specified in such a way that $\sigma^2$ is almost surely positive and that $\beta^2$ and $\eta^2$ are either almost surely positive or vanishing identically, respectively.

As noted in the Introduction, (1.2) covers a large class of volatility models used. For $\eta \equiv 0$ we are essentially in the case of a local volatility model, whereas $\beta_s = \rho\tau_s$ and $\eta_s = \sqrt{1-\rho^2}\,\tau_s$ for some process $\tau$ and $\rho \in (-1,1)$ refers to the setting of the typical stochastic volatility models mentioned before, in which both driving Brownian motions are correlated with $\rho$. The model in (1.2) is even more flexible, and it is straightforward to extend all results to the case of a multi-factor model driven by more than two independent Brownian motions as well.

Our aim in the following is to draw inference on the integrated volatility of volatility up to time $t$, which becomes $\int_0^t (\beta_s^2 + \eta_s^2)\,ds$ in our context. Any statistical inference will be based on high-frequency observations of $X$, and we assume that the data is recorded at equidistant times. Thus, without loss of generality let the process be defined on the interval $[0,1]$ and observed at the time points $i/n$, $i = 0, \ldots, n$. Before we discuss several concepts to assess integrated volatility of volatility in detail, let us recall the principles of estimation of standard integrated volatility $\int_0^1 \sigma_s^2\,ds$.
The usual estimator in the general model-free setting of (1.1) is realised volatility, given by

$$RV_t^n = \sum_{i=1}^{\lfloor nt \rfloor} (\Delta_i^n X)^2,$$

where we set $\Delta_i^n Z = Z_{i/n} - Z_{(i-1)/n}$ for any process $Z$. This estimator is optimal in several respects, even though Itô formula proves

$$(\Delta_i^n X)^2 = \int_{(i-1)/n}^{i/n} \sigma_s^2\,ds + 2\int_{(i-1)/n}^{i/n} (X_s - X_{(i-1)/n})\,dX_s \tag{2.2}$$

only, from which it is simple to see that each squared increment $(\Delta_i^n X)^2$ is only on average equal to integrated volatility over the corresponding time interval, but not consistent for it. (Realised volatility, the sum of the squared increments, however, is consistent for the entire integrated volatility, which is basically due to a martingale argument.)

Our estimators for integrated volatility of volatility will be based on a similar intuition: Define statistics via sums of increments such that each summand is on average equal to integrated volatility of volatility over the corresponding time interval, but not necessarily consistent. As before, one would like to build those estimators upon increments of $\sigma^2$. These are in general not observable, so a proxy for them is needed. Since we are in a model-free world, a natural estimator for spot volatility $\sigma_{i/n}^2$ is given by

$$\hat\sigma_{i/n}^2 = \frac{n}{k_n} \sum_{j=1}^{k_n} (\Delta_{i+j}^n X)^2, \qquad i = 0, \ldots, n - k_n,$$

for some auxiliary (integer-valued) sequence $k_n$. See [2] or [25] for details on the asymptotic behaviour of this estimator. Itô formula again gives

$$\hat\sigma_{i/n}^2 = \frac{n}{k_n} \sum_{j=1}^{k_n} 2\int_{(i+j-1)/n}^{(i+j)/n} (X_s - X_{(i+j-1)/n})\,dX_s + \frac{n}{k_n} \int_{i/n}^{(i+k_n)/n} \sigma_s^2\,ds =: A_i^n + B_i^n, \tag{2.3}$$

so that $\hat\sigma_{i/n}^2 - \sigma_{i/n}^2$ consists of two sources of error. From the proofs later on, we see that $A_i^n = O_p(\sqrt{1/k_n})$, whereas $B_i^n - \sigma_{i/n}^2 = O_p(\sqrt{k_n/n})$. Therefore, it appears natural to choose $k_n$ to be of the order $n^{1/2}$ in order to minimize the error of the spot volatility estimator (and we will see later that this is indeed the best thing to do), but we will keep this sequence arbitrary in order to allow for other estimators as well.

While the choice of the spot volatility estimators depends on the auxiliary sequence $k_n$, we will introduce a second sequence of integers $l_n$ which governs the length of the intervals over which increments of $\hat\sigma^2$ are computed. Thus, the basic element of our final estimators will be $(\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n})^2$, which can be decomposed as

$$(\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n})^2 = (A_{i+l_n}^n - A_i^n)^2 + (B_{i+l_n}^n - B_i^n)^2 + 2(A_{i+l_n}^n - A_i^n)(B_{i+l_n}^n - B_i^n).$$

The average behaviour of the terms above is discussed in the following lemma, and it depends crucially on the size of both $k_n$ and $l_n$.

Lemma 2.2.
Suppose that Assumption 2.1 holds and let $E_i^n[Z]$ denote conditional expectation of some variable $Z$ with respect to $\mathcal{F}_{i/n}$. Set also $M_n = \max(k_n, l_n)$ and $m_n = \min(k_n, l_n)$. Then we have

$$E_i^n[(A_{i+l_n}^n - A_i^n)^2] = 4 l_n (k_n M_n)^{-1} \sigma_{i/n}^4 \big(1 + O_p(m_n^{1/2} n^{-1/2})\big),$$

$$E_i^n[(B_{i+l_n}^n - B_i^n)^2] = l_n m_n (M_n - m_n/3)(k_n M_n n)^{-1} (\beta_{i/n}^2 + \eta_{i/n}^2) \big(1 + O_p(m_n^{1/2} n^{-1/2})\big).$$

The previous lemma gives us several hints on how to obtain an estimator for integrated volatility of volatility via sums over $(\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n})^2$. First, information about $\beta_{i/n}^2 + \eta_{i/n}^2$ is contained in increments over the $B_i^n$ only. Therefore, it appears to be reasonable to choose $k_n$ and $l_n$ later on in such a way that these terms are at least not smaller than the bias terms due to increments of $A_i^n$. Or in other words, the condition becomes that $C k_n l_n \ge n$ for some generic $C > 0$. Also, there are basically two ways to construct an estimator. Either, pick $k_n$ and $l_n$ such that the bias due to increments of $A_i^n$ is negligible even after dividing by the rate of convergence. This concept will lead to the estimator

$$\hat T_t^n = \sum_{i=0}^{\lfloor nt \rfloor - (k_n + l_n)} k_n M_n \big(l_n m_n (M_n - m_n/3)\big)^{-1} \big(\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n}\big)^2,$$

which is positive by construction. As noted in the Introduction, this is the kind of estimator [4] were looking at. Alternatively, one can use a bias correction and subtract an estimator for the local quarticity $\sigma_{i/n}^4$. In this case one loses positivity, but we will see later that the rate of convergence is much faster in this situation.

Let us pursue the first path for a moment, however. In order to understand what the rate of convergence for estimation of integrated volatility of volatility will be, the next result is extremely helpful, as it gives the central limit theorem for the oracle estimator

$$\hat S_t^n = \sum_{i=0}^{\lfloor nt \rfloor - (k_n + l_n)} k_n M_n \big(l_n m_n (M_n - m_n/3)\big)^{-1} (B_{i+l_n}^n - B_i^n)^2,$$

which depends on the unobservable increments of $B_i^n$ only. All results in this section will be pointwise in $t$, even though it is likely that functional versions hold as well.

Proposition 2.3. Suppose that Assumption 2.1 holds and that both $k_n \sim c n^{\alpha}$ and $l_n \sim d n^{\beta}$ hold for some $\alpha, \beta \in (0,1)$ and $c, d > 0$. Let also $M_n$ and $m_n$ be defined as before.

(a) If $\alpha \ne \beta$, we have

$$\sqrt{\frac{n}{M_n}} \Big(\hat S_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds\Big) \xrightarrow{\mathcal{L}-(s)} \sqrt{\frac{4}{3}} \int_0^t (\beta_s^2 + \eta_s^2)\,d\widetilde W_s.$$

(b) For $k_n = l_n$ we have

$$\sqrt{\frac{n}{M_n}} \Big(\hat S_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds\Big) \xrightarrow{\mathcal{L}-(s)} \sqrt{\frac{151}{70}} \int_0^t (\beta_s^2 + \eta_s^2)\,d\widetilde W_s.$$

In both cases, $\widetilde W$ is a Brownian motion defined on an extension of the original probability space and independent of $\mathcal{F}$, and the convergence in (2.6) is $\mathcal{F}$-stable in law.

Remark 2.4. It is obvious from Proposition 2.3 that the rate of convergence becomes faster the smaller $M_n$ is chosen. On the other hand, the condition $C k_n l_n \ge n$ forces $M_n$ to be at least of the order $n^{1/2}$.
In this case, the rate of convergence in Proposition 2.3 becomes $n^{-1/4}$, and this rate is known to be optimal for this statistical problem. Indeed, a related parametric setting has been discussed in [15] a decade ago, and it was shown therein that this rate is optimal in the special case where $\beta$ vanishes identically and $\eta$ is a function of time and state, known up to a parameter $\theta$.

Our first main theorem specifies conditions for a central limit theorem for $\hat T_t^n$ and is a simple consequence of Lemma 2.2 and Proposition 2.3.

Theorem 2.5. Suppose that all the assumptions of Proposition 2.3 hold true. If further $n^{3/2} M_n^{-3/2} m_n^{-1} \to 0$ and $\alpha \ne \beta$, then the stable central limit theorem

$$\sqrt{\frac{n}{M_n}} \Big(\hat T_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds\Big) \xrightarrow{\mathcal{L}-(s)} \sqrt{\frac{4}{3}} \int_0^t (\beta_s^2 + \eta_s^2)\,d\widetilde W_s \tag{2.4}$$

holds true.

The optimal rate of convergence in this case is obtained for the choice of $M_n = O(n^{3/5+\varepsilon})$ and $m_n = O(n^{3/5})$ and approaches $n^{-1/5}$ for $\varepsilon \to 0$. This proves also that it is no restriction to assume $\alpha \ne \beta$ above.

In order to obtain an estimator with the optimal rate of convergence, we choose $l_n$ and $k_n$ to be both the same and of the order $n^{1/2}$, but as noted above we need a bias correction then. Therefore, we define with a slight abuse of notation

$$\hat R_t^n = \sum_{i=0}^{\lfloor nt \rfloor - 2k_n} \Big( \frac{3}{2k_n} \big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^2 - \frac{6}{k_n^2}\,\hat\sigma^4_{i/n} \Big), \tag{2.5}$$

where $\hat\sigma^4_{i/n} = \frac{n^2}{3k_n} \sum_{j=1}^{k_n} (\Delta_{i+j}^n X)^4$ is in general different from $(\hat\sigma^2_{i/n})^2$. Its asymptotic behaviour is discussed in the following theorem.

Theorem 2.6. Suppose that Assumption 2.1 holds and let $k_n = c n^{1/2} + o(n^{1/4})$ for some $c > 0$. Then

$$\sqrt{\frac{n}{k_n}} \Big(\hat R_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds\Big) \xrightarrow{\mathcal{L}-(s)} U_t \tag{2.6}$$

for all $t > 0$, where the limiting variable has the representation

$$U_t = \int_0^t \alpha_s\,d\widetilde W_s, \qquad \alpha_s^2 = \frac{48}{c^4}\sigma_s^8 + \frac{12}{c^2}\sigma_s^4(\beta_s^2 + \eta_s^2) + \frac{151}{70}(\beta_s^2 + \eta_s^2)^2. \tag{2.7}$$

Remark 2.7. The situation encountered above has an interesting connection to the problem of eliminating microstructure noise, as we face similar problems regarding optimal rates of convergence and positivity of the estimators. Whereas the optimal rate of convergence for estimating integrated volatility in the noisy setting is $n^{-1/4}$, standard estimators attaining this rate are not always positive. To ensure positivity, one typically accepts a drop in the rate of convergence to $n^{-1/5}$ as well. See, for example, [6] for a thorough discussion in a general multivariate setting.

Remark 2.8. Recently, [19] discussed efficient estimation of $\int_0^t g(\sigma_s^2)\,ds$ for general functions $g$. It turned out that Riemann sums based on $g(\hat\sigma^2_{i/n})$ indeed attain the optimal rate of convergence $n^{-1/2}$ in this context, but again the choice of $k_n$ affects the limiting distribution. The optimal $k_n \sim n^{1/2}$ leads to additional bias terms in their setting, and at least some of these can be avoided by choosing $k_n$ in a different way.

The limiting distribution in Theorem 2.5 and Theorem 2.6 is mixed normal, and in order to obtain a feasible central limit theorem we have to introduce consistent estimators for the respective conditional variances. These are constructed using the same intuition as before, and precisely we obtain the following theorem.

Theorem 2.9. (a) Under the conditions of Theorem 2.5, we have

$$\hat Q_t^n = \sum_{i=0}^{\lfloor nt \rfloor - (k_n + l_n)} \frac{4 n k_n^2 M_n^2}{9 \big(l_n m_n (M_n - m_n/3)\big)^2} \big(\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n}\big)^4 \xrightarrow{P} \frac{4}{3} \int_0^t (\beta_s^2 + \eta_s^2)^2\,ds.$$

(b) In the situation of Theorem 2.6, we have

$$G_{t,n}^{(1)} = \frac{1}{n} \sum_{i=1}^{\lfloor nt \rfloor - k_n} (\hat\sigma^4_{i/n})^2 \xrightarrow{P} \int_0^t \sigma_s^8\,ds,$$

$$G_{t,n}^{(2)} = \sum_{i=1}^{\lfloor nt \rfloor - 2k_n} \Big( \frac{3}{2k_n} \big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^2 - \frac{6}{k_n^2}\,\hat\sigma^4_{i/n} \Big) \hat\sigma^4_{i/n} \xrightarrow{P} \int_0^t \sigma_s^4 (\beta_s^2 + \eta_s^2)\,ds,$$

$$G_{t,n}^{(3)} = \frac{n}{k_n^2} \sum_{i=1}^{\lfloor nt \rfloor - 2k_n} \big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^4 \xrightarrow{P} \int_0^t \Big( \frac{48}{c^4}\sigma_s^8 + \frac{16}{c^2}\sigma_s^4(\beta_s^2 + \eta_s^2) + \frac{4}{3}(\beta_s^2 + \eta_s^2)^2 \Big)\,ds.$$

Therefore, as a consequence

$$\hat P_t^n = \frac{453}{280}\,G_{t,n}^{(3)} - \frac{486}{35}\,\frac{n}{k_n^2}\,G_{t,n}^{(2)} - \frac{1038}{35}\,\frac{n^2}{k_n^4}\,G_{t,n}^{(1)} \xrightarrow{P} \int_0^t \alpha_s^2\,ds.$$

Remark 2.10. Theorem 2.9 shows that a consistent estimator for $\int_0^t (\beta_s^2 + \eta_s^2)^2\,ds$ is, for example, given by

$$\frac{3}{4}\,G_{t,n}^{(3)} - 12\,\frac{n}{k_n^2}\,G_{t,n}^{(2)} - 36\,\frac{n^2}{k_n^4}\,G_{t,n}^{(1)},$$

and its proof suggests that a central limit theorem holds with the same rate of convergence as before. In general, it is quite likely that this method provides estimates for arbitrary even powers of integrated volatility of volatility. A precise theory is left for future research.
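The bias-corrected statistic in (2.5) is a direct computation on the observed increments. The following is a minimal Python sketch, assuming the path is supplied as a list of $n+1$ equidistant observations on $[0,1]$; the function names and the list-based input are illustrative choices and not part of the paper. As a check on the constants: for $k_n = l_n$, Lemma 2.2 gives conditional means of roughly $4\sigma^4_{i/n}/k_n$ and $2k_n(\beta^2_{i/n} + \eta^2_{i/n})/(3n)$ for the two squared increments, so multiplying by $3/(2k_n)$ leaves $(\beta^2_{i/n} + \eta^2_{i/n})/n$ plus a bias of $6\sigma^4_{i/n}/k_n^2$, which is exactly what the quarticity term subtracts.

```python
from math import floor


def increments(path):
    """Increments of a path observed at times i/n, i = 0..n (dx[i] is increment i+1)."""
    return [b - a for a, b in zip(path, path[1:])]


def spot_variance(dx, i, k, n):
    """Spot variance at i/n: (n/k) times the sum of squared increments i+1..i+k."""
    return n / k * sum(d * d for d in dx[i:i + k])


def spot_quarticity(dx, i, k, n):
    """Local quarticity at i/n: n^2/(3k) times the sum of fourth powers of increments i+1..i+k."""
    return n * n / (3 * k) * sum(d ** 4 for d in dx[i:i + k])


def vol_of_vol_bias_corrected(path, k, t=1.0):
    """Sketch of the bias-corrected estimator (2.5) with window length k."""
    n = len(path) - 1
    dx = increments(path)
    last = floor(n * t) - 2 * k  # largest index i entering the sum
    total = 0.0
    for i in range(last + 1):  # empty sum (result 0.0) if last is negative
        diff = spot_variance(dx, i + k, k, n) - spot_variance(dx, i, k, n)
        total += 3 / (2 * k) * diff ** 2 - 6 / k ** 2 * spot_quarticity(dx, i, k, n)
    return total
```

For a path with identical increments all spot variances coincide, so only the negative correction term remains. This is a small illustration of the fact that the bias-corrected statistic is not positive by construction.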

The properties of stable convergence guarantee that dividing by the square root of a consistent estimator for the conditional variance gives a feasible central limit theorem for the estimation of integrated volatility of volatility. See, for example, [23] for details. Therefore, the following corollary can be concluded easily.

Corollary 2.11. (a) Under the assumptions of Theorem 2.5, we have for all $t > 0$

$$\sqrt{\frac{n}{M_n}}\,\frac{\hat T_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds}{(\hat Q_t^n)^{1/2}} \xrightarrow{\mathcal{L}} N(0,1). \tag{2.8}$$

(b) Under the assumptions of Theorem 2.6, we have for all $t > 0$

$$\sqrt{\frac{n}{k_n}}\,\frac{\hat R_t^n - \int_0^t (\beta_s^2 + \eta_s^2)\,ds}{(\hat P_t^n)^{1/2}} \xrightarrow{\mathcal{L}} N(0,1). \tag{2.9}$$

Remark 2.12. So far we have only discussed the case where both processes have continuous paths. Extensions to the situation of additional jumps in the price process seem to be possible, but are already quite involved. The following observation is useful: Whenever there is a jump within the interval $[i/n, (i+2k_n)/n]$, it appears squared and blown up by $n^2/k_n^2$ within $(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n})^2$. This is a much larger order than the usual $k_n/n$ in the continuous case. For this reason, it appears as if the truncation method due to [22] can be applied, and a similar intuition holds for the bias correction as well. Note, however, that the raw statistics in this context are sums of squared increments of $X$ rather than plain increments of $X$ as for the power variations encountered in [22]. Therefore, the required techniques are different from the standard ones in this area.

The case of jumps in the volatility appears to be even more complicated, as these come into play via

$$B_{i+k_n}^n - B_i^n = \frac{n}{k_n} \int_{i/n}^{(i+k_n)/n} (\sigma^2_{s+k_n/n} - \sigma_s^2)\,ds,$$

and therefore the amount to which each increment is affected by a jump depends crucially on the time at which the jump occurs. Thus, plain truncation might not be sufficient in this case and an entirely different estimator might be necessary. Both topics are left for future research.

3. Model checks for stochastic volatility models

In this section, we propose a first approach to goodness-of-fit testing for stochastic volatility models.
Assume we have representation (1.1) for the log price process $X$, whereas the volatility process satisfies $d\sigma_t^2 = \nu_t\,dt + \tau_t\,dV_t$ as in typical SV models. There is still a lot of freedom in the modelling of $\sigma^2$, and the various proposals in the literature typically differ in the representation of its diffusion part $\tau$. As noted in the Introduction, a quite general class of stochastic volatility models is given by the so-called CEV models, in which $\tau_s^2 = \theta(\sigma_s^2)^{\gamma}$ for some non-negative $\gamma$ and an unknown parameter $\theta$, and the most popular among these is the Heston model from [14], corresponding to $\gamma = 1$.

In order to construct a test whether a certain functional relationship between $\sigma$ and $\tau$ is present, we employ a technique which was already used in [1] or [27] when dealing with local volatility models. Suppose we are interested in testing for $\tau_s^2 = \tau^2(s, X_s, \sigma_s^2, \theta)$, where $\tau^2$ is a given function and $\theta$ is some unknown (in general multidimensional) parameter. For simplicity, we will focus on the one-dimensional linear case only, that is

$$H_0: \tau_s^2 = \theta\,\tau^2(s, X_s, \sigma_s^2) \quad \text{for all } s \in [0,1] \text{ (a.s.)}.$$

Extensions to the general case follow along the lines of Section 5 in [27]. A test for the null hypothesis will be based on the observation that $H_0$ is equivalent to $N_t = 0$ for all $t \in [0,1]$ (a.s.), where the process $N_t$ is given by

$$N_t = \int_0^t \big(\tau_s^2 - \theta_{\min}\,\tau^2(s, X_s, \sigma_s^2)\big)\,ds, \qquad \theta_{\min} = \operatorname*{argmin}_{\theta} \int_0^1 \big(\tau_s^2 - \theta\,\tau^2(s, X_s, \sigma_s^2)\big)^2\,ds.$$

Assume that the function $\tau^2$ is bounded away from zero. Then a standard argument from Hilbert space theory shows that $\theta_{\min} = D^{-1}C$ (and therefore $N_t = R_t - B_t D^{-1} C$), where we have set $R_t = \int_0^t \tau_s^2\,ds$ and

$$B_t = \int_0^t \tau^2(s, X_s, \sigma_s^2)\,ds, \qquad D = \int_0^1 \tau^4(s, X_s, \sigma_s^2)\,ds, \qquad C = \int_0^1 \tau_s^2\,\tau^2(s, X_s, \sigma_s^2)\,ds.$$

To define estimators let $k_n$ be as before and recall (2.5). We set

$$\hat\tau^2_{i/n} = 3n(2k_n)^{-1} \big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^2 - 6 n k_n^{-2}\,\hat\sigma^4_{i/n} \tag{3.1}$$

and also $\hat N_t = \hat R_t^n - \hat B_t (\hat D)^{-1} \hat C$ with $\hat R_t^n$ from the previous section, whereas we denote

$$\hat B_t = \frac{1}{n} \sum_{i=0}^{\lfloor nt \rfloor - k_n} \tau^2\Big(\frac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\Big), \qquad \hat D = \frac{1}{n} \sum_{i=0}^{n - k_n} \tau^4\Big(\frac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\Big),$$

$$\hat C = \frac{1}{n} \sum_{i=0}^{n - 2k_n} \hat\tau^2_{i/n}\,\tau^2\Big(\frac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\Big).$$

In the sequel, we will prove weak convergence of $\hat N_t - N_t$, up to a suitable normalisation. Theorem 2.6 suggests that $\sqrt{n/k_n}$ is a reasonable choice, and the following claim proves that two of the estimators converge at a faster speed, at least if we impose an additional smoothness condition on the function $\tau^2$.

Lemma 3.1. Suppose that the function $\tau^2$ has continuous partial derivatives of second order. Then we have

$$\hat B_t - B_t = o_p(n^{-1/4}), \qquad \hat D - D = o_p(n^{-1/4}),$$

the first result holding uniformly in $t \in [0,1]$.

The above claim indicates that we have to focus on the terms involving $\hat\tau^2_{i/n}$ only, which is familiar ground due to the results of Section 2. We start with a result on the joint asymptotic behaviour of $\hat R_t^n$ and $\hat C$.

Lemma 3.2. Let $d$ be an integer and $t_1, \ldots, t_d$ be arbitrary in $[0,1]$. Set

$$\Sigma_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2) = \alpha_s^2\,h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)\,h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)^T$$

with $h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2) = \big(1_{[0,t_1]}(s), \ldots, 1_{[0,t_d]}(s), \tau^2(s, X_s, \sigma_s^2)\big)^T$ and $\alpha_s^2$ as in Theorem 2.6. Under the previous assumptions we have the stable convergence

$$\sqrt{\frac{n}{k_n}} \big(\hat R_{t_1}^n - R_{t_1}, \ldots, \hat R_{t_d}^n - R_{t_d}, \hat C - C\big)^T \xrightarrow{\mathcal{L}-(s)} \int_0^1 \Sigma^{1/2}_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)\,d\widetilde W_s,$$

where $\widetilde W$ is a $(d+1)$-dimensional standard Brownian motion defined on an extension of the original space and independent of $\mathcal{F}$.

We are interested in the asymptotics of the process $A_n(t) = \sqrt{n/k_n}\,(\hat N_t - N_t)$, and the preceding lemma basically leads to its finite dimensional convergence. The entire result on weak convergence of $A_n$ reads as follows.

Theorem 3.3. Assume that the previous assumptions hold. Then the process $(A_n(t))_{t \in [0,1]}$ converges weakly to a mean zero process $(A(t))_{t \in [0,1]}$, which is Gaussian conditionally on $\mathcal{F}$ and whose conditional covariance equals the one of the process

$$\big\{\alpha_U \big(1_{\{U \le t\}} - B_t D^{-1} \tau^2(U, X_U, \sigma_U^2)\big)\big\}_{t \in [0,1]},$$

where $U \sim \mathcal{U}[0,1]$, independent of $\mathcal{F}$.
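To make the construction of $\hat N_t$ concrete, the following Python sketch evaluates it on the grid $j/n$ and forms the Kolmogorov–Smirnov type functional $\sup_t |\sqrt{n/k_n}\,\hat N_t|$. It is a sketch under assumptions: the path is a list of $n+1$ equidistant observations on $[0,1]$ with $n \ge 2k_n$, the hypothesised function is passed as a callable `tau2(s, x, v)`, $\tau^4$ is read as the square of $\tau^2$, and all function names are hypothetical.

```python
from math import sqrt


def spot_variance(dx, i, k, n):
    """sigma_hat^2 at i/n: (n/k) times the sum of squared increments i+1..i+k."""
    return n / k * sum(d * d for d in dx[i:i + k])


def spot_quarticity(dx, i, k, n):
    """sigma_hat^4 at i/n: n^2/(3k) times the sum of fourth powers of increments i+1..i+k."""
    return n * n / (3 * k) * sum(d ** 4 for d in dx[i:i + k])


def local_vol_of_vol(dx, i, k, n):
    """tau_hat^2 at i/n as in (3.1), i.e. n times the i-th summand of (2.5)."""
    diff = spot_variance(dx, i + k, k, n) - spot_variance(dx, i, k, n)
    return 3 * n / (2 * k) * diff ** 2 - 6 * n / k ** 2 * spot_quarticity(dx, i, k, n)


def n_hat_process(path, k, tau2):
    """Return [N_hat at j/n for j = 0..n] for a hypothesised function tau2(s, x, v)."""
    n = len(path) - 1
    dx = [b - a for a, b in zip(path, path[1:])]
    # Hypothesised tau^2 evaluated at (i/n, X_{i/n}, sigma_hat^2_{i/n}), i = 0..n-k.
    model = [tau2(i / n, path[i], spot_variance(dx, i, k, n)) for i in range(n - k + 1)]
    # Local estimates tau_hat^2_{i/n}, i = 0..n-2k.
    tau_hat = [local_vol_of_vol(dx, i, k, n) for i in range(n - 2 * k + 1)]
    d_hat = sum(m * m for m in model) / n
    c_hat = sum(th * m for th, m in zip(tau_hat, model)) / n  # zip stops at i = n-2k
    theta_hat = c_hat / d_hat  # least-squares coefficient D_hat^{-1} C_hat
    values, r_hat, b_hat = [], 0.0, 0.0
    for j in range(n + 1):
        if j >= 2 * k:  # R_hat at j/n sums tau_hat^2 / n over i = 0..j-2k
            r_hat += tau_hat[j - 2 * k] / n
        if j >= k:  # B_hat at j/n sums the model values / n over i = 0..j-k
            b_hat += model[j - k] / n
        values.append(r_hat - b_hat * theta_hat)
    return values


def ks_statistic(path, k, tau2):
    """sup over the grid t = j/n of |sqrt(n/k) N_hat_t|."""
    n = len(path) - 1
    return sqrt(n / k) * max(abs(v) for v in n_hat_process(path, k, tau2))
```

A Heston-type null corresponds to `tau2 = lambda s, x, v: v`. The statistic above is not standardised; critical values would still have to come from a resampling scheme, since the limit law depends on the whole path of $(X, \sigma^2)$.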

As indicated before, convergence of the finite dimensional distributions is a direct consequence of Lemma 3.2, using the Delta method for stable convergence (see, e.g., [11]). Tightness follows from Theorem VI.4.5 in [2] with a minimal amount of work.

Recall that $N_t = 0$ for all $t$ under the null hypothesis. Therefore Theorem 3.3 shows that a consistent test is obtained by rejecting the null hypothesis for large values of a suitable functional of the process $\{\sqrt{n/k_n}\,\hat N_t\}_{t \in [0,1]}$. If we choose the Kolmogorov–Smirnov functional $K_n = \sup_{t \in [0,1]} |\sqrt{n/k_n}\,\hat N_t|$ for example, we have weak convergence under the null to $\sup_{t \in [0,1]} |A(t)|$ as a consequence of Theorem 3.3. The distribution of the latter statistic is extremely difficult to assess, as it typically depends on the entire process $(X, \sigma^2)$. We therefore propose to obtain critical values via a simple bootstrap procedure, which will be introduced in the next section.

To end this section, we define an appropriate estimator for the conditional variance of $A(t)$, which is given by

$$s_t^2 = \int_0^t \alpha_s^2\,ds - 2 B_t D^{-1} \int_0^t \alpha_s^2\,\tau^2(s, X_s, \sigma_s^2)\,ds + B_t^2 D^{-2} \int_0^1 \alpha_s^2\,\tau^4(s, X_s, \sigma_s^2)\,ds,$$

due to Theorem 3.3. Empirical counterparts for $B_t$ and $D$ are obviously defined by the statistics $\hat B_t$ and $\hat D$, whereas Theorem 2.9 suggests that a local estimator for $\alpha^2_{i/n}$ is given by

$$\hat\alpha^2_{i/n} = \frac{n^2}{k_n^2}\,\frac{453}{280} \big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^4 - \frac{n}{k_n^2}\,\frac{486}{35}\,\hat\tau^2_{i/n}\,\hat\sigma^4_{i/n} - \frac{n^6}{k_n^5}\,\frac{346}{1225} \sum_{j=1}^{k_n} (\Delta_{i+j}^n X)^8.$$

We obtain the following result, which can be proven in the same way as Theorem 2.9.

Theorem 3.4. Let $t$ be arbitrary and set

$$(\hat s_t^n)^2 = \frac{1}{n} \sum_{i=1}^{\lfloor nt \rfloor - 2k_n} \hat\alpha^2_{i/n} - 2 \hat B_t \hat D^{-1}\,\frac{1}{n} \sum_{i=1}^{\lfloor nt \rfloor - 2k_n} \hat\alpha^2_{i/n}\,\tau^2\Big(\frac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\Big) + \hat B_t^2 \hat D^{-2}\,\frac{1}{n} \sum_{i=1}^{n - 2k_n} \hat\alpha^2_{i/n}\,\tau^4\Big(\frac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\Big).$$

Then $(\hat s_t^n)^2$ is consistent for $s_t^2$.

As a consequence, each statistic $\sqrt{n/k_n}\,\hat N_t/\hat s_t^n$ converges weakly to a normal distribution. This result will be used to construct a feasible bootstrap statistic in the following.

4. Simulation study

Let us start with a simulation study concerning the performance of the rate-optimal $\hat R_t^n$ as an estimator for integrated volatility of volatility. Throughout this section, we will work with the Heston model only, and the parameters are chosen as follows: $\beta = .3$, $\kappa = 5$, $\alpha = .2$ and $\xi = .5$. Furthermore, we set $X_0 = 0$ and $\sigma_0^2 = \alpha$. Note that the Feller condition $2\kappa\alpha \ge \xi^2$ is satisfied, which ensures that the process $\sigma^2$ is almost surely positive as requested. So is $\tau^2$, and it is obvious that (2.1) holds as well. Therefore all conditions from Section 2 are satisfied.

We discuss the finite sample properties of $\hat R_t^n$ for different choices of the correlation parameter $\rho$ and the number of observations $n$, and for comparability only we take $n$ to be a square number and $k_n$ equal to $n^{1/2}$ in all cases, so we have $c = 1$. Theorem 2.6 suggests that such a medium size of $c$ is reasonable for finite samples, and additional results not reported here also point towards the fact that $k_n$ should be chosen close to $n^{1/2}$. Finally, we set $t = 1$. Tables 1–6 below are based on 1 simulations.

Table 1. Mean/variance and simulated quantiles of the feasible test statistic (2.9) for $\rho = 0$. The last column gives the relative amount of negative estimates

n       Mean   Variance  .025   .05    .1     .9     .95    .975   Neg.
400     .397   .856      .497   .968   .1756  .9754  .9946  .9989  .3522
2500    .287   .965      .526   .932   .1619  .9572  .9862  .9965  .1963
10000   .17    1.23      .449   .799   .1425  .9325  .9757  .9928  .933
22500   .112   1.2       .44    .696   .1253  .9271  .9722  .9914  .51
40000   .73    1.29      .41    .73    .1235  .923   .969   .9874  .3
52900   .31    1.22      .368   .653   .1157  .9154  .9633  .9872  .221

Table 1 shows the performance for $\rho = 0$, for which we see that it takes quite some time for the asymptotics to kick in. Apparent is a slight overestimation of the lower tails of the distribution, which seems to originate from the relation of the estimators $\hat R_1^n$ and $G_{1,n}^{(3)}$. By construction, in cases where $\hat R_1^n$ is underestimating the true quantity, it is typically the case that increments of $\hat\sigma^2$ are relatively small. As these increments occur in $G_{1,n}^{(3)}$ as well, most likely the asymptotic variance is underestimated as well, which explains a too large negative standardised statistic.
The same effect is visible for the upper quantiles as well (but resulting in an overestimation), and this simple explanation is supported by a detailed look at simulation results not reported here which reveal that the estimation of the asymptotic variance is extremely accurate for moderate sizes of $\hat R_1^n - \int_0^1 \tau_s^2\,ds$, but becomes worse when the deviation is rather large. Similar conclusions can be drawn for the case of a moderately negative $\rho = -.2$.

Table 2. Mean/variance and simulated quantiles of the feasible test statistic (2.9) for $\rho = -.2$. The last column gives the relative amount of negative estimates

n       Mean   Variance  .025   .05    .1     .9     .95    .975   Neg.
400     .386   .874      .491   .967   .1816  .9724  .9942  .9989  .3528
2500    .295   .971      .552   .963   .1614  .9559  .9864  .9962  .1996
10000   .176   1.13      .464   .88    .1427  .9369  .977   .994   .954
22500   .226   .987      .48    .84    .1476  .9436  .9776  .9932  .557
40000   .75    1.1       .41    .673   .1217  .9254  .9713  .994   .31
52900   .4     1.19      .396   .677   .1171  .918   .9663  .9879  .246

We proceed with the finite sample behaviour of the statistics $\hat T_t^n$, for which we have a lot of freedom in choosing $k_n$ and $l_n$. However, in order for both $M_n$ to be rather small and the condition $n^{3/2} M_n^{-3/2} m_n^{-1} \to 0$ to be satisfied, we choose $M_n = n^{3/4}$ and $m_n = n^{1/2}$, resulting in a rate of convergence of about $n^{-1/8}$. Also, we restrict ourselves to $\rho = 0$. As expected, the approximation of the nominal level is rather poor in this situation, both when reproducing mean/variance and the quantiles in the tails. Empirically the results do not improve for other choices of $k_n$ and $l_n$. Note from Table 3 and Table 4 that results do not differ very much when choosing either $k_n$ or $l_n$ large, apart from the remarkable exception of a larger $l_n$ and $n = 10000$. But even in this case, the results are not better than for the rate-optimal $\hat R_t^n$, which is why we recommend choosing this one rather than $\hat T_t^n$, even though only the latter estimator is ensured to be positive.

Table 3. Mean/variance and simulated quantiles of the feasible test statistic (2.8) for $k_n = n^{3/4}$, $l_n = n^{1/2}$ and $\rho = 0$

n       Mean   Variance  .025   .05    .1     .9     .95    .975
400     .49    1.261     .966   .134   .1848  .999   1      1
2500    .32    .837      .548   .837   .139   .9963  .9999  1
10000   .291   .86       .514   .8     .1332  .9941  1      1
22500   .259   .92       .537   .873   .1424  .9739  .9948  .9996
40000   .215   1.4       .658   .92    .138   .9654  .997   1
52900   .164   1.16      .594   .826   .1274  .969   .9988  1

Table 4. Mean/variance and simulated quantiles of the feasible test statistic (2.8) for $l_n = n^{3/4}$, $k_n = n^{1/2}$ and $\rho = 0$

n       Mean   Variance  .025   .05    .1     .9     .95    .975
400     .476   1.255     .976   .1316  .1889  .9983  .9999  1
2500    .311   .817      .55    .779   .1322  .995   .9996  1
10000   .149   1.196     .45    .657   .15    .8784  .9589  .994
22500   .276   .812      .46    .728   .1234  .9886  .9989  1
40000   .217   1.33      .648   .914   .1354  .9647  .9974  1
52900   .36    .824      .494   .829   .1456  .9882  .9981  1

As an example for an application in goodness-of-fit testing, we have constructed a test for a Heston-like volatility structure via a bootstrap procedure as follows: Based on the observation that for each $t$, $\sqrt{n/k_n}\,\hat N_t/\hat s_t^n$ converges weakly to a standard normal distribution if the null is satisfied, it seems reasonable to reject the hypothesis for large values of the standardised Kolmogorov–Smirnov statistic

$$Y_n = \sup_{i \le n - 2k_n} \big|\sqrt{n/k_n}\,\hat N_{i/n}/\hat s^n_{i/n}\big|.$$

Since its (asymptotic) distribution is in general hard to assess, we used bootstrap quantiles instead, and precisely we have generated bootstrap data $X^{*(b)}_{i/n}$, $b = 1, \ldots, B$, following the equation

$$X_t^* = \int_0^t \sigma_s^*\,dW_s^*, \qquad (\sigma_t^*)^2 = \hat\alpha + \hat\kappa \int_0^t \big(\hat\alpha - (\sigma_s^*)^2\big)\,ds + \hat\xi \int_0^t \sigma_s^*\,dV_s^*.$$

Here, $W^*$ and $V^*$ are independent Brownian motions, and we have identified $\hat\alpha$ with the realised volatility of the original data (which is a measure for the average volatility over $[0,1]$) and defined $\hat\xi = \hat\theta^{1/2}$, since both quantities coincide under the null. Finally, we have simply set $\hat\kappa = 5\hat\theta/\hat\alpha$ such that Feller's condition is satisfied. Setting $B = 2$, we have run 5 simulations each.

Table 5. Simulated level of the bootstrap test based on the standardised Kolmogorov–Smirnov statistic $Y_n$

n       .01    .025   .05    .1     .2
400     .4     .12    .24    .64    .172
2500    .18    .4     .64    .12    .216
10000   .1     .18    .4     .84    .194
22500   .16    .24    .34    .88    .194
40000   .2     .38    .68    .128   .22
52900   .1     .2     .52    .118   .2

Table 6. Simulated rejection probabilities of the bootstrap test based on the standardised Kolmogorov–Smirnov statistic $Y_n$ for various alternatives

        Alternative $\gamma = 0$              Alternative $\gamma = 2$
n       .01    .025   .05    .1     .2     .01    .025   .05    .1     .2
400     .32    .72    .124   .192   .292   .56    .8     .128   .24    .32
2500    .28    .52    .82    .134   .262   .44    .9     .156   .248   .372
10000   .32    .48    .86    .138   .26    .36    .84    .176   .284   .396
22500   .24    .42    .68    .138   .32    .32    .86    .162   .284   .432
40000   .28    .46    .94    .196   .426   .28    .64    .12    .31    .482
52900   .26    .4     .82    .174   .422   .24    .58    .144   .32    .488
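The bootstrap data generation described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the procedure actually used for the tables: the discretisation (an Euler scheme with the variance truncated at zero inside the coefficients), the function names, and the `kappa_factor` argument (defaulting to the factor 5 printed in the text) are all choices made here, and the estimates `alpha_hat` and `theta_hat` are taken as given inputs.

```python
import random
from math import sqrt


def bootstrap_heston_path(n, alpha_hat, theta_hat, kappa_factor=5.0, rng=None):
    """Simulate one bootstrap path X* on the grid i/n, i = 0..n, with independent W* and V*."""
    if alpha_hat <= 0 or theta_hat <= 0:
        raise ValueError("alpha_hat and theta_hat must be positive")
    rng = rng or random.Random()
    xi_hat = sqrt(theta_hat)  # xi_hat = theta_hat^(1/2)
    kappa_hat = kappa_factor * theta_hat / alpha_hat
    dt = 1.0 / n
    x, v = 0.0, alpha_hat  # variance starts at its mean level alpha_hat
    path = [x]
    for _ in range(n):
        v_pos = max(v, 0.0)  # assumption: truncate at zero to keep sqrt well defined
        dw = rng.gauss(0.0, sqrt(dt))
        dv = rng.gauss(0.0, sqrt(dt))  # drawn independently of dw
        x += sqrt(v_pos) * dw  # no drift in the bootstrap price equation
        v += kappa_hat * (alpha_hat - v_pos) * dt + xi_hat * sqrt(v_pos) * dv
        path.append(x)
    return path


def bootstrap_p_value(observed, bootstrap_stats):
    """Share of bootstrap statistics that are at least as large as the observed one."""
    return sum(1 for s in bootstrap_stats if s >= observed) / len(bootstrap_stats)
```

In use, one would compute the test statistic on each of the $B$ simulated paths and compare the statistic from the original data with the resulting bootstrap sample, either through a quantile or through the p-value helper above.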

Table 5 shows that the simulated levels are rather close to the expected ones, irrespectively of $n$. We have tested two alternatives from the class of CEV models, namely

$$\sigma_t^2 = \sigma_0^2 + \kappa\int_0^t (\alpha - \sigma_s^2)\,ds + V_t$$

and

$$\sigma_t^2 = \sigma_0^2 + \kappa\int_0^t (\alpha - \sigma_s^2)\,ds + \kappa\int_0^t \sigma_s^2\,dV_s,$$

corresponding to $\gamma = 0$ and $\gamma = 2$, respectively, and using the parameters from above. We see from the simulation results that the rejection probabilities are much larger for the second alternative than for the first, which can partially be explained from two observations: First, the Vasicek model does not satisfy the assumptions from the previous sections since the volatility may become negative (in which case it is set to zero); second, our choice of $\hat\kappa$ is responsible for a large speed of mean reversion in the bootstrap algorithm which makes it difficult to distinguish between a Heston-like volatility of volatility and a constant one. It is expected that the power improves for an entirely data-driven choice of $\hat\kappa$.

5. Conclusion

In this paper, we have discussed a nonparametric method to estimate the integrated volatility of volatility process in stochastic volatility models. Our concept is based on spot volatility estimators, and just as for standard realised volatility we use sums of squares of increments of these spot volatility estimators to obtain a global estimator for integrated volatility of volatility. Two classes of estimators have been investigated: one consisting of positive estimators with a slow rate of convergence, the other one being bias corrected but converging at the optimal rate $n^{-1/4}$. In both cases, central limit theorems are provided, and we also discuss briefly why a truncated version could be useful when there are additional jumps in the price process.

Given the variety of stochastic volatility models (in continuous time) which are used to describe financial data, there is a severe lack in tools on model validation. Our results fill this gap to a first extent, as we provide a bootstrap method for goodness-of-fit testing in such models which investigates whether a specific parametric model for volatility of volatility is appropriate given the data or not. A rigorous proof that the proposed procedure keeps the asymptotic level and is consistent against a large class of alternatives has not been provided, however, and is left for future research.

A different issue is to take microstructure effects into account, which are likely to be present when data is observed at high frequency. Again it is promising to combine filtering methods for noisy diffusions with the method proposed in this paper to obtain an estimator for integrated volatility of volatility in such models as well, but the rate of convergence is expected to drop further. Precise statements are beyond the scope of the paper as well.
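The construction summarised in the conclusion (spot volatility estimators, then a sum of squared increments, then a bias correction) can be illustrated as follows. This is a hedged sketch and not necessarily the exact definition of $\hat R_t^n$: the scaling $3/(2k_n)$ and the bias term proportional to $6n/k_n^2$ times integrated quarticity are taken from the centring used in the Appendix, whereas the function names, the choice $k_n = \mathrm{round}(c\sqrt{n})$ and the realised quarticity $(n/3)\sum_i(\Delta_i^n X)^4$ as a plug-in for $\int\sigma_s^4\,ds$ are assumptions made here.

```python
import numpy as np


def spot_variance(dx, k):
    """Local averages sigma2_hat[i] = (n/k) * sum of the k squared increments after i/n."""
    n = len(dx)
    csum = np.concatenate(([0.0], np.cumsum(dx ** 2)))
    return (n / k) * (csum[k:] - csum[:-k])       # indices i = 0, ..., n - k


def vol_of_vol_estimate(dx, c):
    """Bias-corrected sum of squared spot-variance increments on [0, 1] (sketch)."""
    n = len(dx)
    k = int(round(c * np.sqrt(n)))                # block length k_n ~ c * n^(1/2)
    if k < 1 or 2 * k > n:
        raise ValueError("need 1 <= k and 2k <= n")
    s2 = spot_variance(dx, k)
    diff = s2[k:] - s2[:-k]                       # sigma2_hat[i + k] - sigma2_hat[i], i = 0..n-2k
    raw = 3.0 / (2.0 * k) * np.sum(diff ** 2)     # uncorrected statistic
    quarticity = n / 3.0 * np.sum(dx ** 4)        # assumed plug-in for int sigma^4 ds
    return raw - 6.0 * n / k ** 2 * quarticity    # 6n/k^2 equals 6/c^2 when k = c * sqrt(n)
```

Without the last subtraction the statistic is nonnegative by construction but picks up the term $(6/c^2)\int\sigma_s^4\,ds$; with it, the estimate can become negative in finite samples, which mirrors the trade-off between the two classes of estimators discussed in the text.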

Appendix

Note first that every left-continuous process is locally bounded, thus all processes appearing are. Second, standard localisation procedures as in [5] or [17] allow us to assume that any locally bounded process is actually bounded, and that almost surely positive processes can be regarded as bounded away from zero. Universal constants are denoted by $C$ or $C_r$, the latter if we want to emphasise dependence on some additional parameter $r$.

Within the main corpus, we give the proof of Theorem 2.6 only, which is by far the most complicated result of this work. Analogues of Lemma 2.2 and Proposition 2.3 for the special case of $l_n = k_n$ are of course parts of it, and it is not difficult to generalise the proofs in order for both claims to be covered as well. Therefore, these results are not shown explicitly.

Let us start with a brief sketch of what we will be doing. In general, $\mathcal F$-stable convergence of a sequence $Z_n$ to some limiting variable $Z$ defined on an extension $(\tilde\Omega, \tilde{\mathcal F}, \tilde P)$ of the original space is equivalent to

$$E[h(Z_n)Y] \to \tilde E[h(Z)Y] \tag{A.1}$$

for any bounded Lipschitz function $h$ and any bounded $\mathcal F$-measurable $Y$. For details, see, for example, [2] and related work. Suppose now that there are additional variables $Z_{n,p}$ and $Z_p$ (the latter defined on the same extension as $Z$) such that

$$\lim_{p\to\infty}\limsup_{n\to\infty} E[|Z_n - Z_{n,p}|] = 0, \tag{A.2}$$

$$Z_{n,p} \stackrel{\mathcal L-(s)}{\longrightarrow} Z_p \quad\text{for all } p, \tag{A.3}$$

$$\lim_{p\to\infty} \tilde E[|Z_p - Z|] = 0 \tag{A.4}$$

hold. Then the desired stable convergence $Z_n \stackrel{\mathcal L-(s)}{\longrightarrow} Z$ follows. Indeed, let $\varepsilon > 0$. Then there exists a $\delta > 0$ such that $|x - y| < \delta$ implies $|h(x) - h(y)| < \varepsilon$. Thus we have

$$|E[h(Z_n)Y] - E[h(Z_{n,p})Y]| \le C\bigl(E[|h(Z_n) - h(Z_{n,p})|\,1_{\{|Z_n - Z_{n,p}| \ge \delta\}}] + E[|h(Z_n) - h(Z_{n,p})|\,1_{\{|Z_n - Z_{n,p}| < \delta\}}]\bigr) \le C\bigl(P(|Z_n - Z_{n,p}| \ge \delta) + \varepsilon\bigr).$$

We have $\lim_{p\to\infty}\limsup_{n\to\infty} |E[h(Z_n)Y] - E[h(Z_{n,p})Y]| = 0$ from Markov inequality, (A.2) and as $\varepsilon$ was arbitrary. $\lim_{p\to\infty} |\tilde E[h(Z_p)Y] - \tilde E[h(Z)Y]| = 0$ can be shown similarly using (A.4), and (A.3) is by definition equivalent to $\lim_{n\to\infty} |E[h(Z_{n,p})Y] - \tilde E[h(Z_p)Y]| = 0$. Putting the latter three claims together (plus the triangle inequality and the fact that all three limiting conditions on $p$ and $n$ are actually the same) gives (A.1).
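For the reader's convenience, the triangle-inequality step that combines the three claims can be written out as a single display. It uses only the objects introduced above; nothing beyond (A.2) to (A.4) is assumed.

```latex
\begin{align*}
\bigl|E[h(Z_n)Y] - \tilde E[h(Z)Y]\bigr|
  &\le \bigl|E[h(Z_n)Y] - E[h(Z_{n,p})Y]\bigr| \\
  &\quad + \bigl|E[h(Z_{n,p})Y] - \tilde E[h(Z_p)Y]\bigr| \\
  &\quad + \bigl|\tilde E[h(Z_p)Y] - \tilde E[h(Z)Y]\bigr| .
\end{align*}
```

Taking first the limit superior in $n$ and then the limit in $p$, the first term vanishes by (A.2), the second vanishes for every fixed $p$ by (A.3), and the third vanishes by (A.4), while the left hand side does not depend on $p$.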
Our aim in this proof is to employ a certain blocking technique, which allows us to make use of a type of conditional independence between the summands within $\hat R_t^n$. To this end, we apply the above methodology, so we have to define an appropriate double

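sequence $U_t^{n,p}$, which will correspond to an approximated version of $\hat R_t^n$ where we sum over the big blocks only. Some additional notation is necessary. Let $p \in \mathbb N$ be arbitrary. We set

$$a_l(p) = (l-1)(p+2)k_n, \qquad b_l(p) = a_l(p) + pk_n, \qquad c(p) = J_n(p)(p+2)k_n + 1,$$

the first two for any $l = 1,\dots,J_n(p)$ with $J_n(p) = \lfloor (\lfloor nt\rfloor - 2k_n)/((p+2)k_n) \rfloor$. These numbers depend on $n$ as well, even though it does not show up in the notation. We define further

$$H_i^n = \int_{(i-1)/n}^{i/n} (W_s - W_{(i-1)/n})\,dW_s.$$

In order to exploit the afore-mentioned conditional independence, we need approximations for $A_i^n$ and $B_i^n$ from (2.3). For the sake of brevity, we will only state the approximated increments explicitly, which are given by

$$\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n} := \frac{n}{k_n}\,2\sigma^2_{a_l(p)/n}\sum_{j=1}^{k_n}\bigl(H^n_{i+j+k_n} - H^n_{i+j}\bigr) = \frac{n}{k_n}\,\sigma^2_{a_l(p)/n}\sum_{j=1}^{k_n}\bigl((\Delta^n_{i+k_n+j}W)^2 - (\Delta^n_{i+j}W)^2\bigr), \tag{A.5}$$

where the latter identity is a consequence of Itô formula, and

$$\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n} := \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\bigl(\beta_{a_l(p)/n}(W_{s+k_n/n} - W_s) + \eta_{a_l(p)/n}(W'_{s+k_n/n} - W'_s)\bigr)\,ds. \tag{A.6}$$

These quantities are defined for $i = a_l(p),\dots,b_l(p)-1$, thus over the big blocks. For later reasons, we introduce similar approximations over the small blocks. Set

$$\tilde C^n_{(i+k_n)/n} - \tilde C^n_{i/n} = \frac{n}{k_n}\,\sigma^2_{b_l(p)/n}\sum_{j=1}^{k_n}\bigl((\Delta^n_{i+k_n+j}W)^2 - (\Delta^n_{i+j}W)^2\bigr),$$

$$\tilde D^n_{(i+k_n)/n} - \tilde D^n_{i/n} = \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\bigl(\beta_{b_l(p)/n}(W_{s+k_n/n} - W_s) + \eta_{b_l(p)/n}(W'_{s+k_n/n} - W'_s)\bigr)\,ds,$$

both for $i = b_l(p),\dots,a_{l+1}(p)-1$. Then the following claim holds, whose proof is postponed to the supplemental file [26].

Lemma A.1. We have

$$E\bigl[\bigl|A^n_{(i+k_n)/n} - A^n_{i/n} - (\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})\bigr|^r\bigr] \le C_r (pn^{-1})^{r/2},$$

$$E\bigl[\bigl|B^n_{(i+k_n)/n} - B^n_{i/n} - (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr|^r\bigr] \le C_r (pn^{-1})^{r/2},$$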

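as well as

$$E\bigl[|A^n_{(i+k_n)/n} - A^n_{i/n}|^r\bigr] \le C_r n^{-r/4} \quad\text{and}\quad E\bigl[|B^n_{(i+k_n)/n} - B^n_{i/n}|^r\bigr] \le C_r n^{-r/4}$$

for every $r > 0$. The latter bounds hold also for the approximated versions, and the same results are true for the approximation via increments of $\tilde C^n$ and $\tilde D^n$ over the small blocks.

Up to a different standardisation, the role of $Z_{n,p}$ in this proof will be played by $U_t^{n,p} = \sum_{l=1}^{J_n(p)} U_l^{n,p}$, where

$$U_l^{n,p} = \sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{2k_n}\bigl((\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}) + (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)^2 - \frac{pk_n}{n}\Bigl[\frac{6n}{k_n^2}\,\sigma^4_{a_l(p)/n} + \bigl(\beta^2_{a_l(p)/n} + \eta^2_{a_l(p)/n}\bigr)\Bigr] \tag{A.7}$$

involves quantities from the big blocks only. The $U_l^{n,p}$ can be shown to be martingale differences, and the most involved part in the proof is to use Lemma A.1 to obtain

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\hat R_t^n - \int_0^t(\beta_s^2 + \eta_s^2)\,ds - U_t^{n,p}\Bigr|\Bigr] = 0, \tag{A.8}$$

which is the analogue of (A.2). Let us focus on the remaining two steps as well. We set

$$U_t^p = \int_0^t \alpha(p)_s\,dW_s, \qquad \alpha(p)_s^2 = \frac{p}{p+2}\Bigl(\frac{48p + d_1}{pc^4}\,\sigma_s^8 + \frac{12p + d_2}{pc^2}\,\sigma_s^4(\beta_s^2 + \eta_s^2) + \frac{151p + d_3}{70p}\,(\beta_s^2 + \eta_s^2)^2\Bigr)$$

for certain unspecified constants $d_l$, $l = 1,2,3$. In order to prove the stable convergence

$$\sqrt{\frac{n}{k_n}}\;U_t^{n,p} \stackrel{\mathcal L-(s)}{\longrightarrow} U_t^p \tag{A.9}$$

we use a well-known result for triangular arrays of martingale differences, which is due to Jacod [16]. In particular, the following three conditions have to be checked:

$$\frac{n}{k_n}\sum_{l=1}^{J_n(p)} E_{a_l(p)}\bigl[(U_l^{n,p})^2\bigr] \stackrel{P}{\longrightarrow} \int_0^t \alpha(p)_s^2\,ds, \tag{A.10}$$

$$\frac{n^2}{k_n^2}\sum_{l=1}^{J_n(p)} E_{a_l(p)}\bigl[(U_l^{n,p})^4\bigr] \stackrel{P}{\longrightarrow} 0, \tag{A.11}$$

$$\sqrt{\frac{n}{k_n}}\sum_{l=1}^{J_n(p)} E_{a_l(p)}\bigl[U_l^{n,p}\,(N_{a_{l+1}(p)/n} - N_{a_l(p)/n})\bigr] \stackrel{P}{\longrightarrow} 0, \tag{A.12}$$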

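where $N$ is any component of $(W, W')$ or a bounded martingale orthogonal to both $W$ and $W'$. The final step $\lim_{p\to\infty}\tilde E|U_t^p - U_t| = 0$ is obvious.

A.1. Proof of (A.8)

For simplicity, we set $\eta \equiv 0$ and $\vartheta^{(2)} \equiv 0$ from now on, as otherwise the proof is exactly the same. In a brief first step, we replace $\hat R_t^n$ by a version in which the unknown bias and not the estimator for it is subtracted, that is we introduce

$$U_t^n = \sum_{i=0}^{\lfloor nt\rfloor - 2k_n}\frac{3}{2k_n}\bigl(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\bigr)^2 - \frac{6}{c^2}\int_0^t \sigma_s^4\,ds - \int_0^t \beta_s^2\,ds.$$

Theorem 2.1 in [5] shows that integrals over $\sigma$ can be estimated with rate $n^{-1/2}$, so the assumption on $k_n$ and a standard argument regarding boundary terms prove that

$$\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\hat R_t^n - \int_0^t \beta_s^2\,ds - U_t^n\Bigr|\Bigr] = o(1),$$

uniformly in $t$. A simple consequence of Lemma A.1 is that the remainder terms in $U_t^n$ are negligible, that is

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{i=c(p)}^{\lfloor nt\rfloor - 2k_n}\frac{3}{2k_n}\bigl(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\bigr)^2 - \frac{6}{c^2}\int_{c(p)/n}^t \sigma_s^4\,ds - \int_{c(p)/n}^t \beta_s^2\,ds\Bigr|\Bigr] = 0,$$

using also boundedness of the processes on the right hand side and the definition of $c(p)$. Therefore, we are left to show

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\bigl[|\tilde U_t^{n,p} - U_t^{n,p}|\bigr] = 0 \tag{A.13}$$

with

$$\tilde U_t^{n,p} = \sum_{l=1}^{J_n(p)}\Bigl(\sum_{i=a_l(p)}^{b_l(p)-1} + \sum_{i=b_l(p)}^{a_{l+1}(p)-1}\Bigr)\frac{3}{2k_n}\bigl(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\bigr)^2 - \frac{6}{c^2}\int_0^{c(p)/n}\sigma_s^4\,ds - \int_0^{c(p)/n}\beta_s^2\,ds. \tag{A.14}$$

For the integrals within (A.14), recall that these are replaced by approximated versions in $U_t^{n,p}$. Therefore we have to show, for example,

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\int_{a_l(p)/n}^{b_l(p)/n}\bigl(\beta_s^2 - \beta^2_{a_l(p)/n}\bigr)\,ds\Bigr|\Bigr] = 0. \tag{A.15}$$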

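For its proof, recall (2.1). The result above follows from

$$E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\int_{a_l(p)/n}^{b_l(p)/n}\int_{a_l(p)/n}^{s}\omega_r\,dr\,ds\Bigr|\Bigr] \le C\,J_n(p)\Bigl(\frac{pk_n}{n}\Bigr)^2 \le C p\,n^{-1/2}$$

and

$$E\Bigl[\Bigl(\sum_{l=1}^{J_n(p)}\int_{a_l(p)/n}^{b_l(p)/n}\int_{a_l(p)/n}^{s}\vartheta^{(1)}_r\,dW_r\,ds\Bigr)^2\Bigr] = \sum_{l=1}^{J_n(p)} E\Bigl[\Bigl(\int_{a_l(p)/n}^{b_l(p)/n}\int_{a_l(p)/n}^{s}\vartheta^{(1)}_r\,dW_r\,ds\Bigr)^2\Bigr] \le C p^2 n^{-1}.$$

Of course, the similar claim

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\int_{a_l(p)/n}^{b_l(p)/n}\bigl(\sigma_s^4 - \sigma^4_{a_l(p)/n}\bigr)\,ds\Bigr|\Bigr] = 0 \tag{A.16}$$

holds for the same reasons. We have further

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\frac{pk_n}{n}\Bigl(\frac{n}{k_n^2} - \frac{1}{c^2}\Bigr)\sigma^4_{a_l(p)/n}\Bigr|\Bigr] = 0, \tag{A.17}$$

which by boundedness of $\sigma$ amounts to prove $n^{-3/4}(k_n^2 - c^2 n) = o(1)$, and the latter is satisfied by definition of $k_n$. Note that analogues of (A.15), (A.16) and (A.17) are satisfied over the small blocks as well.

The latter claims prove that we are left to show the approximation over the big blocks, which is

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{2k_n}\Bigl(\bigl((A^n_{(i+k_n)/n} - A^n_{i/n}) + (B^n_{(i+k_n)/n} - B^n_{i/n})\bigr)^2 - \bigl((\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}) + (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)^2\Bigr)\Bigr|\Bigr] = 0, \tag{A.18}$$

and the negligibility of the small blocks, that is

$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\;E\Bigl[\Bigl|\sum_{l=1}^{J_n(p)}\Bigl(\sum_{i=b_l(p)}^{a_{l+1}(p)-1}\frac{3}{2k_n}\bigl(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\bigr)^2 - \frac{2k_n}{n}\Bigl[\frac{6n}{k_n^2}\,\sigma^4_{b_l(p)/n} + \beta^2_{b_l(p)/n}\Bigr]\Bigr)\Bigr|\Bigr] = 0 \tag{A.19}$$

to obtain (A.8).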
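The split into big and small blocks used in this step is purely combinatorial and can be made concrete. The following sketch only evaluates the definitions of $a_l(p)$, $b_l(p)$, $c(p)$ and $J_n(p)$ given earlier in the Appendix; the function name `block_indices` and the returned layout are hypothetical choices made for illustration.

```python
import math


def block_indices(n, t, k, p):
    """Return J_n(p), the list of (big block, small block) index ranges, and c(p)."""
    nt = math.floor(n * t)
    J = (nt - 2 * k) // ((p + 2) * k)           # number of block pairs J_n(p)
    blocks = []
    for l in range(1, J + 1):
        a = (l - 1) * (p + 2) * k               # a_l(p): start of the l-th big block
        b = a + p * k                           # b_l(p): start of the l-th small block
        a_next = l * (p + 2) * k                # a_{l+1}(p): start of the next big block
        blocks.append((range(a, b), range(b, a_next)))
    c = J * (p + 2) * k + 1                     # c(p): first index of the remainder
    return J, blocks, c
```

For example, `block_indices(10000, 1.0, 100, 3)` produces big blocks of length $pk_n = 300$ and small blocks of length $2k_n = 200$, so that a fraction $p/(p+2)$ of the indices falls into big blocks; this is where the factor $p/(p+2)$ in $\alpha(p)_s^2$ comes from, and it tends to one as $p \to \infty$.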

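To prove (A.18), the binomial theorem tells us that we can discuss the approximation for $B$, the one for $A$ and the mixed part separately. Using further $x^2 - y^2 = 2y(x-y) + (x-y)^2$ and $xx' - yy' = (x-y)y' + y(x'-y') + (x-y)(x'-y')$, we see from Lemma A.1 and the growth conditions that (A.18) follows from $\lim_{p\to\infty}\limsup_{n\to\infty}\sum_{r=1}^4 E[|L_n^{(r,p)}|] = 0$ with

$$L_n^{(1,p)} = \sqrt{\frac{n}{k_n}}\sum_{l=1}^{J_n(p)}\sum_{i=a_l(p)}^{b_l(p)-1}\frac{1}{k_n}\bigl((B^n_{(i+k_n)/n} - B^n_{i/n}) - (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)\bigl(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n}\bigr), \tag{A.20}$$

$$L_n^{(2,p)} = \sqrt{\frac{n}{k_n}}\sum_{l=1}^{J_n(p)}\sum_{i=a_l(p)}^{b_l(p)-1}\frac{1}{k_n}\bigl((B^n_{(i+k_n)/n} - B^n_{i/n}) - (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)\bigl(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}\bigr), \tag{A.21}$$

$$L_n^{(3,p)} = \sqrt{\frac{n}{k_n}}\sum_{l=1}^{J_n(p)}\sum_{i=a_l(p)}^{b_l(p)-1}\frac{1}{k_n}\bigl((A^n_{(i+k_n)/n} - A^n_{i/n}) - (\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})\bigr)\bigl(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}\bigr), \tag{A.22}$$

$$L_n^{(4,p)} = \sqrt{\frac{n}{k_n}}\sum_{l=1}^{J_n(p)}\sum_{i=a_l(p)}^{b_l(p)-1}\frac{1}{k_n}\bigl((A^n_{(i+k_n)/n} - A^n_{i/n}) - (\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})\bigr)\bigl(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n}\bigr). \tag{A.23}$$

Proofs of these claims can be found in the supplementary material [26].

Finally, to obtain (A.19), we compute the conditional expectation of the approximated increments, and we will do this for the $\tilde A$ and $\tilde B$ terms only. We have

$$E_{a_l(p)}\bigl[(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})^2\bigr] = \frac{n^2}{k_n^2}\,\sigma^4_{a_l(p)/n}\sum_{j=1}^{k_n} E\bigl[\bigl((\Delta^n_{i+k_n+j}W)^2 - (\Delta^n_{i+j}W)^2\bigr)^2\bigr] = \frac{4}{k_n}\,\sigma^4_{a_l(p)/n} \tag{A.24}$$

as well as

$$E_{a_l(p)}\bigl[(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})^2\bigr] = 2\,\frac{n^2}{k_n^2}\,\beta^2_{a_l(p)/n}\int_{i/n}^{(i+k_n)/n}\int_{i/n}^{s} E\bigl[(W_{s+k_n/n} - W_s)(W_{r+k_n/n} - W_r)\bigr]\,dr\,ds$$

$$= 2\,\frac{n^2}{k_n^2}\,\beta^2_{a_l(p)/n}\int_{i/n}^{(i+k_n)/n}\int_{i/n}^{s}\Bigl(r + \frac{k_n}{n} - s\Bigr)\,dr\,ds = \frac{2k_n}{3n}\,\beta^2_{a_l(p)/n}.$$

The expectation of the mixed part is zero. Obviously, we have

$$E_{b_l(p)}\Bigl[\sum_{i=b_l(p)}^{a_{l+1}(p)-1}\frac{3}{2k_n}\bigl((\tilde C^n_{(i+k_n)/n} - \tilde C^n_{i/n}) + (\tilde D^n_{(i+k_n)/n} - \tilde D^n_{i/n})\bigr)^2 - \frac{2k_n}{n}\Bigl[\frac{6n}{k_n^2}\,\sigma^4_{b_l(p)/n} + \beta^2_{b_l(p)/n}\Bigr]\Bigr] = 0$$

as well. (A.19) then follows from the fact that

$$\frac{n}{k_n}\sum_{l=1}^{J_n(p)} E\Bigl[\Bigl(\sum_{i=b_l(p)}^{a_{l+1}(p)-1}\frac{3}{2k_n}\bigl((\tilde C^n_{(i+k_n)/n} - \tilde C^n_{i/n}) + (\tilde D^n_{(i+k_n)/n} - \tilde D^n_{i/n})\bigr)^2 - \frac{2k_n}{n}\Bigl[\frac{6n}{k_n^2}\,\sigma^4_{b_l(p)/n} + \beta^2_{b_l(p)/n}\Bigr]\Bigr)^2\Bigr]$$

is bounded by a constant times $p^{-1}$, using Lemma A.1.

A.2. Proof of (A.9)

Let us check the conditions for stable convergence in this step, where particularly the proof of (A.10) is tedious. Write $U_l^{n,p} = \sum_{s=1}^3 U_l^{n,p,s}$ with

$$U_l^{n,p,1} = \sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{2k_n}\Bigl((\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})^2 - \frac{4}{k_n}\,\sigma^4_{a_l(p)/n}\Bigr),$$

$$U_l^{n,p,2} = \sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{2k_n}\Bigl((\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})^2 - \frac{2k_n}{3n}\,\beta^2_{a_l(p)/n}\Bigr),$$

$$U_l^{n,p,3} = \sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{k_n}\bigl(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}\bigr)\bigl(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n}\bigr).$$

We have seen in the final step above that these terms are indeed martingale differences, and it turns out that only the $(U_l^{n,p,s})^2$ terms are responsible for the conditional variance, whereas the remaining mixed ones are of small order each. To summarize, the following lemma holds which is proven in the supplementary material [26].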

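Lemma A.2. We have

$$E_{a_l(p)}\bigl[(U_l^{n,p,1})^2\bigr] = \sigma^8_{a_l(p)/n}\,\frac{48p + d_1}{k_n^2} + O_P(pn^{-3/2}),$$

$$E_{a_l(p)}\bigl[(U_l^{n,p,2})^2\bigr] = \beta^4_{a_l(p)/n}\Bigl(\frac{151}{70}\,p + d_2\Bigr)\frac{k_n^2}{n^2} + O_P(pn^{-3/2}),$$

$$E_{a_l(p)}\bigl[(U_l^{n,p,3})^2\bigr] = \sigma^4_{a_l(p)/n}\,\beta^2_{a_l(p)/n}\,\frac{12p + d_3}{n} + O_P(pn^{-3/2}),$$

for certain unspecified constants $d_m$, $m = 1,2,3$, as well as

$$E_{a_l(p)}\bigl[U_l^{n,p,r}\,U_l^{n,p,s}\bigr] = O_P(pn^{-3/2})$$

for each $r \ne s$.

We use Lemma A.2 to obtain

$$\frac{n}{k_n}\sum_{l=1}^{J_n(p)} E_{a_l(p)}\bigl[(U_l^{n,p})^2\bigr] = \frac{pk_n}{n}\sum_{l=1}^{J_n(p)}\Bigl(\frac{n^2}{k_n^4}\Bigl(48 + \frac{d_1}{p}\Bigr)\sigma^8_{a_l(p)/n} + \frac{n}{k_n^2}\Bigl(12 + \frac{d_2}{p}\Bigr)\sigma^4_{a_l(p)/n}\,\beta^2_{a_l(p)/n} + \Bigl(\frac{151}{70} + \frac{d_3}{p}\Bigr)\beta^4_{a_l(p)/n}\Bigr) + O_P(n^{-1/2}),$$

thus (A.10) holds using $k_n \sim cn^{1/2}$. Simpler to obtain is (A.11), as Lemma A.1 gives

$$\frac{n^2}{k_n^2}\sum_{l=1}^{J_n(p)} E_{a_l(p)}\bigl[(U_l^{n,p})^4\bigr] \le C\,\frac{n^3}{pk_n^3}\,p^4 n^{-2},$$

which converges to zero in the usual sense. Finally, one can prove

$$E_{a_l(p)}\Bigl[\sum_{i=a_l(p)}^{b_l(p)-1}\frac{3}{2k_n}\bigl((\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}) + (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)^2\,\bigl(N_{a_{l+1}(p)/n} - N_{a_l(p)/n}\bigr)\Bigr] = 0, \tag{A.25}$$

where $N$ is either $W$ or $W'$ or when $N$ is a bounded martingale, orthogonal to $(W, W')$. Focus on the first case and decompose $\bigl((\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}) + (\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})\bigr)^2$ via the binomial theorem. For the pure $\tilde A$ and the pure $\tilde B$ term, the claim follows immediately from properties of the normal distribution upon using that $\sigma_{a_l(p)/n}$ or $\beta_{a_l(p)/n}$ are $\mathcal F_{a_l(p)/n}$-measurable. For the mixed term, one has to use the special form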

of $\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n}$ as a difference of two sums, and a symmetry argument proves (A.25) in this case. For an orthogonal $N$, we use standard calculus. By Itô formula, both $(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})^2$ and $(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})^2$ are a measurable variable times the sum of a constant and a stochastic integral with respect to $W$ and $W'$, respectively. Thus (A.25) holds. In the mixed case, we use integration by parts formula to reduce $(\tilde A^n_{(i+k_n)/n} - \tilde A^n_{i/n})(\tilde B^n_{(i+k_n)/n} - \tilde B^n_{i/n})$ to the sum of a constant, a $dW$- and a $dW'$-integral. Then the same argument applies. Altogether, this gives (A.12).

Acknowledgements

The author is grateful for financial support through the collaborative research center "Statistical modeling of nonlinear dynamic processes" (SFB 823) of the German Research Foundation (DFG). Special thanks go to two anonymous referees for their valuable comments on earlier versions of this paper.

Supplementary Material

Additional proofs for claims made in the article (DOI: 10.3150/14-BEJ648SUPP; .pdf). We provide several proofs for either theorems from the main corpus or additional steps discussed in the Appendix.

References

[1] Aït-Sahalia, Y. and Kimmel, R. (2007). Maximum likelihood estimation of stochastic volatility models. J. Financial Economics 134 507–551.
[2] Alvarez, A., Panloup, F., Pontier, M. and Savy, N. (2012). Estimation of the instantaneous volatility. Stat. Inference Stoch. Process. 15 27–59. MR2892587
[3] Bandi, F. and Renò, R. (2008). Nonparametric stochastic volatility. Technical report.
[4] Barndorff-Nielsen, O. and Veraart, A. (2009). Stochastic volatility of volatility in continuous time. Technical report.
[5] Barndorff-Nielsen, O.E., Graversen, S.E., Jacod, J., Podolskij, M. and Shephard, N. (2006). A central limit theorem for realised power and bipower variations of continuous semimartingales. In From Stochastic Calculus to Mathematical Finance 33–68. Berlin: Springer. MR2233534
[6] Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2011). Multivariate realised kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. J. Econometrics 162 149–169. MR279561
[7] Bollerslev, T. and Zhou, H. (2002). Estimating stochastic volatility diffusion using conditional moments of integrated volatility. J. Econometrics 109 33–65. MR1899692
[8] Chernov, M. and Ghysels, E. (2000). Estimation of stochastic volatility models for the purpose of option pricing. In Computational Finance 1999 (Y. Abu-Mostafa, B. LeBaron, A. Lo and A. Weigend, eds.) 567–581. Cambridge: MIT Press.
[9] Comte, F., Genon-Catalot, V. and Rozenholc, Y. (2010). Nonparametric estimation for a stochastic volatility model. Finance Stoch. 14 49–80. MR256325
[10] Dette, H. and Podolskij, M. (2008). Testing the parametric form of the volatility in continuous time diffusion models - A stochastic process approach. J. Econometrics 143 56–73. MR2384433
[11] Dette, H., Podolskij, M. and Vetter, M. (2006). Estimation of integrated volatility in continuous-time financial models with applications to goodness-of-fit testing. Scand. J. Stat. 33 259–278. MR2279642
[12] Genon-Catalot, V., Jeantheau, T. and Laredo, C. (1999). Parameter estimation for discretely observed stochastic volatility models. Bernoulli 5 855–872. MR1715442
[13] Gloter, A. (2007). Efficient estimation of drift parameters in stochastic volatility models. Finance Stoch. 11 495–519. MR2335831
[14] Heston, S. (1993). A closed-form solution for options with stochastic volatility with applications to bonds and currency options. Rev. Financial Studies 6 327–343.
[15] Hoffmann, M. (2002). Rate of convergence for parametric estimation in a stochastic volatility model. Stochastic Process. Appl. 97 147–170. MR187964
[16] Jacod, J. (1997). On continuous conditional Gaussian martingales and stable convergence in law. In Séminaire de Probabilités XXXI. Lecture Notes in Math. 1655 232–246. Berlin: Springer. MR1478732
[17] Jacod, J. (2008). Asymptotic properties of realized power variations and related functionals of semimartingales. Stochastic Process. Appl. 118 517–559. MR2394762
[18] Jacod, J. and Protter, P. (2012). Discretization of Processes. Stochastic Modelling and Applied Probability 67. Heidelberg: Springer. MR285996
[19] Jacod, J. and Rosenbaum, M. (2013). Quarticity and other functionals of volatility: Efficient estimation. Ann. Statist. 41 1462–1484. MR3113818
[20] Jacod, J. and Shiryaev, A.N. (2003). Limit Theorems for Stochastic Processes, 2nd ed. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 288. Berlin: Springer. MR1943877
[21] Jones, C.S. (2003). The dynamics of stochastic volatility: Evidence from underlying and options markets. J. Econometrics 116 181–224. Frontiers of financial econometrics and financial engineering. MR22525
[22] Mancini, C. (2009). Non-parametric threshold estimation for models with stochastic diffusion coefficient and jumps. Scand. J. Stat. 36 270–296. MR2528985
[23] Podolskij, M. and Vetter, M. (2010). Understanding limit theorems for semimartingales: A short survey. Stat. Neerl. 64 329–351. MR2683464
[24] Renò, R. (2006). Nonparametric estimation of stochastic volatility models. Econom. Lett. 90 390–395. MR2212176
[25] Vetter, M. (2012). Estimation of correlation for continuous semimartingales. Scand. J. Stat. 39 757–771. MR3847
[26] Vetter, M. (2014). Supplement to "Estimation of integrated volatility of volatility with applications to goodness-of-fit testing." DOI:10.3150/14-BEJ648SUPP.
[27] Vetter, M. and Dette, H. (2012). Model checks for the volatility under microstructure noise. Bernoulli 18 1421–1447. MR299583
[28] Wang, C.D. and Mykland, P.A. (2014). The estimation of leverage effect with high-frequency data. J. Amer. Statist. Assoc. 109 197–215. MR318557

Received July 2012 and revised March 2014