Optimally Thresholded Realized Power Variations for Lévy Jump Diffusion Models
José E. Figueroa-López, Department of Statistics, Purdue University
University of Missouri-Kansas City, Department of Mathematics and Statistics, April 5th, 2012
(Joint work with Jeff Nisen)

Outline
1. The Statistical Problems and the Main Estimators
2. Optimally Thresholded Power Estimators
3. Main Results
4. Extensions
5. Conclusions

Set-up
1. Continuous-time stochastic process $t \mapsto X_t$ with dynamics
   \[ dX_t = \gamma_t\,dt + \sigma_t\,dW_t + dJ_t, \]
   where
   - $t \mapsto W_t$ is a standard Brownian motion;
   - $t \mapsto J_t := \sum_{j=1}^{N_t} \zeta_j$ is a piecewise constant process of finite jump activity;
   - $t \mapsto \gamma_t$ and $t \mapsto \sigma_t$ are adapted processes.
2. Finite-jump activity Lévy model:
   \[ X_t = \gamma t + \sigma W_t + \sum_{j=1}^{N_t} \zeta_j, \]
   where $\{N_t\}_{t \ge 0}$ is a homogeneous Poisson process with jump intensity $\lambda$, $\{\zeta_j\}_{j \ge 1}$ are i.i.d. with density $f_\zeta : \mathbb{R} \to \mathbb{R}_+$, and $\{W_t\}$, $\{N_t\}$, $\{\zeta_j\}$ are mutually independent.

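A minimal simulation sketch of the finite-jump activity Lévy model on the regular grid $t_i = i/n$ may help fix ideas. The function names, the `jump_sampler` argument, and the choice of Knuth's method for the Poisson draws are illustrative assumptions, not part of the talk.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (adequate for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_increments(n, T, gamma, sigma, lam, jump_sampler, seed=0):
    """Return (dX, dN): increments of X and of N over the grid t_i = i/n, i = 1..n*T.

    jump_sampler(rng) is a hypothetical callback returning one jump size zeta_j.
    """
    rng = random.Random(seed)
    h = 1.0 / n
    dX, dN = [], []
    for _ in range(int(n * T)):
        k = poisson(rng, lam * h)                         # number of jumps in the interval
        jumps = sum(jump_sampler(rng) for _ in range(k))  # compound Poisson part
        dX.append(gamma * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) + jumps)
        dN.append(k)
    return dX, dN

# Example call with Gaussian (Merton-type) jumps; parameter values are illustrative.
dX, dN = simulate_increments(n=252, T=4, gamma=0.0, sigma=0.3, lam=5.0,
                             jump_sampler=lambda rng: rng.gauss(0.0, 0.6))
```

Returning the Poisson increments alongside the increments of $X$ is a design choice for simulation studies only: with real data the jump indicator is unobserved.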
Statistical Problems
Given a discrete record of observations, $X_{t_0}, X_{t_1}, \dots, X_{t_n}$, from the process during a fixed finite time horizon $[0, T]$, the following problems are of interest:
1. Estimating the integrated variance (or quadratic variation of the continuous part):
   \[ \sigma_T^2 = \int_0^T \sigma_t^2\,dt. \]
2. Estimating the jump features of the process:
   - Jump times $\tau_1 < \tau_2 < \dots < \tau_{N_T}$
   - Jump sizes $\zeta_1, \zeta_2, \dots, \zeta_{N_T}$
3. Jump detection during a given time interval $[s, t] \subset [0, T]$

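Two main classes of estimators
Precursor. Realized Quadratic Variation:
\[ QV(X)_\pi := \sum_{i=0}^{n-1} \left( X_{t_{i+1}} - X_{t_i} \right)^2, \qquad (\pi : 0 = t_0 < \dots < t_n = T). \]
Under very general conditions:
\[ QV(X)_\pi \xrightarrow{\ \mathrm{mesh}(\pi) \to 0\ } \sigma_T^2 + \sum_{j=1}^{N_T} \zeta_j^2. \]
1. Multipower Realized Variations (Barndorff-Nielsen and Shephard (2004)):
   \[ BPV(X)_\pi := \sum_{i=0}^{n-2} \left| X_{t_{i+1}} - X_{t_i} \right| \left| X_{t_{i+2}} - X_{t_{i+1}} \right|, \]
   \[ MPV(X)^{(r_1, \dots, r_k)}_\pi := \sum_{i=0}^{n-k} \left| X_{t_{i+1}} - X_{t_i} \right|^{r_1} \cdots \left| X_{t_{i+k}} - X_{t_{i+k-1}} \right|^{r_k}. \]
2. Threshold Realized Variations (Mancini (2003)):
   \[ TRV(X)[B]_\pi := \sum_{i=0}^{n-1} \left( X_{t_{i+1}} - X_{t_i} \right)^2 \mathbf{1}_{\{ |X_{t_{i+1}} - X_{t_i}| \le B \}}, \qquad (B \in (0, \infty)). \]
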
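The estimators above translate almost line by line into code. The following is a small sketch acting on a list `dX` of increments $X_{t_{i+1}} - X_{t_i}$. The function names are illustrative, and the sums are unscaled, exactly as written on the slide.

```python
def realized_qv(dX):
    """Realized quadratic variation: sum of squared increments."""
    return sum(d * d for d in dX)

def bipower(dX):
    """Bipower variation: sum of products of adjacent absolute increments."""
    return sum(abs(a) * abs(b) for a, b in zip(dX, dX[1:]))

def multipower(dX, r):
    """Multipower variation with powers r = (r_1, ..., r_k) on k adjacent increments."""
    k = len(r)
    total = 0.0
    for i in range(len(dX) - k + 1):
        term = 1.0
        for j, rj in enumerate(r):
            term *= abs(dX[i + j]) ** rj
        total += term
    return total

def threshold_rv(dX, B):
    """Threshold realized variation: keep only increments with |dX_i| <= B."""
    return sum(d * d for d in dX if abs(d) <= B)

# Toy record with one large increment (0.5) among small ones.
toy = [0.01, -0.02, 0.5, 0.01]
qv, trv = realized_qv(toy), threshold_rv(toy, 0.1)
```

On the toy record the large increment dominates `qv`, while `trv` with $B = 0.1$ discards it, which is the mechanism the threshold estimator relies on.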
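Advantages and Drawbacks
1. Multipower Realized Variations (MPV)
   - Easy to implement;
   - Exhibit "high" bias in the presence of jumps:
     \[ E\left[ MPV(X)^{(r_1, \dots, r_k)}_\pi \right] - C_r\, \sigma_T^2 \sim c_r\, n^{-\left(1 - \frac{1}{2} \max_i r_i\right)}, \quad \text{with } C_r = \prod_{i=1}^{k} E|Z_i|^{r_i}, \quad r_1 + \dots + r_k = 2. \]
2. Threshold Realized Variations (TRV):
   - Can be adapted for estimating other jump features:
     \[ \hat{N}[B]_\pi := \sum_{i=0}^{n-1} \mathbf{1}_{\{ |X_{t_{i+1}} - X_{t_i}| > B \}}, \qquad \hat{J}[B]_\pi := \sum_{i=0}^{n-1} \left( X_{t_{i+1}} - X_{t_i} \right) \mathbf{1}_{\{ |X_{t_{i+1}} - X_{t_i}| > B \}}. \]
   - Its performance strongly depends on a "good" choice of the threshold level $B$; e.g., given a sequence $\pi_n$ of sampling schemes with $\mathrm{mesh}(\pi_n) \to 0$,
     \[ TRV(X)[B(\pi_n)]_{\pi_n} \xrightarrow{\ L^2\ } \sigma_T^2 \quad \text{when} \quad B(\pi_n) \to 0 \ \text{ and } \ \frac{B(\pi_n)}{\sqrt{\mathrm{mesh}(\pi_n)}} \to \infty. \]
   - Ad-hoc thresholds proposed in the literature:
     \[ B(\pi) := \alpha\, \mathrm{mesh}(\pi)^{\omega}, \qquad B(\pi) := \alpha\, \mathrm{mesh}(\pi)^{1/2}\, \Phi^{-1}\left( 1 - \beta\, \mathrm{mesh}(\pi) \right). \]
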
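As a small illustration of how the threshold yields the other jump features, the sketch below computes the jump-count and jump-sum estimators for a given threshold, together with a power-type ad-hoc threshold. The function names and the toy numbers are illustrative assumptions.

```python
def count_jumps(dX, B):
    """Estimated number of jumps: increments whose absolute size exceeds B."""
    return sum(1 for d in dX if abs(d) > B)

def jump_sum(dX, B):
    """Estimated jump part: sum of the increments flagged as jumps."""
    return sum(d for d in dX if abs(d) > B)

def power_threshold(alpha, mesh, omega):
    """Power-type ad-hoc threshold alpha * mesh**omega."""
    return alpha * mesh ** omega

# Toy record: two increments (0.5 and -0.3) are large relative to the rest.
toy = [0.01, -0.02, 0.5, 0.01, -0.3]
B = power_threshold(alpha=1.0, mesh=0.01, omega=0.5)
n_hat, j_hat = count_jumps(toy, B), jump_sum(toy, B)
```

The output depends entirely on the tuning constants `alpha` and `omega`, which is the sensitivity to the threshold that motivates an objective selection rule.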
Numerical illustration
[Box plots. Title: Merton Model: Diffusion Volatility Parameter (DVP) Estimates, with the actual DVP marked. Estimators shown: RMPV(1, 1), RMPV(2/3, 2/3, 2/3), RMPV(1/2, ..., 1/2), RMPV(2/5, ..., 2/5), RMPV(1/3, ..., 1/3), MinRV(2), MedRV(2), TBPV(Pow(0.05)), TBPV(Pow(0.15)), TBPV(Pow(0.25)), TBPV(Pow(0.35)), TBPV(Pow(0.45)), TBPV(Pow(0.495)), TBPV(B opt), TBPV(BF(0.05)), TBPV(BH(0.05)). Legend groups: Multi Power Variation Style Estimators, Thresholded Multi Power Style Estimators, Multiple Testing Style Estimators.]
Figure: Box plots of MC numerical experiments (2500 simulations) based on T = 1 year, 5-minute sample observations. Parameters: $\sigma = 0.3$, $\lambda = 20$, $f_\zeta \sim N(-0.1, 0.1^2)$.

Optimal Threshold Realized Estimators
1. Aims
   - Develop a well-posed optimal selection criterion for the threshold $B$ that minimizes a suitable statistical loss function of estimation.
   - Characterize the optimal threshold $B^{*}$ asymptotically when $\mathrm{mesh}(\pi) \to 0$.
   - Develop a feasible implementation method for the optimal threshold sequence.
2. Assumptions
   - Finite activity Lévy model: $X_t = \gamma t + \sigma W_t + \sum_{j=1}^{N_t} \zeta_j$, with $\zeta_j \overset{\text{i.i.d.}}{\sim} f_\zeta$.
   - Regular sampling scheme with mesh $h_n := 1/n$; i.e., $\pi : t_i = i/n$.
   - The jump density function $f_\zeta$ takes the mixture form
     \[ f_\zeta(x) = p f_+(x) \mathbf{1}_{\{x \ge 0\}} + q f_-(-x) \mathbf{1}_{\{x < 0\}}, \]
     with $p + q = 1$ and $f_\pm : [0, \infty) \to \mathbb{R}_+$, $f_\pm \in C_b^1(0, \infty)$.
3. Notation
   - $C(f_\zeta) := p f_+(0) + q f_-(0)$.
   - $\Phi$ and $\phi$ are the cdf and pdf of a standard Normal variable, respectively.
   - $\Delta_i^n X := \Delta_i X := X_{t_i} - X_{t_{i-1}}$, $\Delta_i^n N := \Delta_i N := N_{t_i} - N_{t_{i-1}}$.

Loss Functions
1. Natural Loss Function
   \[ \mathrm{Loss}^{(1)}_n(B) := E\left[ \left| TRV(X)[B]_n - T\sigma^2 \right|^2 \right] + E\left[ \left| \hat{N}[B]_n - N_T \right|^2 \right]. \]
2. Alternative Loss Function
   \[ \mathrm{Loss}^{(2)}_n(B) := E \sum_{i=1}^{nT} \left( \mathbf{1}_{[\, |\Delta_i^n X| > B,\ \Delta_i^n N = 0 \,]} + \mathbf{1}_{[\, |\Delta_i^n X| \le B,\ \Delta_i^n N \ne 0 \,]} \right). \]
3. Interpretation
   - $\mathrm{Loss}^{(1)}_n(B)$ favors sequences that minimize the estimation errors of both the continuous and the jump component.
   - $\mathrm{Loss}^{(2)}_n(B)$ favors sequences that minimize the total number of misclassifications: flagging a jump when there is no jump, and failing to flag a jump when there is one.
   - $\mathrm{Loss}^{(2)}_n(B)$ is much more tractable than $\mathrm{Loss}^{(1)}_n(B)$.

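The random quantity inside the expectation of $\mathrm{Loss}^{(2)}_n$ is easy to evaluate on simulated data, where the Poisson increments are known. A minimal sketch follows; the function name and the toy values are illustrative.

```python
def misclassification_count(dX, dN, B):
    """Return (false_positives, false_negatives) for threshold B.

    A false positive flags a jump (|dX_i| > B) on an interval without jumps;
    a false negative fails to flag (|dX_i| <= B) an interval that contains a jump.
    """
    false_pos = sum(1 for d, k in zip(dX, dN) if abs(d) > B and k == 0)
    false_neg = sum(1 for d, k in zip(dX, dN) if abs(d) <= B and k != 0)
    return false_pos, false_neg

# Toy example: the second increment is a large diffusive move, the third a small jump.
fp, fn = misclassification_count([0.01, 0.2, 0.05, -0.4], [0, 0, 1, 1], B=0.1)
```

Averaging `fp + fn` over many simulated paths gives a Monte Carlo estimate of $\mathrm{Loss}^{(2)}_n(B)$.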
Asymptotic Comparison of Loss Functions
Theorem (FL & Nisen (2013)). Given a threshold sequence $(B_n)_n$ satisfying $B_n \to 0$ and $B_n \sqrt{n} \to \infty$, there exists a positive sequence $(C_n)_n$, with $\lim_{n \to \infty} C_n = 0$, such that
\[ \mathrm{Loss}^{(2)}_n(B) + R_n(B) \le \mathrm{Loss}^{(1)}_n(B) \le (1 + C_n(B))\, \mathrm{Loss}^{(2)}_n(B) + R_n(B) + \widetilde{R}_n(B), \]
where, as $n \to \infty$, R n (B) T ( ( ) 2 λ2 2σ n 2n + T 2 nbn φ λb n C(f )), B n σ [ ] 6T σ 4 R n (B) + 3B 6 n nt 2 λ 2 C(f ) 2.
Furthermore,
\[ \lim_{n \to \infty} \frac{\inf_{B > 0} \mathrm{Loss}^{(1)}_n(B)}{\inf_{B > 0} \mathrm{Loss}^{(2)}_n(B)} = 1. \qquad (1) \]

Well-posedness and asymptotic characterization
Theorem (FL & Nisen (2013)). There exists an $N \in \mathbb{N}$ such that, for all $n \ge N$, the loss function $\mathrm{Loss}^{(2)}_n(B)$ is quasi-convex and possesses a unique global minimum $B_n^{*}$:
\[ B_n^{*} := \arg\inf_{B > 0} \mathrm{Loss}^{(2)}_n(B). \]
Furthermore, the optimal threshold sequence $(B_n^{*})_n$ is such that
\[ B_n^{*} = \sqrt{\frac{3 \sigma^2 \ln(n)}{n}} + o\left( \sqrt{\ln(n)/n} \right), \qquad (n \to \infty). \]

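The statement can be checked numerically in a special case. The sketch below assumes zero drift and Gaussian jumps $\zeta \sim N(0, \delta^2)$ (a Merton-type choice made here only for illustration), evaluates the per-increment misclassification probability with the Poisson mixture truncated at `kmax` jumps, and locates its minimizer on a grid around the leading term. All names and parameter values are assumptions.

```python
import math

def loss2_per_increment(B, n, sigma, lam, delta, kmax=10):
    """P(|dX| > B, dN = 0) + P(|dX| <= B, dN != 0) for one interval of length 1/n."""
    h = 1.0 / n
    p0 = math.exp(-lam * h)                                   # P(dN = 0)
    # 2 * (1 - Phi(x)) equals erfc(x / sqrt(2)); erfc keeps tail accuracy.
    false_pos = p0 * math.erfc(B / (sigma * math.sqrt(2.0 * h)))
    false_neg, pk = 0.0, p0
    for k in range(1, kmax + 1):
        pk *= lam * h / k                                     # P(dN = k)
        sd = math.sqrt(sigma ** 2 * h + k * delta ** 2)       # sd of dX given k jumps
        false_neg += pk * math.erf(B / (sd * math.sqrt(2.0))) # 2 * Phi(B / sd) - 1
    return false_pos + false_neg

def grid_minimizer(n, sigma, lam, delta, points=4000):
    """Grid search on [0.2, 3] times the leading term sqrt(3 sigma^2 ln(n) / n)."""
    lead = math.sqrt(3.0 * sigma ** 2 * math.log(n) / n)
    grid = [lead * (0.2 + 2.8 * j / points) for j in range(points + 1)]
    best = min(grid, key=lambda B: loss2_per_increment(B, n, sigma, lam, delta))
    return best, lead

# Roughly 5-minute sampling over one year (252 days x 78 intervals); illustrative values.
B_star, lead = grid_minimizer(n=19656, sigma=0.3, lam=5.0, delta=0.6)
```

With regular sampling every interval has the same misclassification probability, so minimizing the per-increment quantity is equivalent to minimizing $\mathrm{Loss}^{(2)}_n$ itself. The grid search is a crude stand-in for a proper one-dimensional optimizer.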
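Remarks
1. The leading term of the optimal sequence is proportional to the Lévy modulus of continuity of Brownian motion:
   \[ \limsup_{h \to 0} \frac{1}{\sqrt{2 h \ln(1/h)}} \sup_{|t - s| < h,\ s, t \in [0, 1]} |W_t - W_s| = 1, \quad \text{a.s.} \]
2. The leading order sequence
   \[ B_n^{*,1} := \sqrt{\frac{3 \sigma^2 \ln(n)}{n}} \]
   provides a "blueprint" to look for a suitable threshold sequence.
3. Merton Model: $\zeta \sim N(0, \delta^2)$. With $\alpha_n^2 := \frac{\sigma^2}{n} + \delta^2$,
   \[ B_n^{*} = \left( \frac{3 \sigma^2 \ln(n)}{n} - \frac{2 \sigma^2}{n} \ln\left( \frac{\sigma \lambda}{\alpha_n} \right) + \frac{3 \sigma^4 \ln(n)}{n^2 \delta^2} - \frac{2 \sigma^4 \ln(\sigma \lambda / \alpha_n)}{n^2 \delta^2} \right)^{1/2}. \]
4. The performance of the leading term (compared to the optimal threshold) will depend on the quantity $\frac{\sigma \lambda}{\delta}$.
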
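The Merton expression can be evaluated directly and compared with the leading term. The sketch below also checks it against a first-order condition: assuming zero drift and at most one jump per sampling interval (a simplification made here, not stated on the slide), setting the derivative of the per-increment misclassification probability to zero gives $\phi_{\sigma/\sqrt{n}}(B) = \frac{\lambda}{n}\, \phi_{\alpha_n}(B)$, where $\phi_s$ is the $N(0, s^2)$ density. Function names and parameter values are illustrative.

```python
import math

def leading_threshold(sigma, n):
    """Leading-order term sqrt(3 sigma^2 ln(n) / n)."""
    return math.sqrt(3.0 * sigma ** 2 * math.log(n) / n)

def merton_threshold(sigma, lam, delta, n):
    """Threshold for Gaussian jumps N(0, delta^2), as given in Remark 3."""
    alpha = math.sqrt(sigma ** 2 / n + delta ** 2)
    log_term = math.log(sigma * lam / alpha)
    b2 = (3.0 * sigma ** 2 * math.log(n) / n
          - 2.0 * sigma ** 2 * log_term / n
          + 3.0 * sigma ** 4 * math.log(n) / (n ** 2 * delta ** 2)
          - 2.0 * sigma ** 4 * log_term / (n ** 2 * delta ** 2))
    return math.sqrt(b2)

def normal_pdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

# Daily sampling with the parameters of the first Merton experiment in the talk.
sigma, lam, delta, n = 0.3, 5.0, 0.6, 252
B = merton_threshold(sigma, lam, delta, n)
lhs = normal_pdf(B, sigma / math.sqrt(n))                                  # no-jump density
rhs = (lam / n) * normal_pdf(B, math.sqrt(sigma ** 2 / n + delta ** 2))    # one-jump density
ratio_to_leading = B / leading_threshold(sigma, n)
```

Here `lhs` and `rhs` agree up to rounding, and `ratio_to_leading` shows how far the leading term sits from the Merton threshold at this sampling frequency.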
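A Feasible Iterative Algorithm to Find $B_n^{*}$
1. Key Issue: The optimal threshold $B^{*}$ would allow us to find an optimal estimate $\hat{\sigma}^2$ for $\sigma^2$ of the form
   \[ \hat{\sigma}^2 := \frac{1}{T}\, TRV(X)[B^{*}(\sigma^2)]_n, \]
   but $B^{*}$ depends precisely on $\sigma^2$.
2. The previous issue suggests a fixed-point type of implementation:
   Set $\hat{\sigma}^2_{n,0} := \frac{1}{T} \sum_{i=1}^{nT} |X_{t_i} - X_{t_{i-1}}|^2$ and $\hat{B}^{*}_{n,0} := \left( \frac{3 \hat{\sigma}^2_{n,0} \ln(n)}{n} \right)^{1/2}$
   while $\hat{\sigma}^2_{n,k-1} > \hat{\sigma}^2_{n,k}$ do
      $\hat{\sigma}^2_{n,k+1} \leftarrow \frac{1}{T}\, TRV(X)[\hat{B}^{*}_{n,k}]_n$ and $\hat{B}^{*}_{n,k+1} \leftarrow \left( \frac{3 \hat{\sigma}^2_{n,k+1} \ln(n)}{n} \right)^{1/2}$
   end while
   Let $k_n^{*} := \inf\{ k \ge 1 : \hat{\sigma}^2_{n,k+1} = \hat{\sigma}^2_{n,k} \}$ and take $\hat{\sigma}^2_{n,k_n^{*}}$ as the final estimate for $\sigma^2$ and the corresponding $\hat{B}^{*}_{n,k_n^{*}}$ as an estimate for $B_n^{*}$.
3. The previous algorithm generates a non-increasing sequence of estimators $\{ \hat{\sigma}^2_{n,k} \}_k$ and finishes in finite time.
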
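The fixed-point scheme can be sketched in a few lines. The version below works on a list `dX` of increments sampled with mesh $1/n$ over $[0, T]$. The function name and the `max_iter` safeguard are additions for illustration.

```python
import math

def iterative_threshold(dX, n, T, max_iter=100):
    """Iterate sigma^2 -> threshold -> thresholded sigma^2 until the estimate stops decreasing.

    Returns (sigma2_hat, B_hat). The estimates are non-increasing because each new
    threshold is no larger than the previous one, so fewer increments are retained.
    """
    scale = 3.0 * math.log(n) / n
    sigma2 = sum(d * d for d in dX) / T               # start from realized variance
    B = math.sqrt(sigma2 * scale)
    for _ in range(max_iter):
        new_sigma2 = sum(d * d for d in dX if abs(d) <= B) / T
        if new_sigma2 >= sigma2:                      # fixed point reached
            break
        sigma2 = new_sigma2
        B = math.sqrt(sigma2 * scale)
    return sigma2, B

# Toy record: 99 small increments of size 0.01 and one jump of size 0.5, with n = 100, T = 1.
toy = [0.01 if i % 2 == 0 else -0.01 for i in range(99)] + [0.5]
sigma2_hat, B_hat = iterative_threshold(toy, n=100, T=1.0)
```

On the toy record the first pass removes the jump and the second pass changes nothing, so the loop stops after two evaluations of the thresholded sum.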
A numerical illustration
Merton Model: 4-year / 1-day; $\sigma = 0.3$, $\lambda = 5$, $\mu = 0$, $\delta = 0.6$

Method                          mean TRV   S_TRV    mean Loss   S_Loss
$\hat{B}^{*}_{n,k_n^{*}}$       0.2985     0.0070   2.0588      1.4267
Pow                             0.2967     0.0066   2.2992      1.4972
BF                              0.2983     0.0071   2.1756      1.4749

Table: Finite-sample performance of the threshold realized variation (TRV) estimators based on K = 5,000 sample paths for the Merton model $\zeta_i \overset{\text{i.i.d.}}{\sim} N(\mu, \delta^2)$. Loss represents the total number of Jump Misclassification Errors, while mean TRV, mean Loss, S_TRV, and S_Loss denote the corresponding sample means and standard deviations, respectively.

A numerical illustration (S2)
Kou Model: 1-week / 5-minute; $\sigma = 0.5$, $\lambda = 50$, $p = 0.45$, $\alpha_+ = 0.05$, $\alpha_- = 0.1$

Method                          mean TRV   S_TRV    mean Loss   S_Loss
$\hat{B}^{*}_{n,k_n^{*}}$       0.5004     0.0186   0.2232      0.4706
Pow                             0.4407     0.0142   13.5302     3.6392
BF                              0.4917     0.0193   1.180       1.0775

Table: Finite-sample performance of the threshold realized variation (TRV) estimators based on K = 5,000 sample paths for the Kou model:
\[ f_{Kou}(x) = \frac{p}{\alpha_+} e^{-x/\alpha_+} \mathbf{1}_{[x \ge 0]} + \frac{1 - p}{\alpha_-} e^{-|x|/\alpha_-} \mathbf{1}_{[x < 0]}. \]
Loss represents the total number of Jump Misclassification Errors, while mean TRV, mean Loss, S_TRV, and S_Loss denote the corresponding sample means and standard deviations, respectively.

A numerical illustration (S3)
Kou Model: 1-year / 5-minute; $\sigma = 0.4$, $\lambda = 1000$, $p = 0.5$, $\alpha_+ = \alpha_- = 0.1$

Method                          mean TRV   S_TRV    mean Loss   S_Loss
$\hat{B}^{*}_{n,k_n^{*}}$       0.4039     0.0028   139.6776    12.2193
Pow                             0.3767     0.0019   230.0170    15.0308
BF                              0.6495     0.0315   375.5850    24.3999

Table: Finite-sample performance of the threshold realized variation (TRV) estimators based on K = 5,000 sample paths for the Kou model:
\[ f_\zeta(x) = \frac{p}{\alpha_+} e^{-x/\alpha_+} \mathbf{1}_{[x \ge 0]} + \frac{q}{\alpha_-} e^{-|x|/\alpha_-} \mathbf{1}_{[x < 0]}. \]
Loss represents the total number of Jump Misclassification Errors, while mean TRV, mean Loss, S_TRV, and S_Loss denote the corresponding sample means and standard deviations, respectively.

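Additive Processes
1. The model
   \[ X_s := \int_0^s \gamma(u)\,du + \int_0^s \sigma(u)\,dW_u + \sum_{j=1}^{N_s} \zeta_j =: X_s^c + J_s, \]
   where $(N_s)_{s \ge 0} \sim \mathrm{Poiss}(\{\lambda(s)\}_{s \ge 0})$ is independent of $W$, and $\sigma, \lambda : [0, \infty) \to \mathbb{R}_+$ and $\gamma : [0, \infty) \to \mathbb{R}$ are deterministic smooth functions with $\sigma$ and $\lambda$ bounded away from 0.
2. Optimal Threshold Problem
   Given a sampling scheme $\pi : t_0 < \dots < t_n = T$, determine the vector $B^{\pi,*} = (B^{\pi,*}_{t_1}, \dots, B^{\pi,*}_{t_n})$ that solves the problem
   \[ \inf_{B = (B_{t_1}, \dots, B_{t_n}) \in \mathbb{R}_+^n} E \sum_{i=1}^{n} \left( \mathbf{1}_{[\, |X_{t_i} - X_{t_{i-1}}| > B_{t_i},\ N_{t_i} - N_{t_{i-1}} = 0 \,]} + \mathbf{1}_{[\, |X_{t_i} - X_{t_{i-1}}| \le B_{t_i},\ N_{t_i} - N_{t_{i-1}} \ne 0 \,]} \right) \]
   \[ = \sum_{i=1}^{n} \inf_{B_{t_i}} \left\{ P\left( |\Delta_i X| > B_{t_i},\ \Delta_i N = 0 \right) + P\left( |\Delta_i X| \le B_{t_i},\ \Delta_i N \ne 0 \right) \right\}, \]
   where $\Delta_i X := X_{t_i} - X_{t_{i-1}}$ and $\Delta_i N := N_{t_i} - N_{t_{i-1}}$.
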
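Optimal Threshold Spot Volatility Estimation
Notation: $h_i = t_i - t_{i-1}$ (Mesh), $K_\theta(t) = \frac{1}{\theta} K\left( \frac{t}{\theta} \right)$ (Kernel), $\theta$ = Bandwidth
Algorithm:
   For each $i \in \{1, 2, \dots, n\}$, set
   $\hat{\sigma}^2_0(t_i) := \sum_{j=-l}^{l} \frac{1}{h_{i+j}} |\Delta_{i+j} X|^2 K_\theta(t_i - t_{i+j})$ and $\hat{B}^{*,0}_{t_i} := \left[ 3 \hat{\sigma}^2_0(t_i)\, h_i \ln(1/h_i) \right]^{1/2}$
   while there exists $i \in \{1, 2, \dots, n\}$ such that $\hat{\sigma}^2_{k-1}(t_i) > \hat{\sigma}^2_k(t_i)$ do
      $\hat{\sigma}^2_{k+1}(t_i) \leftarrow \sum_{j=-l}^{l} \frac{1}{h_{i+j}} |\Delta_{i+j} X|^2\, \mathbf{1}_{[\, |\Delta_{i+j} X| \le \hat{B}^{*,k}_{t_{i+j}} \,]}\, K_\theta(t_i - t_{i+j})$ and $\hat{B}^{*,k+1}_{t_i} \leftarrow \left[ 3 \hat{\sigma}^2_{k+1}(t_i)\, h_i \ln(1/h_i) \right]^{1/2}$
   end while
   Let $k^{*}(\pi) := \inf\{ k \ge 1 : \hat{\sigma}^2_{k+1}(t_i) = \hat{\sigma}^2_k(t_i) \text{ for all } i = 1, 2, \dots, n \}$ and take $\hat{\sigma}^2_{k^{*}}(t_i)$ as the final estimate for $\sigma^2(t_i)$ and the corresponding $\hat{B}^{*,k^{*}}_{t_i}$ as an estimate for $B^{*}_{t_i}$.
The previous algorithm generates a non-increasing sequence of estimators $\{ \hat{\sigma}^2_k(t_i) \}_{k,i}$ and finishes in finite time.
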
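A compact sketch of the local version follows. As a simplifying assumption, the kernel weights $K_\theta(t_i - t_{i+j})$ are replaced by discrete uniform weights that sum to one over a window of up to $2l + 1$ increments (truncated at the boundaries). The function names and the `max_iter` safeguard are likewise illustrative.

```python
import math

def local_variance(dX, h, B, l):
    """Windowed average of dX_j^2 / h_j over |j - i| <= l, keeping only |dX_j| <= B_j."""
    n = len(dX)
    est = []
    for i in range(n):
        lo, hi = max(0, i - l), min(n, i + l + 1)
        weight = 1.0 / (hi - lo)                     # uniform weights (assumption)
        est.append(sum(weight * dX[j] ** 2 / h[j]
                       for j in range(lo, hi) if abs(dX[j]) <= B[j]))
    return est

def local_thresholds(s2, h):
    """B_i = sqrt(3 * sigma_i^2 * h_i * ln(1 / h_i)); requires every h_i < 1."""
    return [math.sqrt(3.0 * v * hi * math.log(1.0 / hi)) for v, hi in zip(s2, h)]

def iterative_spot_variance(dX, h, l, max_iter=100):
    """Iterate until no local variance estimate decreases; returns (s2, B)."""
    s2 = local_variance(dX, h, [math.inf] * len(dX), l)   # unthresholded start
    B = local_thresholds(s2, h)
    for _ in range(max_iter):
        new_s2 = local_variance(dX, h, B, l)
        if all(a >= b for a, b in zip(new_s2, s2)):
            break
        s2, B = new_s2, local_thresholds(new_s2, h)
    return s2, B

# Toy record: 1000 increments of size 0.0095 with mesh 0.001 and one jump of 0.5.
toy = [0.0095 * (-1) ** j for j in range(1000)]
toy[500] = 0.5
s2_hat, B_hat = iterative_spot_variance(toy, [0.001] * 1000, l=50)
```

The window half-width `l` matters in practice: if the window is too short, a single jump can inflate the initial local variance enough that its own threshold exceeds the jump size, and the jump is then never removed.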
Numerical Illustration
(A) Initial Estimates
[Plot over the time horizon [0, 1]: Actual Spot Volatility, Est. Spot Vol. (Uniform), Est. Spot Vol. (Quad), Sample Increments.]
Figure: Spot Volatility Estimation using Adaptive Kernel Weighted Realized Volatility.

Numerical Illustration
(B) Intermediate Estimates
[Plot over the time horizon [0, 1]: Actual Spot Volatility, Est. Spot Vol. (Uniform), Est. Spot Vol. (Quad), Sample Increments.]
Figure: Spot Volatility Estimation using Adaptive Kernel Weighted Realized Volatility.

Numerical Illustration
(C) Terminal Estimates
[Plot over the time horizon [0, 1]: Actual Spot Volatility, Est. Spot Vol. (Uniform), Est. Spot Vol. (Quad), Sample Increments.]
Figure: Spot Volatility Estimation using Adaptive Kernel Weighted Realized Volatility.

Numerical Illustration
(D) Estimation Variability
[Plot over the time horizon [0, 1]: Actual Spot Volatility together with the estimates from 50 simulations.]
Figure: Spot Volatility Estimation using Adaptive Kernel Weighted Realized Volatility. 50 simulations.

Conclusions
1. Introduce an objective threshold selection procedure based on statistical optimality reasoning via a well-posed optimization problem.
2. Characterize precisely the infill asymptotic behavior of the optimal threshold sequence.
3. Propose an iterative algorithm to find the optimal threshold sequence.
4. Extend the approach to more general stochastic models, which allow time-varying volatility and jump intensity.

For Further Reading
- Figueroa-López & Nisen. Optimally Thresholded Realized Power Variations for Lévy Jump Diffusion Models. To appear in Stochastic Processes and their Applications, 2013. Available at www.stat.purdue.edu/~figueroa.
- Figueroa-López & Nisen. Optimality properties of thresholded multi power variation estimators. In preparation, 2013.