Modeling dynamic diurnal patterns in high frequency financial data
Ryoko Ito
Faculty of Economics, Cambridge University, UK
Email: ri239@cam.ac.uk
Website: www.itoryoko.com
This paper: Cambridge Working Papers in Economics CWPE1315
OxMetrics User Conference, Sept 2014
Introduction
We want to model daily periodic patterns (diurnal patterns) in high-frequency financial data.
Popular ways to capture periodicity:
- Fourier flexible form approximation
- Compute (re-scaled) sample moments for each intra-day bin
In both cases the pattern of periodicity is fixed over time (e.g. Andersen and Bollerslev (1998), Engle and Russell (1998), Shang et al. (21), Campbell and Diebold (2005), Engle and Rangel (2008), Brownlees et al. (2011), Engle and Sokalska (2012)).
Contribution: dynamic cubic spline to model periodicity (cf. Harvey and Koopman (1993)).
Advantages:
- Parsimonious. One-step estimation with all other coefficients.
- Dynamic periodicity.
- Fits the empirical distribution of our data well (including the upper extreme quantiles).
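As a point of reference for the fixed-pattern approaches listed above, the following is a minimal sketch of the "re-scaled sample moments per intra-day bin" idea. The function name, the synthetic U-shaped data, and the choice of 39 bins per day are assumptions made for illustration only; none of them comes from the paper.

```python
import numpy as np

def diurnal_factor_by_bin(y, bins_per_day):
    """Rescaled sample mean of y for each intra-day bin (a fixed diurnal pattern)."""
    y = np.asarray(y, dtype=float)
    if y.size % bins_per_day != 0:
        raise ValueError("y must contain a whole number of trading days")
    days = y.reshape(-1, bins_per_day)   # one row per day, one column per bin
    bin_means = days.mean(axis=0)        # sample mean of each intra-day bin
    return bin_means / bin_means.mean()  # rescale so the factors average to 1

# Synthetic example: 250 days, 39 bins per day, U-shaped mean (all values made up).
rng = np.random.default_rng(0)
n_days, bins_per_day = 250, 39
tau = np.arange(bins_per_day)
u_shape = 1.0 + 2.0 * ((tau - 19) / 19.0) ** 2
y = rng.gamma(shape=2.0, scale=u_shape / 2.0, size=(n_days, bins_per_day)).ravel()

factor = diurnal_factor_by_bin(y, bins_per_day)
adjusted = y / np.tile(factor, n_days)   # first step of a two-step procedure
```

The same factor is applied to every day, which is exactly the restriction a dynamic spline is meant to relax.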
Introduction
We also need to capture other stylized features:
- Concentration of zero-observations
- Non-normality, heavy tail
- Highly persistent dynamics (long-memory?)
Methods:
- Distribution decomposition at zero
- Dynamic Conditional Score (Harvey (2013)) to capture non-normality, heavy-tail
- Unobserved components
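One common reading of "distribution decomposition at zero" is a point mass p at zero combined with a continuous distribution F* for the strictly positive observations, so that P(Y <= x) = p + (1 - p) F*(x) for x >= 0. The sketch below assumes that reading; the helper names and the unit-exponential stand-in for F* are hypothetical.

```python
import numpy as np

def split_at_zero(y):
    """Return the sample share of zeros and the strictly positive observations."""
    y = np.asarray(y, dtype=float)
    is_zero = y == 0
    return is_zero.mean(), y[~is_zero]

def mixed_cdf(x, p, cdf_positive):
    """P(Y <= x) = p + (1 - p) * F*(x) for x >= 0, and 0 for x < 0."""
    x = np.asarray(x, dtype=float)
    positive_part = cdf_positive(np.maximum(x, 0.0))
    return np.where(x < 0, 0.0, p + (1.0 - p) * positive_part)

def unit_exponential_cdf(x):
    """Placeholder for F*; any cdf with F*(0) = 0 could be used instead."""
    return 1.0 - np.exp(-x)

# Made-up volume observations containing three zeros.
y = np.array([0.0, 1200.0, 0.0, 300.0, 4500.0, 800.0, 0.0, 150.0])
p_hat, y_positive = split_at_zero(y)
cdf_at_zero = mixed_cdf(0.0, p_hat, unit_exponential_cdf)
```

With this split, the continuous model only ever has to fit the positive observations, and p can be checked directly against the sample share of zeros.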
Data (short sampling period): empirical features
Trade volume of IBM stock traded on the NYSE. The number of shares traded.
Period: 5 consecutive trading weeks in February - March 2000
Aggregation interval: 30 seconds (15 seconds and 1 minute also in the paper)
Figure: IBM30s (left column) and the same series smoothed by the simple moving average (right column). Time on the x-axis. Monday 20 - Friday 24 March 2000. Each day covers trading hours between 9.30am-4pm (in the New York local time).
Empirical features (short sampling period)
Diurnal U-shaped patterns. Trade volume bottoms out at around 1pm.
Figure: IBM30s (left column) and the same series smoothed by the simple moving average (right column). Time on the x-axis. Wednesday 22 March 2000, covering 9.30am-4pm (in the New York local time).
Empirical features
Sample autocorrelation. Highly persistent.
Heavy, long upper-tail.
Figure: Sample autocorrelation of IBM30s, with 95% confidence interval. Sampling period: 28 February - 31 March 2000. The 200th lag corresponds approximately to 1.5 hours prior.
Figure: Frequency distribution (top) and empirical cdf (bottom) of IBM30s, with all volume (x 10^4) on the x-axis. Sample: 28 February - 31
Data (long sampling period): empirical features
Trade volume of IBM stock traded on the NYSE. The number of shares traded.
In-sample period: January 2007 - December 2010 (4 years)
Aggregation interval: 10 minutes
Figure: Left: IBM10m between Mon 7 Jan - Fri 11 Jan 2008. Each day covers trading hours between 9.30am-4pm (in the New York local time). Right: sample autocorrelation of IBM10m with 95% confidence interval, Jan 2007 - Dec 2010.
The model
Spline-DCS model.
\[ y_{t,\tau} = \varepsilon_{t,\tau} \exp(\lambda_{t,\tau}), \qquad \varepsilon_{t,\tau} \mid \mathcal{F}_{t,\tau-1} \sim \text{i.i.d. } F(\varepsilon; \theta) \]
\[ \lambda_{t,\tau} = \delta + s_{t,\tau} + \mu_{t,\tau} + \eta_{t,\tau} \]
- $s_{t,\tau}$: periodic component capturing diurnal patterns.
- $\mu_{t,\tau}$: low-frequency nonstationary component.
\[ \mu_{t,\tau} = \mu_{t,\tau-1} + \kappa_\mu u_{t,\tau-1} \]
- $\eta_{t,\tau}$: stationary component. A mixture of AR processes to capture behavior similar to long-memory.
\[ \eta_{t,\tau} = \eta^{(1)}_{t,\tau} + \eta^{(2)}_{t,\tau} \]
\[ \eta^{(1)}_{t,\tau} = \phi^{(1)}_1 \eta^{(1)}_{t,\tau-1} + \phi^{(1)}_2 \eta^{(1)}_{t,\tau-2} + \kappa^{(1)}_\eta u_{t,\tau-1} \]
\[ \eta^{(2)}_{t,\tau} = \phi^{(2)}_1 \eta^{(2)}_{t,\tau-1} + \kappa^{(2)}_\eta u_{t,\tau-1} \]
- $u_{t,\tau}$: the score of the distribution of $y_{t,\tau}$ (i.e. $\partial \ln f_y(y_{t,\tau}) / \partial \lambda_{t,\tau}$).
DCS = dynamic conditional score [Harvey (2013) and Creal, Koopman, and Lucas (2011, 2013)]
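The recursions on this slide can be sketched as a simple filter. This is an illustration under stated assumptions, not the paper's estimation code: the two indices (t, τ) are flattened into a single running index, the periodic component s is supplied from outside, start-up values are set to zero, all observations are strictly positive (zeros would be handled separately), and F is taken to be a GB2 distribution with unit scale, for which differentiating the log-density gives the score u = ν(ξ + ζ)b − νξ with b = ε^ν / (1 + ε^ν). All parameter values in the example are made up.

```python
import numpy as np

def gb2_score(eps, nu, xi, zeta):
    """Score of log f_y with respect to lambda when eps = y * exp(-lambda) is GB2."""
    b = eps**nu / (1.0 + eps**nu)          # lies in (0, 1) for positive eps
    return nu * (xi + zeta) * b - nu * xi

def dcs_filter(y, s, delta, kappa_mu, phi1, kappa_eta1, phi2, kappa_eta2,
               nu, xi, zeta):
    """Filter lambda = delta + s + mu + eta1 + eta2 through a positive series y."""
    n = len(y)
    lam = np.empty(n)
    u = np.empty(n)
    mu = eta1 = eta1_lag = eta2 = 0.0      # assumed start-up values
    for i in range(n):
        lam[i] = delta + s[i] + mu + eta1 + eta2
        u[i] = gb2_score(y[i] * np.exp(-lam[i]), nu, xi, zeta)
        # Score-driven updates that feed into the next period's lambda.
        mu = mu + kappa_mu * u[i]
        eta1, eta1_lag = (phi1[0] * eta1 + phi1[1] * eta1_lag
                          + kappa_eta1 * u[i]), eta1
        eta2 = phi2 * eta2 + kappa_eta2 * u[i]
    return lam, u

# Made-up positive data and parameter values, with the periodic component set to 0.
rng = np.random.default_rng(0)
y = rng.gamma(2.0, 1.0, size=390)
s = np.zeros(390)
lam, u = dcs_filter(y, s, delta=0.0, kappa_mu=0.01, phi1=(0.5, 0.3),
                    kappa_eta1=0.05, phi2=0.7, kappa_eta2=0.1,
                    nu=1.5, xi=1.2, zeta=2.0)
```

Because b is bounded between 0 and 1, the score is bounded between −νξ and νζ, so a single extreme observation moves λ by only a limited amount.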
Dynamic cubic spline
$s_{t,\tau}$: dynamic cubic spline (Harvey and Koopman (1993))
\[ s_{t,\tau} = \sum_{j=1}^{k} \mathbb{1}_{\{\tau \in [\tau_{j-1}, \tau_j]\}} \, z_j(\tau)' \gamma \]
Fixed vs. dynamic spline: let $\gamma \to \gamma_{t,\tau}$, where
\[ \gamma_{t,\tau} = \gamma_{t,\tau-1} + \kappa \, u_{t,\tau-1} \]
Figure: Fixed spline (left) and dynamic spline (right), drawn for $k = 3$ with knot heights $\gamma_1, \gamma_2, \gamma_3$; $\gamma, s$ on the y-axis and time ($\tau, t$) from Monday to Friday on the x-axis.
Dynamic cubic spline (ctd)
$s_{t,\tau}$: dynamic cubic spline (Harvey and Koopman (1993))
\[ s_{t,\tau} = \sum_{j=1}^{k} \mathbb{1}_{\{\tau \in [\tau_{j-1}, \tau_j]\}} \, z_j(\tau)' \gamma \]
- $k$: number of knots
- $\tau_0 < \tau_1 < \dots < \tau_k$: coordinates of the knots along the time-axis
- $\gamma = (\gamma_1, \dots, \gamma_k)'$: y-coordinates (height) of the knots
- $z_j : [\tau_{j-1}, \tau_j]^k \to \mathbb{R}^k$: $k$-dimensional vector of weighting functions. Conveys information about (i) polynomial order, (ii) continuity, (iii) length of periodicity, and (iv) zero-sum conditions.
Bowsher and Meeks (2008): special type of dynamic factor model
Time-varying spline: let $\gamma \to \gamma_{t,\tau}$, where
\[ \gamma_{t,\tau} = \gamma_{t,\tau-1} + \kappa \, u_{t,\tau-1} \]
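To make the role of the weighting functions concrete, the sketch below builds weights for a textbook periodic cubic spline, so that the spline value is a linear function of the knot heights, s(τ) = w(τ)'γ. It is only an approximation of the construction on this slide: it imposes continuity of the first and second derivatives and periodicity, but it omits the zero-sum condition, and the function name, knot positions, heights, and the vector kappa are hypothetical.

```python
import numpy as np

def periodic_spline_weights(knots, tau):
    """Weights w(tau) such that s(tau) = w(tau) @ gamma for a periodic cubic spline.

    knots: tau_0 < tau_1 < ... < tau_k. gamma holds the heights at tau_1..tau_k;
    the height at tau_0 equals the height at tau_k (periodicity).
    """
    knots = np.asarray(knots, dtype=float)
    k = knots.size - 1
    h = np.diff(knots)                       # h[j-1] = tau_j - tau_{j-1}
    P = np.zeros((k, k))
    Q = np.zeros((k, k))
    for i in range(k):                       # row i corresponds to knot tau_{i+1}
        prev, nxt = (i - 1) % k, (i + 1) % k
        h_left, h_right = h[i], h[(i + 1) % k]
        P[i, prev] += h_left
        P[i, i] += 2.0 * (h_left + h_right)
        P[i, nxt] += h_right
        Q[i, prev] += 6.0 / h_left
        Q[i, i] -= 6.0 / h_left + 6.0 / h_right
        Q[i, nxt] += 6.0 / h_right
    A = np.linalg.solve(P, Q)                # second derivatives at knots = A @ gamma

    j = int(np.clip(np.searchsorted(knots, tau, side="right"), 1, k))
    right, left = j - 1, (j - 2) % k         # positions in gamma of tau_j, tau_{j-1}
    hj = h[j - 1]
    d_right = knots[j] - tau                 # distance to the right knot
    d_left = tau - knots[j - 1]              # distance to the left knot
    w = np.zeros(k)
    w[left] += d_right / hj
    w[right] += d_left / hj
    w += A[left] * (d_right**3 / (6.0 * hj) - hj * d_right / 6.0)
    w += A[right] * (d_left**3 / (6.0 * hj) - hj * d_left / 6.0)
    return w

# Hypothetical knots over a 6.5-hour trading day and made-up knot heights.
knots = [0.0, 1.5, 4.0, 6.5]
gamma = np.array([0.8, -0.5, -0.3])
s_mid = periodic_spline_weights(knots, 2.0) @ gamma

# One step of the time-varying version: the heights move with the lagged score.
kappa = np.array([0.003, 0.001, -0.002])    # made-up values
u_lag = 0.4
gamma_next = gamma + kappa * u_lag
s_mid_next = periodic_spline_weights(knots, 2.0) @ gamma_next
```

Since the weights depend only on the knot positions and on τ, they can be computed once; making the spline dynamic then only requires updating the k heights.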
Why use this dynamic spline?
Alternative options used by many:
- Fourier representation
- Sample moments for each intra-day bin
Under these alternatives, the diurnal pattern is a deterministic function of intra-day time (Andersen and Bollerslev (1998), Engle and Russell (1998), Shang et al. (21), Campbell and Diebold (2005), Engle and Rangel (2008), Brownlees et al. (2011), Engle and Sokalska (2012)).
So why use this spline?
- Allows for changing diurnal patterns (can improve upper quantile fit).
- No need for a two-step procedure to diurnally adjust data.
- Allows for the day-of-the-week effect via changes in the shape of diurnal patterns as well as level shifts. This is unlike the alternative of seasonal dummies, which test for level differences and are used by many (e.g. Andersen and Bollerslev (1998), Lo and Wang (21)).
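For comparison, a deterministic Fourier representation of the diurnal pattern can be fitted by ordinary least squares. The sketch below uses only sine and cosine terms (the full Fourier flexible form also includes polynomial terms, which are omitted here), and the function name, the order of the expansion, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fourier_design(tau, period, order):
    """Design matrix with a constant and `order` pairs of cos/sin terms."""
    tau = np.asarray(tau, dtype=float)
    cols = [np.ones_like(tau)]
    for p in range(1, order + 1):
        angle = 2.0 * np.pi * p * tau / period
        cols.append(np.cos(angle))
        cols.append(np.sin(angle))
    return np.column_stack(cols)

# Synthetic log-volume: 250 days, 39 bins per day, made-up periodic pattern plus noise.
rng = np.random.default_rng(0)
n_days, bins_per_day = 250, 39
tau = np.arange(bins_per_day)
true_pattern = (0.5 * np.cos(2.0 * np.pi * tau / bins_per_day)
                + 0.2 * np.cos(4.0 * np.pi * tau / bins_per_day))
log_y = true_pattern + 0.3 * rng.standard_normal((n_days, bins_per_day))

X = fourier_design(np.tile(tau, n_days), bins_per_day, order=2)
coef, *_ = np.linalg.lstsq(X, log_y.ravel(), rcond=None)
fitted_pattern = fourier_design(tau, bins_per_day, order=2) @ coef
```

The fitted pattern is the same function of intra-day time on every day, which is the restriction the bullet points above argue against.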
Estimation results (short sample period)
Core assumption: $\hat{\varepsilon}_{t,\tau} = y_{t,\tau} / \exp(\hat{\lambda}_{t,\tau})$ has to be free of autocorrelation.
Satisfied: no autocorrelation in $\hat{\varepsilon}_{t,\tau}$.
Figure: IBM30s: sample autocorrelation of trade volume $y_{t,\tau}$ (first panel) and of the residuals $\hat{\varepsilon}_{t,\tau}$ (second panel). The 95% confidence interval is computed at ±2 standard errors.
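The autocorrelation check can be sketched as follows. The band of ±2 standard errors is taken here to be ±2/sqrt(n), which is the usual white-noise approximation and an assumption about how the figure was produced; the two simulated series merely stand in for a clean residual series and a persistent raw series.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-lag], x[lag:]) / denom
                     for lag in range(1, max_lag + 1)])

def share_outside_band(acf, n_obs):
    """Share of autocorrelations outside +/- 2 / sqrt(n_obs)."""
    band = 2.0 / np.sqrt(n_obs)
    return np.mean(np.abs(acf) > band)

rng = np.random.default_rng(0)
n = 20000
white = rng.standard_normal(n)             # stands in for well-behaved residuals
persistent = np.empty(n)                   # stands in for a persistent raw series
persistent[0] = white[0]
for i in range(1, n):
    persistent[i] = 0.9 * persistent[i - 1] + white[i]

acf_white = sample_acf(white, 200)
acf_persistent = sample_acf(persistent, 200)
share_white = share_outside_band(acf_white, n)
share_persistent = share_outside_band(acf_persistent, n)
```

For a series with no autocorrelation, roughly 5% of the lags are expected to fall outside the band by chance, so a small share outside is not in itself evidence against the model.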
Estimation results (short sample period)
$F \sim$ GB2 distribution fits very well. (GB2 = generalized beta distribution of the second kind.)
PIT (probability integral transform): $F^*(\hat{\varepsilon}_{t,\tau}) \sim U[0,1]$.
Fit seems to be the best when our spline is time-varying.
Figure: Empirical cdf of $\hat{\varepsilon}_{t,\tau} > 0$ against cdf of GB2($\hat{\nu}, \hat{\zeta}, \hat{\xi}$) (left; legend: res > 0, Burr with estimated parameters). Empirical cdf of the PIT of $\hat{\varepsilon}_{t,\tau} > 0$ computed under $F^*(\cdot; \hat{\theta}) \sim$ GB2($\hat{\nu}, \hat{\zeta}, \hat{\xi}$) (right).
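The PIT check works because, if the residuals really are drawn from F*, then F*(residual) is uniform on [0, 1]. The sketch below illustrates this with the Burr distribution, which is the GB2 special case with ξ = 1 and has the closed-form cdf 1 − (1 + ε^ν)^(−ζ); using Burr rather than the general GB2 keeps the example free of special functions. The parameter values and the Kolmogorov-Smirnov style distance are assumptions for illustration.

```python
import numpy as np

def burr_cdf(eps, nu, zeta):
    """Cdf of the Burr distribution (GB2 with xi = 1 and unit scale)."""
    return 1.0 - (1.0 + eps**nu) ** (-zeta)

def burr_ppf(q, nu, zeta):
    """Inverse of burr_cdf, used here to simulate residuals."""
    return ((1.0 - q) ** (-1.0 / zeta) - 1.0) ** (1.0 / nu)

def ks_distance_to_uniform(pit):
    """Largest gap between the empirical cdf of pit and the U[0,1] cdf."""
    pit = np.sort(np.asarray(pit, dtype=float))
    n = pit.size
    upper = np.arange(1, n + 1) / n - pit
    lower = pit - np.arange(0, n) / n
    return max(upper.max(), lower.max())

rng = np.random.default_rng(0)
nu, zeta = 1.6, 1.5                                   # made-up parameter values
eps = burr_ppf(rng.uniform(size=20000), nu, zeta)     # simulated positive residuals

ks_correct = ks_distance_to_uniform(burr_cdf(eps, nu, zeta))
ks_wrong = ks_distance_to_uniform(burr_cdf(eps, nu, 2.0 * zeta))
```

Under the correct distribution the empirical cdf of the PIT hugs the 45-degree line, while a misspecified tail parameter produces a clearly visible gap.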
Compare with log-normal distribution
Log-normal distribution popular. Often used in the literature (e.g. Alizadeh, Brandt, Diebold (2002)).
But log-normal inferior to GB2. PIT of $\hat{\varepsilon}_{t,\tau}$ far from U[0,1]. Why?
Figure: Log(trade volume): the frequency distribution (left) and the QQ-plot against theoretical N(0,1) quantiles (right). Using non-zero observations of IBM30s, re-centered around the mean and standardized by one standard deviation.
Estimated coefficients
Parameter                 IBM30s            IBM1m
$\kappa_\mu$              .6 (.1)           .7 (.2)
$\phi^{(1)}_1$            .557 (.136)       .377 (.93)
$\phi^{(1)}_2$            .41 (.135)        .567 (.96)
$\kappa^{(1)}_\eta$       .49 (.7)          .45 (.8)
$\phi^{(2)}_1$            .688 (.41)        .621 (.57)
$\kappa^{(2)}_\eta$       .92 (.8)          .69 (.8)
$\kappa$                  .3 (.2)           .3 (.2)
$\kappa_1$                .1 (.1)           . (.1)
$\kappa_2$                -.2 (.1)          -.2 (.1)
$\kappa_3$                . (.1)            . (.1)
$\gamma_{;1,}$            1.273 (.16)       1.195 (.98)
$\gamma_{1;1,}$           .78 (.58)         .75 (.57)
$\gamma_{2;1,}$           -.469 (.7)        -.45 (.69)
$\gamma_{3;1,}$           -.227 (.47)       -.244 (.47)
$\omega$                  9.146 (.174)      9.752 (.155)
$\nu$                     1.631 (.16)       2.23 (.33)
$\zeta$                   1.486 (.45)       1.142 (.44)
$p$                       .47 (.5)          .6 (.3)
Parametric assumptions, identifiability requirements satisfied. $\eta_{t,\tau}$ stationary. $p$ is consistent with sample statistics.
Estimation results (long sample period): fit of distribution
$\hat{\varepsilon}_{t,\tau} \sim$ GB2 fits very well. PIT: $F^*(\hat{\varepsilon}_{t,\tau}) \sim U[0,1]$.
Figure: Empirical cdf of $\hat{\varepsilon}_{t,\tau} > 0$ against cdf of GB2($\hat{\nu}, \hat{\zeta}, \hat{\xi}$) (left; legend: res > 0, GB2 with estimated parameters). Empirical cdf of the PIT of $\hat{\varepsilon}_{t,\tau} > 0$ computed under $F^*(\cdot; \hat{\theta}) =$ GB2($\hat{\nu}, \hat{\zeta}, \hat{\xi}$) (right).
Estimation results (long sample period): check autocorrelation in $\hat{u}_{t,\tau}$
Should have $\hat{u}_{t,\tau}$ i.i.d. [and Beta distributed].
$s_{t,\tau}$ is Fourier (left) and our fixed spline (right). The number of coefficients in $s_{t,\tau}$ is the same.
Figure: IBM10m: sample autocorrelation of $\hat{u}_{t,\tau}$. The periodic component $s_{t,\tau}$ is Fourier (left) and our fixed spline (right). The 95% confidence interval is computed at ±2 standard errors.
Estimated dynamic cubic spline, $\hat{s}_{t,\tau}$ (long sample period)
$\hat{s}_{t,\tau}$: dynamic cubic spline. Reflects diurnal patterns that evolve over time.
Figure: IBM10m: $\exp(\hat{s}_{t,\tau})$, plotted against trade time (9.30 at 0) and date (Jan 2007 at 0). Sampling period is Jan 2007 - Dec 2010. Trading time between 9.30am-4pm.
Estimated dynamic cubic spline, $\hat{s}_{t,\tau}$
Reflects diurnal patterns that evolve over time.
Day-of-the-week effect? Do we need dynamic periodicity?
Figure: $\hat{s}_{t,\tau}$ of Model 2 for IBM30s. Over 6 - 31 March 2000 (left; weeks of 6-10 Mar, 13-17 Mar, 20-24 Mar and 27-31 Mar, Monday to Friday, together with the average for March 2000). $\hat{s}_{t,\tau}$ of a typical day, Tuesday 14 March, from market open to close (right; 9:30 to 15:30). Time along the x-axes.
Out-of-sample performance (long sample period)
One-step ahead forecasts, $\hat{\varepsilon}_{t,\tau} = y_{t,\tau} / \exp(\hat{\lambda}_{t,\tau})$, for 5 days without re-estimating parameters.
Forecast horizon: January - March 2011.
Figure: Left: PIT of forecast $\hat{\varepsilon}_{t,\tau}$, Dynamic Spline (5 days ahead). Right: QQ-plot of forecast $\hat{\varepsilon}_{t,\tau}$ against GB2 quantiles, Dynamic Spline (blue) and Fixed Spline (red), with the 95 - 99.9% quantiles marked.
Out-of-sample performance
- Our model and parameter estimates are stable.
- One-step ahead density forecasts (without re-estimation): very good for at least 2 days ahead.
- Multi-step ahead density forecasts: very good (i.e. PIT approx. i.i.d. U[0,1]) for one complete trading-day ahead (equivalent of 780 steps for IBM30s).
More discussions in the paper.
Intuition, future direction
Dynamic spline can reflect changes in the pattern of morning trading activity (i.e. how we standardize large-sized morning observations). Important feature when the amount (or nature) of overnight news can change morning trading patterns.
Still to do:
- Further investigate the performance of the model at the upper-tail.
- Multi-variate version: price and volume.
- Model for higher-frequency: 1 second?
- Application to panel data (using composite likelihood?)
- Asymptotic properties of MLE when DCS non-stationary.
- etc.