Enzo Giacomini, Nikolaus Hautsch, Vladimir Spokoiny
CASE - Center for Applied Statistics and Economics, Humboldt-Universität zu Berlin
Motivation
Ultra-High Frequency Data
Ultra-high frequency, Engle (2000), or tick-by-tick data: data containing records of all transactions on a stock in an exchange.
Observable from trades $i = 1, \ldots, I$ in a time interval:
1. $t_i$: trade time
2. $s_{t_i}$: traded price
3. $v_{t_i}$: volume (number of shares) traded
4. and others (e.g. sale conditions, broker id)
Figure 1: Time, price and traded volume, AIG - NYSE, 2112
Figure 2: Scaled log returns $R_i$ and interval between trades (duration) $\tau_i$, $i = 1, \ldots, 1597$, AIG - NYSE, 2112
Observed data: realizations from a marked point process $\{(t_i, Y_i)\}$, where $Y_i = (S_{t_i}, V_{t_i})$. Associated with it:
1. $\tau_i = t_i - t_{i-1}$: interarrival time (duration)
2. $R_i = \frac{1}{\sqrt{\tau_i}} \log \frac{S_{t_i}}{S_{t_{i-1}}}$: scaled returns
How to model $\mathcal{L}(t_i, Y_i \mid \mathcal{F}_{i-1})$, $\mathcal{L}(\tau_i \mid \mathcal{F}_{i-1})$ or $\mathcal{L}(R_i \mid \mathcal{F}_{i-1}, \tau_i)$?
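The two derived quantities can be computed directly from the trade records. The following is a minimal sketch, assuming that returns are scaled by $1/\sqrt{\tau_i}$ and that trade times are given in seconds; the function name, the input layout and the sample values are hypothetical.

```python
import math

def durations_and_scaled_returns(times, prices):
    """Compute durations tau_i and scaled log returns R_i from trade records.

    times  -- trade times t_0 <= t_1 <= ... in seconds (hypothetical layout)
    prices -- traded prices S_{t_0}, S_{t_1}, ...
    """
    if len(times) != len(prices):
        raise ValueError("times and prices must have the same length")
    tau, scaled = [], []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            # Assumption: trades with identical timestamps are skipped,
            # because the scaling by sqrt(tau_i) is undefined for tau_i = 0.
            continue
        tau.append(dt)
        scaled.append(math.log(prices[i] / prices[i - 1]) / math.sqrt(dt))
    return tau, scaled

# Hypothetical sample: four trades shortly after 9:30.
times = [34200.0, 34204.0, 34213.0, 34214.0]
prices = [98.00, 98.05, 98.00, 98.10]
tau, scaled_returns = durations_and_scaled_returns(times, prices)
```

How to treat simultaneous trades is a modeling choice; aggregating them into a single trade would be an equally reasonable alternative to skipping them.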
Conditional Durations
Autoregressive conditional duration (ACD), Engle and Russell (1998):
$\tau_i = \Psi_\theta(X_i)\,\varepsilon_i$
where $\Psi_\theta$ is ARMA-type, $\varepsilon_i$ is i.i.d. with $E[\varepsilon_i] = 1$, and $X_i \in \mathcal{F}_{i-1}$.
Different specifications for $\Psi_\theta$ and $\varepsilon_i$, e.g.
1. log ACD, Bauwens and Giot (2000); threshold ACD, Zhang et al. (2001); smooth transition ACD, Meitz and Teräsvirta (2006); semiparametric ACD, Hautsch (2006)
2. Further: stochastic conditional duration (SCD), Bauwens and Veredas (2004)
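To illustrate the ACD recursion, the sketch below simulates the simplest linear case, an ACD(1,1) with $\Psi_i = \omega + \alpha \tau_{i-1} + \beta \Psi_{i-1}$ and unit-mean exponential innovations. The parameter values, the seed and the function name are hypothetical, chosen only so that the unconditional mean duration $\omega / (1 - \alpha - \beta)$ equals 1.

```python
import random

def simulate_acd(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate n durations from a linear ACD(1,1) with Exp(1) innovations."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite mean duration")
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)   # start at the unconditional mean
    tau = psi
    durations, cond_means = [], []
    for _ in range(n):
        psi = omega + alpha * tau + beta * psi   # conditional mean given the past
        tau = psi * rng.expovariate(1.0)         # innovation with E[eps] = 1
        cond_means.append(psi)
        durations.append(tau)
    return durations, cond_means

durations, cond_means = simulate_acd(20000)
sample_mean = sum(durations) / len(durations)    # should be close to 1
```

Other innovation distributions with unit mean (for example Weibull or generalized gamma) can be swapped in by replacing the `expovariate` draw.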
Dynamic Intensities
Autoregressive conditional intensity (ACI), Russell (1999), Bauwens and Hautsch (2003):
$\lambda_i = \Phi_\theta(X_i)$
where $\lambda_i = \lambda_{t_i}$ is the intensity associated with $\{t_i\}$
1. Hawkes processes, Bowsher (2002); multivariate, Bowsher (2006)
2. Further: stochastic conditional intensity (SCI), Bauwens and Hautsch (2006)
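As an illustration of a dynamic intensity, the sketch below evaluates a univariate Hawkes intensity with an exponential kernel, $\lambda(t) = \mu + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}$. The exponential kernel is one common choice and is an assumption here, not necessarily the specification used in the cited papers; the parameter values and trade times are hypothetical.

```python
import math

def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.2):
    """Intensity at time t given past events, exponential kernel (assumed)."""
    # Each past event raises the intensity by alpha, decaying at rate beta.
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in event_times if s < t)

trade_times = [1.0, 1.5, 4.0]                    # hypothetical arrival times
before = hawkes_intensity(0.9, trade_times)      # no past events: baseline mu
after = hawkes_intensity(1.6, trade_times)       # just after two trades
later = hawkes_intensity(3.9, trade_times)       # excitation has mostly decayed
```

The intensity jumps after each trade and then decays toward the baseline, which is how such models reproduce the clustering of trade arrivals.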
Durations and Volatilities
1. ACD and UHF-GARCH returns, Engle (2000); ACD and multinomial returns, Engle and Russell (2005)
2. Filtering volatilities from prices observed at random times, Frey (2000), Frey and Runggaldier (2001) and Cvitanic, Liptser and Rozovskii (2006)
Outline
1. Motivation
2. Trades and Quotes (TAQ) Dataset NYSE
3. Exploratory Data Analysis
4. References
TAQ Dataset
IBM Trades
Figure 3: Price $S_{t_i}$, $i = 1, \ldots, 11546$ (from the top) for 2.1.21 to 5.1.21 (from the left), IBM - NYSE
IBM Trades and Quotes
Figure 4: Bid (blue), ask (green) and trade (red), IBM - NYSE, 2114
Orders, Spread
1. market order: order to buy/sell a volume of stock at the best available price
2. limit order (or quote): order to buy (bid) or sell (ask) a volume of stock for no more/less than a limit price
3. best ask: lowest price among sell limit orders
4. best bid: highest price among buy limit orders
5. bid-ask spread: difference between best ask and best bid
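The definitions of best bid, best ask and spread translate into a few lines of code. This is a minimal sketch with hypothetical limit prices quoted in cents, which avoids floating-point rounding in the spread.

```python
def best_quotes(buy_limits, sell_limits):
    """Return (best bid, best ask, bid-ask spread) from limit prices."""
    best_bid = max(buy_limits)    # highest price among buy limit orders
    best_ask = min(sell_limits)   # lowest price among sell limit orders
    return best_bid, best_ask, best_ask - best_bid

buy_limits = [9795, 9800, 9790]    # hypothetical limit prices in cents
sell_limits = [9810, 9805, 9820]
best_bid, best_ask, spread = best_quotes(buy_limits, sell_limits)
```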
IBM TAQ 2114
Figure 5: Bid (blue), ask (green) and trade (red), IBM - NYSE, 2114
Market Depth
1. market or quote depth: volume associated with the best ask/bid
2. market liquidity: market depth. Under low liquidity, market orders with high volume may move the price (exhaust the depth): the trade is more expensive, takes longer, or is not immediately possible
3. order book: queued (non-executed) limit orders
4. order book execution according to exchange rules: price-time, fixed times, with hidden orders, volume divided across orders, broker matching, specialist
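The effect of limited depth on a large market order can be sketched by walking a toy order book. The sketch assumes pure price priority and ignores time priority, hidden orders and specialist intervention; the book contents, in cents and shares, are hypothetical.

```python
def execute_market_buy(asks, volume):
    """Fill a market buy order against sell limit orders (price, shares).

    Returns (shares filled, average price paid); the average is None
    when nothing could be filled.
    """
    remaining, cost, filled = volume, 0, 0
    for price, size in sorted(asks):          # cheapest sell orders first
        if remaining == 0:
            break
        take = min(size, remaining)
        cost += take * price
        filled += take
        remaining -= take
    average = cost / filled if filled else None
    return filled, average

asks = [(9810, 300), (9805, 200), (9820, 500)]    # hypothetical book, cents
small = execute_market_buy(asks, 100)       # fits within the depth at best ask
large = execute_market_buy(asks, 400)       # exhausts the best ask, pays more
too_large = execute_market_buy(asks, 1500)  # exceeds the whole book
```

The small order pays the best ask, the larger one a higher average price, and the largest one cannot be filled immediately in full.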
Market Depth - IBM
Figure 6: Bid volume (blue) and ask volume (green), IBM - NYSE, 2114
NYSE: Hybrid Market 1
1. trading floor, post for a given stock, specialist, clerks, floor brokers (trading crowd), floor reporters
2. specialist "guarantees orderly and efficient market"
3. orders arrive electronically or are placed by brokers
4. specialist accesses the order book and executes orders against each other, against his own inventory or against orders from the floor
5. specialist decides whether (or not) to execute an order and may stop the market
NYSE: Hybrid Market 2
1. stopped orders (guaranteed-or-better), crossing orders (broker places buy and sell orders), odd-lot orders (small volume, executed automatically), market-on-close orders (executed at closing price)
2. orders are almost never executed automatically
3. as in 2001: clerks report limit orders, floor reporters report trades (timestamps from orders and trades may be delayed)
Figure 7: Kernel density estimator of the difference between prices and bids, scaled by the bid-ask spread, IBM - NYSE, 29.11.2001
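The quantity in Figure 7 can be read as $(S - B) / (A - B)$, which is 0 for a trade at the bid and 1 for a trade at the ask. The sketch below computes it and a plain Gaussian kernel density estimate; the Gaussian kernel, the bandwidth and the sample records are assumptions, since the slides do not state which kernel or bandwidth was used.

```python
import math

def scaled_position(price, bid, ask):
    """Position of a trade price within the spread (assumes ask > bid)."""
    return (price - bid) / (ask - bid)

def gaussian_kde(sample, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return density

# Hypothetical records: (trade price, prevailing bid, prevailing ask).
records = [(98.00, 98.00, 98.10), (98.10, 98.00, 98.10),
           (98.05, 98.00, 98.10), (98.10, 98.05, 98.10)]
positions = [scaled_position(p, b, a) for p, b, a in records]
density = gaussian_kde(positions, bandwidth=0.1)
```

Values outside the interval from 0 to 1 would indicate trades outside the prevailing quotes, which can happen when timestamps are delayed.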
Exploratory Data Analysis
Exploratory Analysis: TAQ - NYSE
Trades $\{(t_i, Y_i)\}$
1. $t_i$: trade arrival time
2. $Y_i = (S_{t_i}, V_{t_i})$
3. $S_{t_i}$: traded price
4. $V_{t_i}$: volume (number of shares) traded
Best Quotes $\{(q_j, K_j)\}$
1. $q_j$: best quote arrival time
2. $H_j = (B_{q_j}, A_{q_j})$: best quote
3. $D_j = (V^b_{q_j}, V^a_{q_j})$: depth
4. $K_j = (H_j, D_j)$
5. $B_{q_j}$: best bid at $q_j$ with volume $V^b_{q_j}$
6. $A_{q_j}$: best ask at $q_j$ with volume $V^a_{q_j}$
Trades and Best Quotes $\{(t_i, M_i)\}$
1. $M_i = (Y_i, K_i)$
2. $K_i = K_j$, $j = \max\{j : q_j \le t_i\}$
3. $K_j$: last quote at time $t_i$
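Matching each trade with the last best quote is a sorted search. The sketch below assumes that a quote with the same timestamp as the trade counts as prevailing (the $\le$ convention) and that quote times are sorted; the sample times are hypothetical.

```python
from bisect import bisect_right

def last_quote_index(quote_times, trade_time):
    """Index j = max{j : q_j <= trade_time}, or None if no quote exists yet."""
    j = bisect_right(quote_times, trade_time) - 1
    return j if j >= 0 else None

quote_times = [34200.0, 34205.0, 34212.0]             # sorted quote arrivals
trade_times = [34199.0, 34205.0, 34210.0, 34230.0]    # hypothetical trades
matched = [last_quote_index(quote_times, t) for t in trade_times]
# matched == [None, 1, 1, 2]
```

Because reported timestamps may be delayed, a practical implementation might shift the trade times by a few seconds before matching; that adjustment is not shown here.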
1. Engle and Lunde (2003): arrival times from trades and quotes as a bivariate point process
2. Engle and Patton (2003): model dynamics of quotes $H_j$ with a VAR
TAQ data
1 symbol: stock symbol
2 exchange: N for NYSE, T for NASDAQ
3 date: integer identifying the date, yyyymmdd
4 yyyy: year
5 mm: month
6 dd: day
7 time: cumulative seconds from 0:00 on that date
8 type: 1 quote / 2 price
9 initiator (price records only): 1 buy / -1 sell
10 price
11 size: volume associated with price (8) and initiator (7)
12 bid
13 bidsize: volume, in hundreds
14 offer
15 offersize: volume, in hundreds
16 cond: sale condition, i.e. conditions of trade: Bunched (B), Cash Sale (C)
17 corr: trade correction indicator: regular trade (0), correction (1)
18 tseq: sequential number for NYSE trades
19 g127: indicator of "G", rule 127 and stopped stock trade
20 mode: quote condition
21 qseq: sequential number for NYSE quotes
22 mmid: identifies market maker (broker) for NASDAQ quotes
23 caseid: id number
24 lagquotes: time between two consecutive quotes
25 lagtrades: time between two consecutive trades
26 bidaskspread: difference between bid and ask
27 pdirsplit: split trade indicator (-1 from buyer, 0 unknown and 1 from seller)
Table 1: TAQ data description; see the TAQ documentation for fields 16 to 22
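The date and time fields of Table 1 can be decoded with integer arithmetic. This is a small sketch that assumes the time field counts seconds from midnight; the helper names and sample values are hypothetical.

```python
def seconds_to_clock(seconds):
    """Convert cumulative seconds since midnight (assumed) to HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def date_parts(yyyymmdd):
    """Split the integer date field into (year, month, day)."""
    year, rest = divmod(yyyymmdd, 10000)
    month, day = divmod(rest, 100)
    return year, month, day

opening = seconds_to_clock(34200)   # "09:30:00"
closing = seconds_to_clock(57600)   # "16:00:00"
parts = date_parts(20011129)        # (2001, 11, 29)
```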
Local Parametric Modeling: Trades
Estimate $\beta_{t_i}$, $\sigma^2_{t_i}$ and $\lambda_{t_i}$ using adaptive weights smoothing (AWS), Polzehl and Spokoiny (2000, 2004)
1. $\tau_i \sim \mathrm{Exp}(\beta_{t_i})$
2. $R_i = \frac{1}{\sqrt{\tau_i}} \log \frac{S_{t_i}}{S_{t_{i-1}}} \sim N(0, \sigma^2_{t_i})$
3. $V_{t_i} \sim \mathrm{Pois}(\lambda_{t_i})$
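Under a locally constant parameter, the maximum likelihood estimates for the three models are local averages: the mean of $\tau_i$ for the exponential model (assuming $\beta$ parameterizes the mean waiting time), the mean of $R_i^2$ for the zero-mean Gaussian model, and the mean of $V_{t_i}$ for the Poisson model. The sketch below uses a fixed one-sided window in place of AWS, which chooses its weights adaptively; it is meant only to show what is being estimated, not to reproduce the AWS procedure. The window length and sample values are hypothetical.

```python
def local_mean(values, window, transform=lambda x: x):
    """One-sided moving average of transform(values) over the last `window` points."""
    estimates = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        estimates.append(sum(transform(x) for x in chunk) / len(chunk))
    return estimates

# Hypothetical durations, scaled returns and volumes for four trades.
tau = [4.0, 9.0, 1.0, 2.0]
scaled_returns = [0.002, -0.001, 0.003, 0.0]
volume = [100, 300, 200, 400]

beta_hat = local_mean(tau, window=2)                            # exponential mean
sigma2_hat = local_mean(scaled_returns, window=2,
                        transform=lambda r: r * r)              # Gaussian variance
lambda_hat = local_mean(volume, window=2)                       # Poisson rate
```

A fixed window trades bias against variance uniformly over the day; adaptive procedures such as AWS instead let the effective window shrink near structural changes.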
Figure 8: Prices $S_{t_i}$ (top) and estimated trend (bottom), from 2.1.21 (left) to 5.1.21 (right), IBM - NYSE. Estimation: AWS, asymmetric with default parameters and hmax = 25
Figure 9: Scaled log returns $r_{t_i}$ (top) and estimated volatility (bottom), from 2.1.21 (left) to 5.1.21 (right), IBM - NYSE. Estimation: AWS, asymmetric with default parameters and hmax = 25
Figure 10: Interval between transactions $\tau_i$ (top) and estimated waiting time (bottom), from 2.1.21 (left) to 5.1.21 (right), IBM - NYSE. Estimation: AWS, asymmetric with default parameters and hmax = 25
Figure 11: Transaction volume $V_{t_i}$ (top) and estimated transaction intensity (bottom), from 2.1.21 (left) to 5.1.21 (right), IBM - NYSE. Estimation: AWS, asymmetric with default parameters and hmax = 25
Market States
Figure 12: Kernel density estimators based on $\{\mu^2_{t_i}\}$, $\{\sigma^2_{t_i}\}$, $\{\beta^2_{t_i}\}$ and $\{\lambda_{t_i}\}$. For trend, volatility, waiting time and intensity, clockwise
Buy and Sell Transactions
1. split $\{(t_i, M_i)\}$ into two subsamples containing buyer- and seller-initiated transactions
2. $I_i$ is the buy indicator, $I_i \sim \mathrm{Ber}(\theta_{t_i})$
3. bid-ask spread, market depths, probability of buy/sell
4. try to identify market states (buy/sell)
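The split into buyer- and seller-initiated trades and a simple estimate of the Bernoulli parameter can be sketched as follows. The sketch assumes the initiator coding of Table 1 (1 buy, -1 sell), a hypothetical dictionary layout for trade records, and a $\theta$ held constant over the sample, whereas the slides let $\theta_{t_i}$ vary over time.

```python
def split_and_estimate(trades):
    """Split trades by initiator and estimate theta as the share of buys."""
    buys = [t for t in trades if t["initiator"] == 1]
    sells = [t for t in trades if t["initiator"] == -1]
    classified = len(buys) + len(sells)
    # Trades with any other initiator value are left out of both subsamples.
    theta_hat = len(buys) / classified if classified else None
    return buys, sells, theta_hat

# Hypothetical trade records.
trades = [
    {"time": 34201.0, "price": 98.05, "initiator": 1},
    {"time": 34204.0, "price": 98.00, "initiator": -1},
    {"time": 34209.0, "price": 98.05, "initiator": 1},
    {"time": 34215.0, "price": 98.10, "initiator": 1},
]
buys, sells, theta_hat = split_and_estimate(trades)
```

Applying the same function to the trades of each day, or of each local window, gives a time-varying estimate of the buy probability.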
Figure 13: Black: buy, ask. Red: sell, bid. IBM - NYSE, 26.9.2001. Panels: Volatility, Waiting time, Volume, Market Depth, Bernoulli parameter (buy 1), Bid Ask Spread
Figure 14: Black: buy, ask. Red: sell, bid. IBM - NYSE, 3.1.21. Panels: Volatility, Waiting time, Volume, Market Depth, Bernoulli parameter (buy 1), Bid Ask Spread
Figure 15: Black: buy, ask. Red: sell, bid. IBM - NYSE, 29.11.2001. Panels: Volatility, Waiting time, Volume, Market Depth, Bernoulli parameter (buy 1), Bid Ask Spread
Figure 16: Kernel density estimators based on $\{\sigma^2_{t_i}\}$ and $\{\beta_{t_i}\}$ for the year 2001. For standard deviation (top) and waiting time, buy (black) and sell (red) transactions. $\theta$ estimated daily, IBM - NYSE
References
R. Engle and J. Russell. Autoregressive Conditional Duration: a New Model for Irregularly Spaced Transaction Data. Econometrica, 1998, Vol. 66, 1127-1162.
R. Engle. The Econometrics of Ultra-High Frequency Data. Econometrica, 2000, Vol. 68, 1-22.
R. Engle and J. Russell. A Discrete-State Continuous-Time Model of Financial Transactions Prices and Times: the Autoregressive Conditional Multinomial-Autoregressive Conditional Duration Model. Journal of Business and Economic Statistics, 2005, Vol. 23, 166-179.
R. Frey and W. Runggaldier. A Nonlinear Filtering Approach to Volatility Estimation with a View Towards High Frequency Data. International Journal of Theoretical and Applied Finance, 2001, Vol. 4, 199-210.
R. Frey. Risk Minimization with Incomplete Information in a Model for High-Frequency Data. Mathematical Finance, 2000, Vol. 10, 215-225.
J. Cvitanic, R. Liptser and B. Rozovskii. A Filtering Approach to Tracking Volatility from Prices Observed at Random Times. The Annals of Applied Probability, 2006, Vol. 16, 1633-1652.