On the Forecasting of Realized Volatility and Covariance - A multivariate analysis on high-frequency data 1

Daniel Djupsjöbacka
Market Maker / Researcher
daniel.djupsjobacka@er-grp.com

Ronnie Söderman, PhD
Head of Market Making Research
ronnie.soderman@er-grp.com

Estlander & Rönnlund Financial Products Ltd
Nedre Torget 1A, 65100 Vasa, Finland
06-3180600

Abstract

The aim of this study is to forecast one-day volatility and covariance for a number of German equities. We use three different approaches: a Newey-West corrected Vector Auto Regression, a Vector Error Correction model and a multivariate GARCH model. The observed volatility is estimated from tick data, where short-term squared returns are summed to obtain a one-day realized volatility. [Results to be included]

1 Please note that this is merely a first draft and by no means a complete article.

1 Introduction

For the overall management of, and decision making concerning, individual financial assets and portfolios, the joint distributional characteristics of asset returns are important, and their second-moment structure is arguably of primary concern in this context. Consequently, the finance literature has in recent years been flooded with articles and models dealing with the attributes of volatility. However, volatility is unobservable and can only be examined by fitting econometric models such as GARCH-type models, by calculating historical volatility, or by estimating implied volatilities through option pricing models. As noted in Andersen et al. (2000), the coexistence of competing parametric volatility models (for example GARCH vs. stochastic volatility) and the presence of volatility smiles, smirks and skews in volatilities implied by the Black-Scholes (B-S) model suggest model mis-specification. Further, as documented in Andersen & Bollerslev (1998), the variance of the noise is high relative to the signal when ex post volatility is measured through squared returns. To remedy these problems, Andersen et al. (2000, 2001), following earlier work by for example French et al. (1987), Hsieh (1991) and Schwert (1989, 1990), construct a new measure of volatility, denoted realized volatility, by summing intra-day squared returns. Intuitively, a correct measure of volatility also enables a more efficient estimation of the covariance between asset volatilities. This implication is important for risk management applications such as Value-at-Risk approaches, or when setting up trading strategies based on individual asset volatilities and on predicted changes in these.

However, there are also a few drawbacks when using high-frequency data. It is well known that non-synchronous trading may induce negative autocorrelation in return series.
These problems should be minor when examining longer horizons but may be material when employing high-frequency data.

Another aspect that has a number of important implications for financial economics and risk management is the predictability of volatility. It is well documented that although asset returns may not be predictable, return volatility is in fact predictable. As noted in Andersen et al. (2001), it is evident that

models aimed at forecasting daily variances do not accommodate intra-day information and that intra-day models generally fail to capture longer-horizon movements, with the result that it is still standard practice to base forecasts of daily volatility on daily return observations. However, an option market maker commonly hedges positions on an intra-day basis. It is therefore important to measure intra-day volatility on a daily basis. Andersen et al. (2001) also find that models built directly for the realized volatility produce forecasts superior to those obtained from more indirect methods. Hence, the purpose of this article is to forecast one-day realized volatility and covariance for a number of German equities by using three different approaches: a Newey-West corrected Vector Auto Regression, a Vector Error Correction model and a multivariate GARCH model. These approaches are discussed in further detail below.

The rest of the paper is structured as follows. The methodology is reviewed in Chapter 2, the data in Chapter 3 and the empirical findings in Chapter 4. Finally, the study is summarized in Chapter 5.

2 Methodology and Model Specification

This chapter discusses the estimation of realized volatility and the three different approaches to forecasting it in more detail.

2.1 Estimating realized volatility

In order to forecast volatility, a suitable estimate of this parameter is needed. As mentioned above, this study employs realized volatility, which is estimated by summing short-term squared returns. As outlined in Andersen et al. (2000), we assume a well-defined price vector P(t) evolving over the time interval [0, T], where 0 ≤ t ≤ T (see endnote 1). The return process does not allow for arbitrage and has a finite instantaneous mean. The n-dimensional logarithmic price process p(t) may then be decomposed into three different components, so that

p(t) = p(0) + A(t) + M(t),    (1)

where p(0) is the logarithmic price at time 0, A(t) is a predictable component of finite variation and M(t) is a local martingale. The price process above is general and allows for all specifications used in standard asset pricing theory; it does not require a Markov property and it includes geometric Brownian motion as well as jump and mixed jump processes. The continuously compounded return over the period [t - h, t] is denoted by

r(t, h) = p(t) - p(t - h).    (2)

The cumulative return process is then given by

r(t) = r(t, t) = p(t) - p(0) = A(t) + M(t).    (3)

The return process may therefore be decomposed into the mean component A(t) and the local martingale M(t). Since the return process is a semi-martingale, it has an associated quadratic variation process. Now, when letting h go to zero, it can be shown that the quadratic variation process measures the realized sample-path variation of the squared return processes. Measures obtained from high-frequency data are thus referred to as realized volatility, and by using high-frequency samples the measure should be independent of any model that determines the price process.

The sampling frequency utilized in this paper is on a tick-data basis, which yields the desired volatility and covariance estimates directly from the price process as stipulated above. This means that for observations recorded m times per day, for the period t = h, 2h, ..., T, we get

\[ \mathrm{var}_{k,h}(t; m) = \sum_{i=1,\ldots,mh} r_{k,(m)}^{2}\bigl(t - h + (i/m)\bigr), \tag{4a} \]

\[ \mathrm{cov}_{kj,h}(t; m) = \sum_{i=1,\ldots,mh} r_{k,(m)}\bigl(t - h + (i/m)\bigr) \, r_{j,(m)}\bigl(t - h + (i/m)\bigr), \tag{4b} \]

where, in the notation of (2), r_{k,(m)}(t) = p_k(t) - p_k(t - 1/m) denotes the return on asset k over an interval of length 1/m. In line with the notation of Andersen et al. (2000), we call the observed measures in (4) the time-t realized h-period volatility and covariance, respectively. As the distribution of realized volatility is clearly right-skewed, whereas the distribution of logarithmic realized volatility is expected to be approximately Gaussian, this study uses logarithmic volatilities.
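As an illustration of (4a) and (4b), the following Python sketch computes a realized variance, a realized covariance and a logarithmic realized volatility from intraday log prices. It is not the implementation used in this study. The function names and the example prices are hypothetical, the two return series are assumed to be aligned on a common intraday grid, and the logarithmic volatility is assumed to mean the logarithm of the square root of the realized variance.

```python
import math


def intraday_returns(log_prices):
    """Log returns between consecutive intraday observations, cf. equation (2)."""
    return [later - earlier for earlier, later in zip(log_prices, log_prices[1:])]


def realized_variance(returns):
    """Sum of squared intraday returns over the period, cf. equation (4a)."""
    return sum(r * r for r in returns)


def realized_covariance(returns_k, returns_j):
    """Sum of cross products of aligned intraday returns, cf. equation (4b)."""
    if len(returns_k) != len(returns_j):
        raise ValueError("return series must be sampled on the same grid")
    return sum(rk * rj for rk, rj in zip(returns_k, returns_j))


def log_realized_volatility(returns):
    """Log of the realized standard deviation (assumed definition, see lead-in)."""
    return 0.5 * math.log(realized_variance(returns))


# Hypothetical intraday prices for two assets on a common grid (not study data).
prices_k = [100.0, 100.5, 100.2, 100.8, 100.6]
prices_j = [50.0, 50.1, 50.0, 50.3, 50.2]

returns_k = intraday_returns([math.log(p) for p in prices_k])
returns_j = intraday_returns([math.log(p) for p in prices_j])

print("realized variance (k):   ", realized_variance(returns_k))
print("realized covariance (kj):", realized_covariance(returns_k, returns_j))
print("log realized volatility: ", log_realized_volatility(returns_k))
```

Applied to one trading day of returns per asset, the first two functions give the one-day measures that serve as inputs to the forecasting models. The length check in the covariance function reflects the requirement that the cross products in (4b) pair returns over identical intervals.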

2.2 Forecasting volatility

In order to forecast realized volatility, we will use three different approaches: a Newey-West corrected Vector Auto Regression (VAR), a Vector Error Correction (VEC) model and a multivariate GARCH model. [The specifications of the models are documented in the final version] Please note that if we find no co-integration between the volatilities, the VEC will be left out of the final report.

3 Data and Related Implications

As noted in Andersen et al. (1999), precise estimation of diffusion volatility does not require a long calendar span of data; rather, volatility can be estimated arbitrarily well from shorter time periods provided that returns are sampled sufficiently frequently. Here, the data is sampled on a tick basis for every asset included in the study. We have chosen the German market as our source of data, as it is a highly liquid and electronically traded market. The examined assets are: Adidas (ADS), Allianz (ALV), BASF (BAS), BMW (BMW), Bayer (BAY), Commerzbank (CBK), DaimlerChrysler (DCX), Deutsche Bank (DBK), Deutsche Telekom (DTE), E.ON (EOA), HypoVereinsbank (HVM), Lufthansa (LHA), Muenchener Rueckversicherung (MUV), SAP (SAP), Siemens (SIE), Thyssen-Krupp (TKA), and Volkswagen (VOW). Henceforth, the abbreviations shown within brackets will be used when referring to these instruments. The examined time period is 4.1.1999 to 29.12.2000, which yields a total of 505 trading days.

At this stage, there are a few micro-structural aspects that need to be pointed out. For instance, the less liquid stocks may suffer from wide bid/ask spreads, which distort the data through bid/ask bounce and similar frictions. It is well known that non-synchronous trading may induce negative autocorrelation in return series. These problems should be minor when examining longer horizons but may be material when employing high-frequency data. To avoid these problems

to a certain extent when analysing the less liquid stocks, the tick data is transformed into 5-minute average prices for all stocks, and this second set of observations is used for the estimations of the study.

Since a specification of the covariance structure between all seventeen examined instruments would involve a very large number of parameters and therefore leave few degrees of freedom, we will have to group the instruments. Basically, there are two different approaches that can be used for this purpose. First, the instruments can be divided into groups on the basis of business segmentation. Second, the instruments can be divided by using a statistical tool such as cluster analysis or principal component analysis. The second approach is somewhat dubious, though, since we might easily end up with a good in-sample fit but weak forecasting power. Which methodology will be used is yet to be decided.

Figure 1. A graphical overview of the one-day realized volatilities of the included instruments over the examined time period.

4 Empirical Results

The preliminary descriptive statistics for the data are presented in Table 1.

Table 1 Descriptive statistics

The descriptive statistics for the examined one-day realized volatilities of the instruments included in the study.

                     ADS     ALV     BAS     BMW     BAY     CBK     DCX     DBK     DTE
# obs                505     505     505     505     505     505     505     505     505
mean               0.414   0.310   0.337   0.464   0.327   0.337   0.281   0.306   0.446
std                0.153   0.138   0.096   0.233   0.089   0.136   0.097   0.152   0.156
min                0.145   0.146   0.123   0.174   0.139   0.118   0.117   0.126   0.045
lower quartile     0.311   0.240   0.273   0.348   0.268   0.254   0.215   0.223   0.341
median             0.390   0.281   0.324   0.424   0.313   0.311   0.260   0.269   0.428
upper quartile     0.477   0.348   0.379   0.522   0.365   0.387   0.325   0.348   0.532
max                1.433   1.787   0.801   3.937   0.826   1.874   1.122   1.555   1.116

                     EOA     HVM     LHA     MUV     SAP     SIE     TKA     VOW
# obs                505     505     505     505     505     505     505     505
mean               0.356   0.411   0.411   0.408   0.469   0.366   0.453   0.336
std                0.137   0.131   0.116   0.151   0.243   0.239   0.137   0.126
min                0.154   0.175   0.162   0.179   0.190   0.145   0.189   0.146
lower quartile     0.279   0.319   0.321   0.308   0.329   0.254   0.354   0.259
median             0.334   0.386   0.403   0.378   0.422   0.330   0.427   0.312
upper quartile     0.404   0.479   0.483   0.477   0.531   0.425   0.519   0.387
max                2.191   0.936   0.888   1.736   3.393   4.594   1.319   1.853

[Main results to be included]

5 Summary

The purpose of this article was to forecast one-day realized volatility and covariance for a number of German equities by using three different approaches: a Newey-West corrected Vector Auto Regression, a Vector Error Correction model and a multivariate GARCH model. The observed volatility is estimated from tick data, where short-term squared returns are summed to obtain a one-day realized volatility. Results show [To be continued]

Endnotes:

1 We assume that the price process is defined on a complete probability space and evolves in continuous time. All prices in the time interval [0, T] are known at time T. The process is nice.

References

Andersen, T. G. & Bollerslev, T. (1998): Answering the Sceptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts, International Economic Review, 39, pp. 885-905

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (1999): Realized Volatility and Correlation, Working Paper, University of Pennsylvania

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2000): The Distribution of Realized Exchange Rate Volatility, Working Paper, University of Pennsylvania

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001): Modeling and Forecasting Realized Volatility, Working Paper, University of Pennsylvania

Black, F. & Scholes, M. (1973): The Pricing of Options and Corporate Liabilities, Journal of Political Economy, 81, pp. 637-659

Bollerslev, T., Engle, R. F. & Nelson, D. B. (1994): ARCH Models, in R. F. Engle and D. McFadden (eds.) Handbook of Econometrics, Volume IV, pp. 2959-3038. Amsterdam: North-Holland

French, K. R., Schwert, G. W. & Stambaugh, R. F. (1987): Expected Stock Returns and Volatility, Journal of Financial Economics, 19, pp. 3-29

Hsieh, D. A. (1991): Chaos and Non-linear Dynamics: Applications to Financial Markets, Journal of Finance, 46, pp. 1839-1877

Schwert, G. W. (1989): Why Does Stock Market Volatility Change Over Time?, Journal of Finance, 44, pp. 1115-1153

Schwert, G. W. (1990): Stock Market Volatility, Financial Analysts Journal, May-June, pp. 23-34