On Optimal Sample-Frequency and Model-Averaging Selection when Predicting Realized Volatility


On Optimal Sample-Frequency and Model-Averaging Selection when Predicting Realized Volatility

Joakim Gartmark*

Abstract: Predicting the volatility of financial assets based on realized volatility has grown popular in the literature due to its strong predictive power. Theoretically, realized volatility has the advantage of being free from measurement error, since it accounts for the intraday variation that occurs at high frequencies in financial assets. In practice, however, as the sample frequency increases, market microstructure noise may be absorbed and lead to inaccurate predictions. Furthermore, predicting realized volatility with single models causes predictions to suffer from model uncertainty, which may understate the risk in the forecasting process and result in poor predictions. Based on these issues, this paper investigates which sample frequency minimizes forecast error (1-, 5- or 10-min) and which model-averaging approach should be used to deal with model uncertainty (mean forecast combinations, Bayesian model-averaging or dynamic model-averaging). The results suggest that a 1-min sample frequency minimizes forecast errors, and that Bayesian model-averaging performs better than dynamic model-averaging on the 1-day and 1-week horizons, while dynamic model-averaging performs slightly better on the 2-week horizon.

Keywords: Realized Volatility, Market Microstructure Noise, Sample-Frequency, Model Uncertainty, Bayesian Model-Averaging, Dynamic Model-Averaging, Forecasting

Department of Economics. Master Thesis, 30 credits, Economics. Master of Science, Economics, 120 credits. Spring Term 2017. Supervisor: Annika Alexius.

*I would like to send my sincerest gratitude to Björn Hagströmer at Stockholm Business School, who helped with the data used in this study.

Table of Contents
1. Introduction
2. Theoretical Background
2.1 Portfolio Optimization
2.2 The Process of the Stock Price
2.3 Measuring the Volatility of the Stock Price
2.4 Market Microstructure Noise
2.5 Model Uncertainty
3. Previous Research
3.1 Realized Volatility
3.2 Sample-Frequency and Market Microstructure Noise
3.3 Model-Averaging
4. Econometric Methodology
4.1 The Forecasting Process
4.2 Realized Volatility
4.3 HAR-Models
4.4 GARCH-Models
4.5 Loss Functions
4.6 Model-Averaging
5. Data
5.1 Daily Realized Volatility
5.2 Weekly Realized Volatility
5.3 Two Weeks Realized Volatility
6. Results
6.1 Optimal Frequency
6.2 Model-Averaging
7. Conclusions
Bibliography
Appendix

1. Introduction

The failure to assess and anticipate financial risk on the credit markets was one of the major causes of the financial crisis in 2008. In order to avoid a crisis of similar magnitude, it is vital to ensure that financial volatility is modeled and predicted efficiently. Economists are interested in predicting financial volatility for several reasons. First, expected future volatility determines how an investor should balance a portfolio of risky and risk-free assets in order to minimize risk subject to expected return. Second, it helps policy makers anticipate potential financial crises and counteract them by adjusting their policies accordingly. Third, it is an important determinant in asset pricing, in the sense that larger risk should be compensated by larger returns. Fourth, it has a huge influence on the pricing of financial derivatives and helps investors hedge risks on the market.

Historically, financial volatility has been measured and forecasted either through daily standard deviation based on daily returns or through parametric models such as GARCH and stochastic volatility models. Even though daily standard deviation has the advantage of being observed, it still absorbs a lot of noise, since the measure consists of only one observation per trading day. Furthermore, using GARCH or stochastic volatility models requires assumptions regarding the volatility's distribution, while the volatility itself is never actually observed. Thus, Andersen et al. (2001b) proposed a new way to measure risk, referred to as realized volatility, in which high-frequency data consisting of, for example, 1-min, 5-min or 10-min prices of the financial assets are used. By transforming the intraday prices into intraday returns, the realized volatility of each day is calculated by taking the sum of all squared intraday returns in one trading day. Compared to daily standard deviation and parametric models, realized volatility has the advantage of providing a model-free measurement that captures more of the volatility occurring during one trading day. Furthermore, empirical evidence shows that realized volatility significantly improves forecast and portfolio performance (Andersen et al., 2003; Fleming et al., 2003).

However, modeling realized volatility based on intraday returns has a potential drawback: it might absorb more market microstructure noise and as a result bias the estimator. Market microstructure noise is all the variation in the stock price, when observed at high frequencies, that is not related to the true volatility. Examples of such variation are bid-ask bounce effects, rounding errors due to price discreteness and recording errors. In theory, the higher the frequency used to model volatility, the more market microstructure noise might also be absorbed.

At the same time, as more observations are included, the efficiency of the estimators improves. Previous research has concluded that selecting the frequency that minimizes forecast error has economic value in terms of portfolio performance. However, a single optimal frequency that consistently minimizes forecast errors has not yet been established (Bandi & Russell, 2006; Potter et al., 2008).

A second issue that has been investigated in the context of predicting financial volatility is how to deal with model uncertainty. Model uncertainty means that a single model might yield poor predictions during certain periods even though it performs well on average. In order to deal with this issue, previous research has combined several models to forecast future volatility, from here on referred to as model-averaging. Generally, there are three different approaches to model-averaging realized volatility: mean forecast combination, Bayesian model-averaging and dynamic model-averaging. Mean forecast combination weighs all models equally regardless of each model's forecast performance and has been advocated for its simplicity and because it performs at least as well as approaches based on forecast performance (Smith & Wallis, 2009). Bayesian model-averaging selects the weight of each model based on the average past forecast performance of each model, while dynamic model-averaging selects the weight of each model based only on the last observed forecast performance. Thus, if models' forecast performance for realized volatility is time-varying, dynamic model-averaging should outperform Bayesian model-averaging. Previous research has concluded that, regardless of approach, forecasting realized volatility based on model-averaging adds economic value in terms of portfolio and forecast performance (Wang, Ma, Wei, & Wu, 2016). However, despite previous efforts, research is still divided regarding which model-averaging approach yields the best forecasts of realized volatility (Wang & Nishiyama, 2015; Liu et al., 2017).

Thus, based on the mentioned gaps in the literature, the purpose of this paper is to identify which sample-frequency (1) and model-averaging approach (2) minimize the forecast error of realized volatility. (1) denotes this paper's first purpose, in which 1-, 5- and 10-min frequencies are examined. (2) denotes this paper's second purpose, in which mean forecast combination, Bayesian and dynamic model-averaging are examined. To the best of the author's knowledge, in the context of forecasting realized volatility, previous research has neither evaluated the optimal frequency based on the forecast performance of several models nor investigated differences in forecast

performance when the model-averaging process is restricted to include only the historically best-performing models. Thus, the findings of this paper are the first of their kind and provide new insight regarding the optimal sample-frequency and differences in forecast performance between mean forecast combination, Bayesian and dynamic model-averaging when forecasting realized volatility.

The data is based on OMXS30, an index representing the most traded stocks on the Nasdaq Stockholm stock exchange. The Swedish stock exchange has been selected with respect to its maturity and well-functioning capital markets. Since previous research has mostly been conducted on the U.S. stock exchanges, this study not only contributes by filling gaps in the existing literature, but also complements previous findings by providing results based on another dataset. The data covers the period between 2007 and 2012, a period characterized by high levels of volatility on the financial markets.

This paper will use some of the newly proposed ways to model and forecast realized volatility in order to keep the results as up-to-date as possible. One of the models that has arisen in the spirit of realized measures is the heterogeneous autoregressive (HAR) model, originally suggested by Corsi (2009). Due to its simplicity and strong forecast performance, the HAR model has been frequently used to predict future volatility. The HAR model follows a simple autoregressive structure, in which the last day's, week's and month's volatility are used to predict future volatility. Based on this model, new models belonging to this family have been developed, in which jumps and leverage effects are included. Other authors have focused on the impact of volatility on realized volatility by applying different models from the generalized autoregressive conditional heteroskedasticity (GARCH) family and found strong evidence that this improves forecast performance (Barndorff-Nielsen & Shephard, 2005; Corsi et al., 2008). Thus, based on 1-min, 5-min and 10-min data, single and combined models of the HAR and GARCH families have been applied to forecast realized volatility 1-day, 1-week and 2-weeks ahead.

The remainder of this paper is divided into six sections. Section 2 gives the theoretical background related to this paper's purpose. Section 3 highlights what previous research has found within this area. Section 4 explains the econometric methodology used to model realized volatility. Section 5 presents the data used in this investigation in more detail. Section 6 presents the results in text and tables. Finally, section 7 gives conclusions and proposals for future research.

2. Theoretical Background

This paper aims to identify which frequency and model-averaging approach minimize the forecast error of realized volatility. This section is divided into five subsections, which explain the theoretical background related to this paper's purpose. The first subsection explains how forecasting volatility accurately adds economic value in terms of portfolio performance. It is not directly related to the purpose of this paper, but serves as an eye-opener to why it is important to care about predicting realized volatility. The second subsection describes the nature of a stock's return, which in essence is what determines the volatility of the stock, as further explained in the third subsection. The fourth and fifth subsections concern the parts directly related to this paper's purpose, in which the theory behind market microstructure noise and model uncertainty is explained, respectively.

2.1 Portfolio Optimization

Previous research has established that predicting volatility accurately adds economic value in terms of portfolio selection, in the sense that it helps investors make better investment decisions.1 The framework in this section is not applied in this paper, but is described in order to establish why predicting realized volatility is important according to the existing literature.

In portfolio theory it is generally known that an investor selects the portfolio in which return is maximized subject to risk. To do this, the investor estimates the future risk on the market in order to ensure that the portfolio is weighted according to her risk preferences. Fleming et al. (2001) suggest a mean-variance framework that considers an investor with a short time horizon (1-day, 1-week or 1-month) who aims to minimize variance subject to a certain level of expected return. Furthermore, the framework assumes a constant expected return based on, among others, Merton (1980), who showed that variation in expected returns is hard to detect over short horizons. Thus, the only thing changing over the short horizon is the predicted variance, and consequently the investor follows a portfolio strategy based on volatility timing. To illustrate this approach, consider $n$ risky assets, where $r_t$ denotes the vector of asset returns, the vector of expected returns is given by $\mu = E[r_t]$ and the variance-covariance matrix by $\Sigma_t$. The investor aims to minimize the risk of the portfolio subject to an expected target return, $\mu_p$, and selects the weight, $w_t$, of each asset accordingly.

Footnote 1: See for instance Fleming, Kirby & Ostdiek (2003). A more thorough discussion of previous findings on how predicting volatility adds economic value is given in section 3.

This optimization problem is illustrated below:

$\min_{w_t} \; w_t' \Sigma_t w_t$   (1.1)

subject to

$w_t' \mu + (1 - w_t' \iota)\, r_f = \mu_p$,   (1.2)

where $r_f$ denotes the expected return of the risk-free asset and $\iota$ is a vector of ones. Solving this problem for $w_t$ then yields:

$w_t = \dfrac{(\mu_p - r_f)\, \Sigma_t^{-1} (\mu - r_f \iota)}{(\mu - r_f \iota)' \, \Sigma_t^{-1} (\mu - r_f \iota)}$   (1.3)

Now, since $\mu$ is assumed to be constant, the weight of each asset is determined by the predicted values of the variance-covariance matrix, shown below:

$\Sigma_t = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix}$   (1.4)

where $\sigma_i^2$ denotes the predicted variance of risky asset $i$ and $\sigma_{ij}$ denotes the covariance between risky assets $i$ and $j$. The investor predicts the variance and covariance of each risky asset, shown in matrix (1.4), and then weighs each asset according to equation (1.3). Fleming et al. (2003) substitute the standard variance and covariance measures with realized variance and covariance and find improvements in portfolio performance. Thus, realized measures based on intraday observations might help investors make more accurate investment decisions, since forecasts of risk become more accurate. This paper focuses on realized volatility, from which both realized variance and covariance can be calculated (assuming constant correlation). The next two sections give a more detailed explanation of realized volatility and why, in theory, it is a good proxy for financial risk.
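To make the volatility-timing rule concrete, the following R sketch computes the weights in equation (1.3) for three hypothetical assets. All numerical inputs are made-up illustrations, not estimates from this paper:

# Minimal sketch of the volatility-timing weights in equation (1.3).
# All inputs are hypothetical illustrations, not estimates from the paper.
mu    <- c(0.08, 0.05, 0.06)   # expected returns of three risky assets
rf    <- 0.02                  # risk-free rate
mu_p  <- 0.06                  # target portfolio return
Sigma <- matrix(c(0.040, 0.006, 0.004,
                  0.006, 0.025, 0.005,
                  0.004, 0.005, 0.030), nrow = 3)  # predicted covariance matrix
iota   <- rep(1, 3)
excess <- mu - rf * iota
w <- (mu_p - rf) * solve(Sigma, excess) /
  drop(t(excess) %*% solve(Sigma, excess))  # equation (1.3)
w             # weights on the risky assets
1 - sum(w)    # remainder invested at the risk-free rate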

2.2 The Process of the Stock Price

Since volatility basically is variation in the stock price, it is important to understand what drives this variation. This section explains the process of the stock price and how it is modeled according to the existing literature.

There are many ways to describe the nature of the stock price, but common to most models is that volatility plays an important part. When the change of a variable over time is uncertain, it is common to consider the change in the variable to follow a stochastic process. This is the general underlying assumption in empirical asset pricing when investigating the nature of the stock price. Based on Hull (2012), it is generally assumed that the price process is a continuous-price and continuous-time stochastic process. A continuous-price stochastic process means that the stock price can take any value within a certain range, and a continuous-time stochastic process means that the change in the stock price might occur at any time. In reality, however, time and prices are generally observed discretely rather than continuously. Thus, these assumptions are usually relaxed when forecasting prices of financial assets (Corsi & Renò, 2012).

The most basic model assumes that changes in the log-price of a stock, $p(t)$, follow a standard Brownian motion that includes the stock's expected return, $\mu(t)$, in period $t$ and its volatility, $\sigma(t)$, times the Wiener process, $W(t)$. Since this paper uses intraday returns to predict future volatility, it is important to add the market microstructure noise that arises at high sampling frequencies of intraday data, $\varepsilon(t)$. Market microstructure noise in high-frequency data can be such things as typographical errors or delayed quotes, to mention some.2 Also, unpredicted announcement effects causing volatility to change drastically are important to consider, generally referred to as jumps, $J(t)$. In order to account for jumps and market microstructure noise, the following equation is used to explain variation in stock prices:3

$dp(t) = \mu(t)\,dt + \sigma(t)\,dW(t) + d\varepsilon(t) + dJ(t)$,   (2.1)

where $\mu(t)$ is the expected rate of return, $\sigma(t)$ is the volatility process of the stock price (whose integral over the day gives the integrated volatility), $W(t)$ is the Wiener process, $\varepsilon(t)$ is the mean-zero random noise, independent of the Wiener process, arising due to market microstructure noise, and $J(t)$ reflects the stochastic jump process.

2.3 Measuring the Volatility of the Stock Price

Since this paper forecasts realized volatility rather than daily standard deviation, it is important to understand why research has moved towards this measurement. This section explains the concept and nature of volatility, the measurement error arising when daily standard deviation is used and how realized volatility deals with this issue.

Footnote 2: A more detailed explanation of market microstructure noise is given in section 2.4.
Footnote 3: The equation is a combination of Zhou (1996) and Barndorff-Nielsen & Shephard (2004a) to illustrate the impact of jumps and microstructure noise separately.

According to Hull (2012), stock volatility can be thought of as a measure of uncertainty about the returns provided by the stock. Generally, one can say that volatility reflects the variation of a stock's price. Variation occurs due to new information received by the market.4 Investors consider this news and re-evaluate the price, and as a result movements occur in the stock price. In general, larger expected volatility requires a larger expected return in order to compensate for the risk.

Daily standard deviation, based on the square root of the squared return's deviation from its past mean, has historically been used as a proxy for financial risk due to its simplicity. However, this measurement has been considered inefficient, since it potentially suffers from measurement error. This is because daily standard deviation is based on only one observation per trading day, usually the closing price, and as a result absorbs a lot of noise since it does not observe the intraday volatility (Andersen & Bollerslev, 1998). To demonstrate this issue, consider the integrated volatility in trading day $t$ as shown in equation (2.1). In order to model this variable as accurately as possible, one seeks to capture the cumulative quadratic variation over all small intraday periods $i$ in trading day $t$, in other words the intraday movements. The integrated volatility from equation (2.1) in trading day $t$ can then be expressed as the square root of the integral of all squared movements, such that:5

$IV_t = \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{i=1}^{M} J_{t,i}^2 + \sum_{i=1}^{M} \varepsilon_{t,i}}$,   (3.1)

where $\sigma^2(s)$ is the variance process, $J_{t,i}$ denotes jumps in intraday period $i$, $\varepsilon_{t,i}$ denotes the market microstructure noise in intraday period $i$ and $M$ is the total number of intraday movements in trading day $t$. Thus, if variation is large in trading day $t$, a daily standard deviation based on only one observation is theoretically a weak proxy of the actual variation occurring in that trading day. This is one of the main reasons why using intraday returns to predict volatility has grown popular in the volatility forecasting literature. Realized volatility requires intraday observations of the stock price, for example at a 1-min, 5-min or 10-min frequency, and can be expressed as follows:

$RV_t = \sqrt{\sum_{i=1}^{M} r_{t,i}^2}$,   (3.2)

where $M$ denotes the number of intraday observations in trading day $t$, which grows larger as the sample frequency increases, $i$ denotes each intraday observation and $r_{t,i}$ is the intraday log-return of the stock. Equation (3.2) is convenient in the sense that variation is based on intraday returns and as a result captures more of the actual variation occurring in trading day $t$.

Footnote 4: See chapter 17 of Elton, Gruber, Brown & Goetzmann (2013) for a discussion regarding efficient markets and the efficient market hypothesis (EMH).
Footnote 5: The integrated volatility equation is inspired by Corsi (2009) and Andersen, Bollerslev & Diebold (2007).
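As an illustration of equation (3.2), the following R sketch computes the realized volatility of a single trading day from a vector of intraday prices. The simulated price series is a stand-in for actual 1-min data:

# Minimal sketch of equation (3.2): realized volatility for one trading day.
# 'price' is a numeric vector of intraday prices (illustrative assumption).
realized_vol <- function(price) {
  r <- diff(log(price))  # intraday log-returns r_{t,i}
  sqrt(sum(r^2))         # RV_t: square root of summed squared returns
}

set.seed(1)
p <- 100 * exp(cumsum(rnorm(480, mean = 0, sd = 0.0005)))  # simulated 1-min prices
realized_vol(p)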

Previous papers have established that as the sample frequency in equation (3.2) increases, realized volatility converges to the quadratic variation process expressed in equation (3.1), such that:6

$RV_t \xrightarrow{\;p\;} \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{i=1}^{M} J_{t,i}^2} \quad \text{as } M \to \infty$   (3.3)

The equation shows that as the number of intraday observations, $M$, goes to infinity (i.e. the sampling interval goes very close to zero), realized volatility, $RV_t$, becomes an efficient and unbiased estimator of the integrated volatility in period $t$. As will be discussed further in section 3, recent research has moved towards measuring volatility in this way due to its efficiency and since, theoretically, it is free from measurement error. However, when using realized volatility as a proxy for volatility, another bias arises due to the increased proportion of absorbed market microstructure noise. This is further explained in section 2.4.

2.4 Market Microstructure Noise

This paper investigates two research questions, of which the first concerns the bias that arises at high frequencies due to market microstructure noise. This section explains how market microstructure noise is defined in this paper and the trade-off that occurs when increasing the sample frequency.

Black (1986) distinguishes between the meanings of noise in the contexts of finance, econometrics and macroeconomics. Further, he explains that the only thing noise has in common across these contexts is that it refers to something that the model observes but that is not related to the causality the model tries to explain. This paper concerns the market microstructure noise that arises in financial data, which Black (1986) describes as information concerning the movement of the stock price that is not actual information. Though this definition might seem a bit confusing, it makes sense in the context of realized volatility. When observing stock prices at high frequencies, there is an increased chance that a proportion of the observed prices suffer from errors such as bid-ask bounce effects, rounding errors due to price discreteness and recording errors. None of these errors are related to the true movement of the stock price, but since they are observed, they are still used as information to explain the movements. As a result, estimates are based on information that is not actual information, which produces noisy estimates and might lead to inaccurate predictions (Hansen & Lunde, 2006). Thus, in this paper market microstructure noise is defined as observed movements of the intraday stock price that have no explanatory power for the true volatility.

In theory, increasing the sample frequency implies an increased risk of observing more market microstructure noise, and as a result one might estimate the microstructure volatility rather than the realized volatility (Awartani et al., 2009). However, the impact of the market microstructure noise depends on the specifics of the market: if the market contains high proportions of market microstructure noise relative to true variation, then realized volatility should be measured at a lower frequency, since a high frequency would absorb too much noise (Andersen et al., 2011). If, instead, the proportion of market microstructure noise is small relative to the observed variation, a high frequency is preferred due to the increased efficiency of the estimates. Thus, in order to obtain reliable predictions, it is important to choose the right sample frequency.

Footnote 6: For a discussion of the convergence, see Andersen & Bollerslev (1998) and Barndorff-Nielsen & Shephard (2002a).
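A common diagnostic for this trade-off, standard in the literature though not part of this thesis's method, is the volatility signature plot: realized volatility is recomputed at coarser sampling intervals by keeping every k-th observation, and systematic differences across intervals indicate noise. A sketch, reusing the simulated prices p from the previous code example:

# Sketch of a volatility-signature comparison across sampling intervals.
# Sampling every k-th 1-min observation mimics a k-minute frequency.
rv_at_interval <- function(price, k) {
  r <- diff(log(price[seq(1, length(price), by = k)]))
  sqrt(sum(r^2))
}
sapply(c(1, 5, 10), function(k) rv_at_interval(p, k))  # RV at 1-, 5-, 10-min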

2.5 Model Uncertainty

The second purpose of this paper concerns how to deal with model uncertainty through different methods of model-averaging. For this reason it is important to understand the concept of model uncertainty and how model-averaging deals with this issue in the forecasting process. This section gives some theoretical background on model uncertainty and explains how model-averaging might deal with it.

Model uncertainty is a potential issue when predicting a variable's future values based on a single model. This can be illustrated with a simple example. Consider two different models used for predicting the future value of a variable. The basics of forecasting theory tell you to pick the model that produces the smallest forecast error. However, this approach ignores the uncertainty of the single model. An early paper that considers this issue points out two factors that arise with this approach (Bates & Granger, 1969):

1. Each model is based on information that the other model has not considered.
2. Each model interprets the relation between the independent and the dependent variable differently.

The second factor is not necessarily an issue, in the sense that if the estimated relation in one of the models is wrong, it is better to use only the correct model. However, the issue of model uncertainty in forecasting arises because the future performance of a single model is uncertain: since the model is selected based on its average performance, there may be periods when other models perform better. For this reason, model-averaging has been considered a cure for model uncertainty. Model-averaging combines several models in order to adjust for model uncertainty. In macroeconomics, model-averaging grew popular in the early 2000s for forecasting inflation and real output growth, while its application in the financial economics literature is still relatively rare; only in recent years has it been growing in popularity in this context as well (Rapach et al., 2010). Thus, this is a relevant subject for further investigation in order to facilitate the prediction of financial variables based on combined models.

3. Previous Research

This section is divided into three subsections, in which previous research regarding realized volatility, sample frequency and market microstructure noise, and model-averaging is highlighted. These are all related to the purpose of this paper, in the sense that realized volatility is being forecasted and the optimal sample-frequency and model-averaging approach are being investigated. Since most main findings in this research area have been based on U.S. data, this section presents those findings. However, it should be highlighted that this study is based on data from the Swedish stock exchange, which is less liquid than the U.S. stock exchanges but still very similar in terms of maturity.

3.1 Realized Volatility

Since realized volatility is being forecasted in this paper, it is vital to understand why this measure is important and what previous papers have found regarding it. This section gives insight into previous papers' findings regarding realized volatility and its empirical support for forecasting future volatility compared to more conventional approaches.

In one of the first papers investigating realized volatility, Andersen et al. (2001a) examined the distribution of realized exchange rate volatility based on a 5-min frequency during 1986-1996. Their paper found strong evidence that realized volatility provides a variable free from measurement error which might be more accurate than measures based on parametric estimates of the error term of daily returns. In their second paper the same year, the authors did a

similar investigation, based on 5-min data during 1993-1998, of stocks included in the Dow Jones Industrial Average (DJIA) (Andersen et al., 2001b). Also on equity data, the authors found that realized volatility provides a model-free measurement. Furthermore, the authors argued that realized volatility should be preferred to parametric models, in which volatility is never actually observed, since it allows observing actual volatility and is at least as accurate as parametric models. It is somewhat one-sided that several of the main findings in this field of study stem from the same authors over several years, but in their third paper concerning realized volatility, Andersen et al. (2003) compare the prediction performance of realized exchange rate volatility with that of daily exchange rate volatility based on GARCH components. Their findings suggest that basic AR models based on realized volatility outperform forecasts that observe only the daily exchange rate and predict volatility through parametric models.

In another paper, McMillan & Speight (2004) highlight the poor evidence that GARCH models provide better forecasts than a simple autoregressive model when predicting future volatility based on daily exchange rates or stock returns. The authors suggest that this is due to all the noise absorbed when only one daily observation is used. Using intraday 30-min data for 17 different exchange rates to predict realized volatility with different models from the GARCH family, the authors find strong evidence that GARCH models perform better than AR models when realized volatility is used. The authors then use this as an argument supporting the hypothesis that predicting volatility based on daily returns suffers from measurement error.

In another famous paper, Corsi (2009) proposed the Heterogeneous Autoregressive model of realized volatility (HAR-RV). The model is very straightforward and basically aggregates daily realized volatility into past-day, past-week and past-month components in an OLS regression. His results suggest that despite the model's simplicity it can still outperform the GARCH model, and as a result the HAR model, along with its extensions, has been used frequently ever since. As pointed out above, the empirical evidence for realized volatility is supportive, and research on this subject has become popular as well. This section has highlighted some of the most important findings in this area.

3.2 Sample-Frequency and Market Microstructure Noise

Despite previous efforts, an optimal sample frequency has not yet been established; establishing one is the objective of this paper's first purpose. However, as this section will discuss further, there is strong evidence that choosing the sample frequency that minimizes forecast error has significant economic

value. Thus, it is vital to investigate this area further in order to provide new insight into the sample frequency's impact on realized volatility forecasts.

As mentioned in section 2.4, a trade-off occurs when choosing the sample frequency. Picking a high sample frequency might reduce the stochastic error of the measurement, also referred to as increasing the efficiency of the estimates, but it might also introduce more market microstructure noise and as a result produce biased estimators (Corsi, 2009). Bandi & Russell (2006) use data on stocks included in the S&P 100 in an attempt to establish the optimal frequency. Their findings suggest that the optimal frequency for realized variance varies from 0.4 min to 13.8 min. The authors also conclude that there is significant economic gain from choosing the right frequency when applying different frequencies in the framework proposed by Fleming et al. (2003). However, a range between 0.4 and 13.8 minutes is not very useful when choosing between 1-min, 5-min and 10-min frequencies, especially if the performances of these frequencies differ significantly from each other.

Another paper concerning frequencies is Potter et al. (2008), who use a similar framework to investigate performance when predicting the realized covariance matrix of the S&P 100 at frequencies from 1 min to 130 min. Similar to Bandi & Russell (2006), Potter et al. (2008) conclude that choosing the right frequency has economic value. However, they also suggest that the optimal frequency ranges between 30 and 65 minutes, which is a significantly lower frequency than that proposed by Bandi & Russell (2006). The results of Potter et al. (2008) are, however, not as convincing as the paper might imply. First, their results also show strong performance at the 10-min frequency, but the authors do not consider this important since all their other results favor lower frequencies. Second, the authors use only one model to evaluate the performance of each frequency and do not consider the model uncertainty that might arise in this context. This issue also holds for Bandi & Russell (2006), who investigate performance depending on frequency based on different stocks, but only with one model. For these reasons, it is vital to statistically test several single models' performances at each frequency, and then test the performance of each model across frequencies, in order to be more statistically certain about the optimal frequency.

Despite previous attempts7 and the economic value it might add to portfolio optimization, research has not yet been able to establish a golden rule when

Footnote 7: See also Bandi & Russell (2008), McAleer & Medeiros (2008) and Shin & Hwang (2015) for further discussion of market microstructure noise when predicting realized volatility.

selecting the sample frequency. Instead, it has been common to use an ad hoc rule in which a sample frequency between 5 and 30 minutes is selected (Shin & Hwang, 2015). This paper aims to fill this gap by investigating the performance of several single models at higher frequencies (1-min, 5-min and 10-min) in an attempt to see whether there is a significant difference in forecast performance depending on frequency.

3.3 Model-Averaging

This paper's second purpose is to investigate differences between mean forecast combinations, Bayesian and dynamic model-averaging when forecasting realized volatility, with a larger focus on the two latter. This section gives insight into what previous research has established on this subject.

Model-averaging in the context of predicting realized volatility has grown popular in the last decade, and many papers still consider its economic value. As pointed out in section 2.5, model-averaging combines several models in order to deal with the model uncertainty that might arise when using one single model to predict the future value of a variable. Ignoring model uncertainty might lead to understatements of the risk in the forecasting process and as a result cause poor predictions (Hibon & Evgeniou, 2004). One of the first papers testing model-averaging to predict future realized variance was Liu & Maheu (2009), who used a Bayesian model-averaging (BMA) approach, which combines several models based on each model's average historical predictive power. The authors use 5-min S&P data from 1997 to 2004 and apply linear HAR and AR models to forecast 1-day, 1-week and 2-weeks ahead, finding that BMA significantly improves prediction performance compared to single models on all horizons. The authors give two reasons for these findings. First, no single model dominates in performance across markets and horizons. Second, giving models more weight during periods when they perform well decreases uncertainty and consequently provides more reliable predictions.

Another study predicts the volatility of stock indices on the Chinese and Japanese markets using three new models specifically designed for high-frequency data, and confirms that prediction performance improves when using BMA compared to single models (Wang & Nishiyama, 2015). However, Wang et al. (2016) argue that BMA considers neither structural breaks nor the fact that models' forecast performance is time-varying, and therefore suggest a dynamic model-averaging (DMA) approach. A DMA approach, as the name implies, is dynamic in the sense that the model selection is very flexible. Due to its flexibility it allows parameters to be

time-varying. Wang et al. (2016) use 5-min data from the S&P 500 index during 1996-2013 and apply eight different models, all derived from the HAR family. The results show that DMA on average performs better than single models but not significantly better than BMA. Furthermore, the authors run a portfolio exercise using a framework similar to the one presented in section 2.1 and find that both BMA and DMA improve portfolio performance. In another recent paper, Liu et al. (2017) compare the performance of BMA and DMA when forecasting the realized range volatility of the S&P and crude oil based on models from the HAR family. The authors find that the DMA approach is significantly better than BMA and individual models at forecasting future volatility. However, their model-averaging consists of only five single models from the same family, implying that the combined models will probably follow somewhat similar patterns. Thus, adding models from two different families and expanding the set of models included in the model-averaging process, as done in this paper, might contribute more heterogeneity to the forecasts and as a result provide different findings concerning the performance of BMA and DMA. Furthermore, previous research has not been very concerned with the impact of the forecast horizon when model-averaging. Since the time-variation of models' forecast performance might depend on the horizon, it makes sense to consider this when investigating model-averaging approaches.

Summing up, previous research has concluded that model-averaging adds economic value in terms of portfolio performance and prediction accuracy. However, previous research has not yet considered the impact of restricting the number of models while averaging and how this relates to the relative performance of BMA and DMA, nor has it considered how BMA and DMA performance might depend on the horizon. Thus, it might be useful to see how performance changes when model-averaging is restricted to include only a limited number of models based on their performance. Furthermore, investigating the difference at each horizon is of interest in order to see whether this should be considered when selecting between BMA and DMA. This paper fills the gap concerning whether the number of models included in the model-averaging process is essential and whether BMA and DMA performance depends on the horizon.
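As a stylized illustration of how the three weighting schemes differ, the R sketch below implements toy versions of mean combination, BMA-style weights based on average past errors and DMA-style weights based only on the latest error. These are simplified stand-ins, not the exact estimators used in this paper or the cited studies:

# Stylized combination weights (simplified, not the thesis's exact estimators).
# 'fc' is a matrix of forecasts (rows = time, cols = models) and 'actual'
# the realized values; both are illustrative assumptions.
mean_combination <- function(fc) rowMeans(fc)

bma_style_weights <- function(fc, actual) {
  mse <- colMeans((fc - actual)^2)       # average past performance
  w   <- exp(-0.5 * mse / min(mse))
  w / sum(w)
}

dma_style_weights <- function(fc, actual) {
  e2 <- (fc[nrow(fc), ] - actual[length(actual)])^2  # latest error only
  w  <- exp(-0.5 * e2 / min(e2))
  w / sum(w)
}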

4. Econometric Methodology

This section starts by explaining this study's forecast procedure. The following subsection explains how realized volatility is specified in this paper. Subsections 3 and 4 illustrate and explain the models used to forecast realized volatility from the HAR and GARCH families, respectively. Subsection 5 describes the loss functions used to measure the forecast error of each model. In the final subsection, the different selection approaches used for model-averaging are illustrated and explained.

4.1 The Forecasting Process

As mentioned in previous sections, this paper aims to identify the optimal frequency and differences between model-averaging approaches when predicting realized volatility, based on the OMXS30 index during 2007-2012. The findings of this paper are thus based on the Swedish stock exchange rather than the U.S. market. Since previous research has focused on the U.S. market, this paper not only provides new findings but also complements previous ones, since it is based on a new dataset.

As a first step, rolling-window out-of-sample forecasts for 10 different models on three horizons and three different frequencies are executed. When running a rolling out-of-sample forecast, it is possible to roll either recursively or with a fixed window. In a recursive approach, new observations are added after each rolling forecast; in other words, the length of the sample expands after each executed forecast. The primary problem with this approach is that it makes the forecasts within the sample hard to compare, since they are based on different sample lengths. For this reason, a fixed rolling window is preferred, since it keeps the sample length equal across forecasts and consequently makes the forecast observations more comparable.

The three forecast horizons are 1-day, 1-week and 2-weeks. These horizons have been chosen based on the mean-variance framework explained in section 2.1, with an investor rebalancing the portfolio daily, weekly or every two weeks based on the predicted volatility. For the 1-day horizon, the sample window has been set to 300 observations, which is approximately one year and two months of trading days. This length has been chosen considering that volatility is dynamic and changes quickly; using volatility based on past observations going back more than about a year and a half is irrelevant when predicting volatility 1-day ahead. For the 1-week and 2-weeks horizons, however, the window has been set to 100 observations, which is the minimum number of observations required for running GARCH models in R with the rugarch package. Since the 1-week and 2-weeks horizons are based on daily realized volatility aggregated over 1 week and 2 weeks respectively, the window of the 1-week horizon covers almost two years and the window of the 2-weeks horizon almost four years. For the 1-day and 1-week horizons, 100 forecast observations are obtained, and for the 2-weeks horizon 50 forecast observations are obtained; thus, for all forecast samples it is possible to assume a normal distribution.8

Footnote 8: According to the central limit theorem, the sample mean is approximately normally distributed as the number of observations goes to infinity. As a rule of thumb, a sample is approximately large when it contains at least 30 observations.
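A minimal sketch of the fixed rolling-window procedure is given below. The AR(1) regression fitted with lm() is a placeholder for any of the 10 models, and rv is a hypothetical vector of daily realized volatilities:

# Fixed rolling-window out-of-sample forecasts (illustrative sketch).
rolling_forecast <- function(rv, window = 300, n_out = 100) {
  fc <- numeric(n_out)
  for (j in seq_len(n_out)) {
    idx   <- j:(j + window - 1)          # fixed-length estimation window
    y     <- rv[idx][-1]                 # dependent variable: RV_t
    x     <- rv[idx][-window]            # regressor: RV_{t-1}
    fit   <- lm(y ~ x)                   # placeholder AR(1) model
    fc[j] <- predict(fit, newdata = data.frame(x = rv[idx[window]]))
  }
  fc                                     # one-step-ahead forecasts
}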

All single-model forecasts are executed on 1-min, 5-min and 10-min frequency data. In order to test frequency performance, OLS tests are then executed based on the loss function of each model's performance on each horizon. In total, each frequency has 30 different model forecast observations (10 single models on three horizons), which are thus assumed to be large and approximately normally distributed.

Based on the established optimal frequency, model-averaging is investigated further. As explained in section 4.6, three kinds of model-averaging selections are examined to see whether they on average outperform single models. The Bayesian and dynamic model-averaging approaches are then investigated further to see whether prediction performance improves when applying restrictions that include only the best-performing models in the averaging process. Based on these results, the model restriction that on average performs best for each horizon and loss function is identified. Finally, the Bayesian and dynamic model-averaging forecasts are tested against each other to see whether any difference in forecast performance can be established. Throughout this procedure, OLS tests are executed at every step to see whether any significant difference between models' forecast performance can be verified.

As a final step, two robustness checks have been executed. In the first, the whole process described above is run again based on an all-share index for Nasdaq Stockholm, including all listed large-, mid- and small-cap firms on this stock exchange during the same period. In the second, the first two and a half years of the OMXS30 data are dropped in order to see whether the results change when the financial crisis of 2008 is excluded. However, since this subsample consists of only 187 trading weeks, only the 1-day and 1-week horizons have been examined. All results presented in this paper are based on Newey-West heteroskedasticity-and-autocorrelation-consistent standard errors.
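The Newey-West correction can be applied in R with the sandwich and lmtest packages. A minimal, self-contained sketch with dummy data (the actual test regressions in this paper compare loss functions across models and frequencies):

# OLS coefficients with Newey-West HAC standard errors.
library(sandwich)
library(lmtest)
dummy <- data.frame(y = rnorm(100), x = rnorm(100))  # placeholder data
fit   <- lm(y ~ x, data = dummy)
coeftest(fit, vcov = NeweyWest(fit))  # HAC-robust coefficient tests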

4.2 Realized Volatility

As already shown in equation (3.2), realized volatility in this paper refers to the square root of the sum of squared intraday log-returns. This measure is useful since it is, in the absence of market microstructure noise, free from measurement error, and as the sample frequency goes to infinity it yields an unbiased and efficient proxy of the integrated volatility. Furthermore, it has empirical support in the sense that it produces smaller forecast errors than daily standard deviation and measures based on parametric assumptions.9 Below is the formula for realized volatility, from here on denoted $RV_t$:

$RV_t = \sqrt{\sum_{i=1}^{M} r_{t,i}^2}$,   (4.3)

where $r_{t,i}^2$ is the squared intraday log-return in intraday period $i$ of trading day $t$, and $M$ denotes the number of observed intraday returns in day $t$. Thus, at a 1-min frequency $M$ consists of approximately five and ten times more observations than at the 5-min and 10-min frequencies, respectively.

4.3 HAR-Models

Due to the increased interest in modeling volatility based on high-frequency data, several new models have been developed. Among these is the popular HAR model, which in essence is a basic linear model following an autoregressive structure. Corsi (2009)10 published the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV), which is convenient due to its simplicity and long memory. Basically, the model assumes that markets are heterogeneous in terms of investors with different time horizons. Corsi argues that the market can be divided into three kinds of investors that might have an impact on volatility: high-frequency traders with a 1-day horizon, portfolio managers with a 1-week horizon and long-term investors with a horizon of 1 month or more. He proposed the following model to predict stock volatility:

$RV_{t+1} = \beta_0 + \beta_d RV_t^{(d)} + \beta_w RV_t^{(w)} + \beta_m RV_t^{(m)} + \varepsilon_{t+1}$,   (5.1)

where $\beta_0$ is the intercept, $RV_t^{(d)}$ is the lagged daily realized volatility of day $t$, $RV_t^{(w)}$ is the lagged weekly realized volatility of week $t$, $RV_t^{(m)}$ is the lagged monthly realized volatility of month $t$ and $\varepsilon_{t+1}$ is the forecast error.

Footnote 9: See section 3.1 for more information regarding previous findings on realized volatility.
Footnote 10: This paper was known and applied already in 2004; however, it was not published until 2009.
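A minimal sketch of estimating the HAR-RV regression (5.1) by OLS follows, with rv a simulated stand-in for the daily realized volatility series and the weekly and monthly regressors built as 5- and 22-day averages (a common convention; the exact aggregation used in this paper may differ):

# HAR-RV regression of equation (5.1), estimated with lm().
har_data <- function(rv) {
  n  <- length(rv)
  t0 <- 22:(n - 1)  # start once a month of lags is available
  data.frame(
    y    = rv[t0 + 1],                                   # RV_{t+1}
    rv_d = rv[t0],                                       # daily lag
    rv_w = sapply(t0, function(t) mean(rv[(t - 4):t])),  # weekly average
    rv_m = sapply(t0, function(t) mean(rv[(t - 21):t]))  # monthly average
  )
}

set.seed(2)
rv  <- abs(arima.sim(list(ar = 0.8), n = 500)) / 100 + 0.01  # simulated RV series
fit <- lm(y ~ rv_d + rv_w + rv_m, data = har_data(rv))
summary(fit)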

In order to account for the potential threat arising from jumps (a stochastic process arising due to announcements or other unpredictable events that have a significant impact on the stock price), Andersen et al. (2007) suggested two models that include jumps, in which the square root of the standardized realized bipower variation is calculated as below:

$BV_t = \sqrt{\mu_1^{-2} \sum_{i=1}^{M-1} |r_{t,i}|\,|r_{t,i+1}|}$,   (5.3)

where $\mu_1 = \sqrt{2/\pi}$ denotes the expected mean of the absolute value of a standard normally distributed random variable and the sum is the bipower variation term, which is basically the sum of the absolute value of each intraday return times the absolute value of the intraday return in the next period, $|r_{t,i}|\,|r_{t,i+1}|$. Generally expressed, bipower variation attempts to capture the part of the quadratic variation in the stock return that is not caused by jumps. According to Barndorff-Nielsen & Shephard (2004a), the jump component is expressed as follows:

$J_t = \max\!\big(RV_t^2 - BV_t^2,\; 0\big)$,   (5.4)

where the jump component, $J_t$, is truncated at zero so that it consists only of nonnegative estimates. By including the jump component in the model, it is possible to run the Heterogeneous Autoregressive model of realized volatility with jumps (HAR-RV-J):

$RV_{t+1} = \beta_0 + \beta_d RV_t^{(d)} + \beta_w RV_t^{(w)} + \beta_m RV_t^{(m)} + \beta_{jd} J_t^{(d)} + \beta_{jw} J_t^{(w)} + \beta_{jm} J_t^{(m)} + \varepsilon_{t+1}$,   (5.5)

where $J_t^{(d)}$, $J_t^{(w)}$ and $J_t^{(m)}$ denote the jump component according to equation (5.4) with 1-day, 1-week and 1-month lags, respectively.

Furthermore, Andersen et al. (2007) argued that realized volatility can be decomposed into a continuous sample path component (CSP) and a jump component (CJ). The authors constructed these components as shown below:

$CJ_t = I\big(Z_t > \Phi_\alpha\big)\big(RV_t^2 - BV_t^2\big)$   (5.6)

$CSP_t = RV_t^2 - CJ_t$,   (5.7)

where $I(\cdot)$ denotes the indicator function and $\Phi_\alpha$ denotes the critical value identifying the jump according to the standard normal distribution. Thus, $CJ_t$ in equation (5.6) identifies the significant jumps determined by the critical value, and $CSP_t$ in equation (5.7) is the sum of the variation not consisting of jumps. The resulting model is referred to as the Heterogeneous Autoregressive model of realized volatility with continuous jumps (HAR-RV-CJ), expressed as below:

$RV_{t+1} = \beta_0 + \beta_{cd} CSP_t^{(d)} + \beta_{cw} CSP_t^{(w)} + \beta_{cm} CSP_t^{(m)} + \beta_{jd} CJ_t^{(d)} + \beta_{jw} CJ_t^{(w)} + \beta_{jm} CJ_t^{(m)} + \varepsilon_{t+1}$,   (5.8)

where $CSP_t$ and $CJ_t$ are used in the model with 1-day, 1-week and 1-month lags.
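The bipower variation (5.3) and the truncated jump component (5.4) can be computed for one day of intraday returns as sketched below, reusing the simulated prices p from the example in section 2.3. The formulas follow the reconstructed equations above:

# Bipower variation (5.3) and truncated jump component (5.4) for one day.
bipower <- function(r) {
  mu1 <- sqrt(2 / pi)  # E|Z| for a standard normal Z
  sqrt(mu1^(-2) * sum(abs(r[-length(r)]) * abs(r[-1])))
}
jump_component <- function(r) {
  rv <- sqrt(sum(r^2))             # realized volatility, equation (3.2)
  max(rv^2 - bipower(r)^2, 0)      # truncation at zero as in (5.4)
}

r <- diff(log(p))  # intraday log-returns from the earlier simulated prices
jump_component(r)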

As a final step for the HAR family models, the leverage effect is considered, which is a general concept in financial markets. The leverage effect implies that negative shocks to returns have a larger impact on volatility than positive shocks. This is basically explained by the increased default risk that follows a decrease in the stock price, due to the increased debt relative to equity. In order to account for the leverage effect, Corsi & Renò (2012) proposed the leveraged Heterogeneous Autoregressive model of realized volatility with jumps (LHAR-RV-J) and with continuous jumps (LHAR-RV-CJ). The leverage component can be modeled in the following way:

$r_t^{-} = \min\!\Big(\sum_{i=1}^{M} r_{t,i},\; 0\Big)$,   (5.9)

where $r_t^{-}$ is the aggregated negative return in period $t$ based on the intraday returns in period $t$. This component is then added to models (5.5) and (5.8), respectively:

$RV_{t+1} = \beta_0 + \beta_d RV_t^{(d)} + \beta_w RV_t^{(w)} + \beta_m RV_t^{(m)} + \beta_{jd} J_t^{(d)} + \beta_{jw} J_t^{(w)} + \beta_{jm} J_t^{(m)} + \gamma_d r_t^{-(d)} + \gamma_w r_t^{-(w)} + \gamma_m r_t^{-(m)} + \varepsilon_{t+1}$   (5.10)

$RV_{t+1} = \beta_0 + \beta_{cd} CSP_t^{(d)} + \beta_{cw} CSP_t^{(w)} + \beta_{cm} CSP_t^{(m)} + \beta_{jd} CJ_t^{(d)} + \beta_{jw} CJ_t^{(w)} + \beta_{jm} CJ_t^{(m)} + \gamma_d r_t^{-(d)} + \gamma_w r_t^{-(w)} + \gamma_m r_t^{-(m)} + \varepsilon_{t+1}$,   (5.11)

where equation (5.10) is the LHAR-RV-J model and equation (5.11) is the LHAR-RV-CJ model.

4.4 GARCH-Models

Previous research has confirmed that a simple autoregressive (AR) model based on past realized volatility outperforms stochastic volatility or GARCH models based on daily returns when predicting future realized volatility (Andersen et al., 2003). Previous research has also considered the role of the volatility of realized volatility and found strong results suggesting that it helps explain some of the variation in realized volatility. For example, the results of Barndorff-Nielsen & Shephard (2005) suggest that realized volatility might suffer from heteroskedastic errors because of time-varying volatility in the realized volatility estimator. Based on these findings, among others, Corsi et al. (2008) investigated this further by including a GARCH component in the HAR-RV model explained in section 4.3. Their results indicate that modeling the volatility of realized volatility improves forecast accuracy. Thus, in the spirit of previous research, this paper applies a similar strategy and adapts AR models combined with GARCH(1,1) components in order to forecast realized volatility. Introducing the autoregressive conditional heteroskedasticity (ARCH) model, Engle (1982) suggested a parametric approach to model the size of the errors.
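As an illustration of the AR-GARCH(1,1) setup (the exact specifications estimated in this paper may differ), such a model can be fitted to the realized volatility series with the rugarch package mentioned in section 4.1, here reusing the simulated rv series from the HAR sketch:

# AR(1) mean equation with a GARCH(1,1) variance component via rugarch.
library(rugarch)
spec <- ugarchspec(
  mean.model     = list(armaOrder = c(1, 0)),                    # AR(1) in RV
  variance.model = list(model = "sGARCH", garchOrder = c(1, 1))  # GARCH(1,1)
)
fit <- ugarchfit(spec, data = rv)
ugarchforecast(fit, n.ahead = 1)  # one-step-ahead forecast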