Financial Econometrics Volatility Gerald P. Dwyer Trinity College, Dublin January 2013 GPD (TCD) Volatility 01/13 1 / 37

[Figure: Squared log returns, CRSP daily data]

[Figure: Absolute value of log returns, CRSP daily data]

Heteroskedasticity over time
- These graphs suggest heteroskedasticity over time
- Time-varying volatility of returns
  - Of interest in itself to characterize returns
  - Matters for prices of options and some other financial instruments
- Volatility clustering
- These graphs are suggestive but don't tell us too much
  - They use individual observations on squared changes and absolute values to estimate the variance and standard deviation as they change
  - Similar to using each individual observation to estimate the mean as it changes
  - Can't forecast anything going forward

Exponentially weighted moving average
- The exponentially weighted average assumes that today's variance forecast is a weighted average of yesterday's variance and the variance forecast for yesterday
$$ s_t^2 = (1-\lambda)\,(r_t - \bar{r})^2 + \lambda\, s_{t-1}^2 $$
- And yesterday's variance forecast is a weighted average of the variance the day before that and the variance forecast for the day before that, and so on
$$ s_{t-1}^2 = (1-\lambda)\,(r_{t-1} - \bar{r})^2 + \lambda\, s_{t-2}^2 $$
- $\lambda = 0.94$ has been suggested for daily frequency
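
The EWMA recursion can be sketched in a few lines of Python. This is only an illustration: the function name `ewma_variance` and the choice to start the recursion from the full-sample variance are assumptions, not part of the slides.

```python
def ewma_variance(returns, lam=0.94, init_var=None):
    """Return the EWMA variance estimates s_t^2, one per observation."""
    if not returns:
        return []
    mean = sum(returns) / len(returns)  # sample mean r-bar
    if init_var is None:
        # Assumption: initialize with the full-sample variance.
        init_var = sum((r - mean) ** 2 for r in returns) / len(returns)
    s2 = init_var
    path = []
    for r in returns:
        # s_t^2 = (1 - lambda) (r_t - r-bar)^2 + lambda s_{t-1}^2
        s2 = (1.0 - lam) * (r - mean) ** 2 + lam * s2
        path.append(s2)
    return path


if __name__ == "__main__":
    print(ewma_variance([0.01, -0.02, 0.015, -0.005]))
```

A smaller `lam` puts more weight on the latest squared deviation, so the estimate reacts faster but is noisier.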

Use shorter-term returns to estimate variance over longer periods
- Use the daily variance in the month to calculate the variance for the month
- Let $r_t^m$ be the monthly return in month $t$
- Let $r_{t,i}$ be the daily return on day $i$ in month $t$
- Suppose that daily returns are serially uncorrelated and the daily variance is constant
- Then
$$ r_t^m = \sum_{i=1}^{n} r_{t,i}, \qquad \mathrm{Var}[r_t^m] = n\,\mathrm{Var}[r_{t,i}] $$
and the daily variance is estimated by
$$ \widehat{\mathrm{Var}}[r_{t,i}] = \frac{\sum_{i=1}^{n} (r_{t,i} - \bar{r}_t)^2}{n-1} $$
where $\bar{r}_t$ is the mean of the daily returns
- The estimated monthly variance thus is
$$ (\hat{\sigma}_t^{m})^2 = \frac{n}{n-1} \sum_{i=1}^{n} (r_{t,i} - \bar{r}_t)^2 $$

Daily variance to estimate monthly variance
- The estimated monthly variance is simple to calculate
$$ (\hat{\sigma}_t^{m})^2 = \frac{n}{n-1} \sum_{i=1}^{n} (r_{t,i} - \bar{r}_t)^2 $$
- This becomes more complicated if the daily returns are serially correlated, but it's still manageable
- If daily log returns have high excess kurtosis and serial correlations, then this estimator may not be consistent
- Does this make sense from a subject-matter (financial economics) point of view?
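
As a small worked example, the monthly estimator is n times the sample variance of the daily returns in the month. The sketch below assumes serially uncorrelated daily returns with constant variance, as on the slide; the function name is hypothetical.

```python
def monthly_variance_from_daily(daily_returns):
    """Estimate monthly variance as n times the daily sample variance."""
    n = len(daily_returns)
    if n < 2:
        raise ValueError("need at least two daily returns")
    mean = sum(daily_returns) / n
    # Unbiased daily variance with divisor n - 1.
    daily_var = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    # Scale up by the number of trading days in the month.
    return n * daily_var


if __name__ == "__main__":
    print(monthly_variance_from_daily([0.01, -0.02, 0.015, -0.005, 0.0]))
```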

Garman-Klass estimator of daily variance
Use high, low, opening, and closing prices to estimate variance
- Can estimate daily variance just knowing opening, high, low and closing prices
- Assume that price follows a random walk
- Let $c_t$ be the logarithm of the closing price, so $r_t = c_t - c_{t-1}$
- Conventional estimator, using only the closing price, is
$$ \sigma_t^2 = E\left[(c_t - c_{t-1})^2\right] $$
- High $H_t$, low $L_t$, and open $O_t$ also often are available
- Can estimate daily variance of price (not log price) from
$$ \hat{\sigma}_{GK}^2 = 0.12\,\frac{(O_t - C_{t-1})^2}{f} + 0.88\,\frac{0.5\,(H_t - L_t)^2 - 0.386\,(C_t - O_t)^2}{1-f} $$
where $f$ is the fraction of the day that the market is closed
- Minimum variance unbiased estimator for a random walk with no drift
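
A minimal sketch of the Garman-Klass calculation for one day follows. It uses the standard form of the estimator, in which the 0.386 (C - O)^2 term is subtracted from 0.5 (H - L)^2; the function and argument names are illustrative assumptions.

```python
def garman_klass_variance(open_, high, low, close, prev_close, f):
    """Garman-Klass daily variance from open, high, low, close prices.

    f is the fraction of the day that the market is closed (0 < f < 1).
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    # Overnight component: close-to-open move, scaled by the closed fraction.
    overnight = (open_ - prev_close) ** 2 / f
    # Trading-hours component: range and open-to-close move.
    intraday = (0.5 * (high - low) ** 2
                - 0.386 * (close - open_) ** 2) / (1.0 - f)
    return 0.12 * overnight + 0.88 * intraday


if __name__ == "__main__":
    print(garman_klass_variance(100.0, 102.0, 99.0, 101.0, 99.5, f=0.5))
```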

Yang and Zhang estimator
Use high, low, opening, and closing prices to estimate variance of log prices over a longer period
- Define
$$ o_t = \ln O_t - \ln C_{t-1}, \quad h_t = \ln H_t - \ln O_t, \quad l_t = \ln L_t - \ln O_t, \quad c_t = \ln C_t - \ln O_t $$
- Monthly variance based on $n$ days of trading is
$$ \hat{\sigma}_{YZ}^2 = \hat{\sigma}_o^2 + k\,\hat{\sigma}_c^2 + (1-k)\,\hat{\sigma}_{rs}^2 $$
where $\hat{\sigma}_o^2$ and $\hat{\sigma}_c^2$ are the estimated variances of $o_t$ and $c_t$, and
$$ \hat{\sigma}_{rs}^2 = \frac{1}{n} \sum_{t=1}^{n} \left[ h_t (h_t - c_t) + l_t (l_t - c_t) \right], \qquad k = \frac{0.34}{1.34 + (n+1)/(n-1)} $$
- $k$ was chosen to minimize the variance of the estimator $\hat{\sigma}_{YZ}^2$

Serial correlation
[Figures: serial correlation of (1) the change in the logarithm of the value-weighted CRSP index, (2) squared changes in the logarithm of the value-weighted CRSP index, (3) absolute values of the change in the logarithm of the value-weighted CRSP index]

Autoregressive conditional heteroskedasticity (ARCH) is intended to deal with this
- Simple ARCH model for returns
$$ r_t = \mu_t + u_t, \qquad E\,u_t = 0, \quad E\,u_t^2 = \sigma_t^2, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 $$
where $r_t$ is a log return and $\sigma_t^2$ is the variance of $u_t$ conditional on past values of the squared innovations, $u_{t-1}^2$
  - $r_t = \mu_t + u_t$ is the mean equation for $r_t$
  - $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$ is the variance equation for $r_t$
  - $u_t$ is the innovation in $r_t$
- A slightly different version of the same equations is
$$ r_t = \mu_t + h_t \varepsilon_t, \qquad u_t = h_t \varepsilon_t, \qquad E\,\varepsilon_t = 0, \quad E\,\varepsilon_t^2 = 1, \qquad h_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 $$

Estimating an ARCH model
$$ r_t = \mu_t + u_t, \qquad E\,u_t = 0, \quad E\,u_t^2 = \sigma_t^2, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 $$
Steps in estimating an ARCH model
1. Estimate a model for the mean equation
2. Use the residuals of the mean equation to test for ARCH effects
3. Specify a variance model with ARCH effects if it seems warranted
4. Check the fitted model and refine as suggested by diagnostic statistics

Mean equation
- In general, there is no reason the mean equation can't be as complicated as we like
$$ r_t = \mu_t + u_t $$
- $r_t$ can be a complicated ARMA(p,q) or can have variables included
- $r_t$ is stationary in mean
  - $r_t$ may be the first difference of the original series, for example $r_t = p_t - p_{t-1}$
- The mean equation may be mis-specified if the conditional heteroskedasticity of $u_t$ is ignored

Testing for ARCH
- Simple model
$$ r_t = u_t, \qquad E\,u_t = 0, \quad E\,u_t^2 = \sigma_t^2, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 $$

Three tests for ARCH
1. Box-Ljung test applied to the squared residuals, $u_t^2$, for some pre-specified number of lags $k$
2. Engle test based on a regression for the squared residuals
$$ u_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_k u_{t-k}^2 + e_t \qquad (1) $$
where $e_t$ is the error term in the regression for squared residuals
   - Test whether $\alpha_1 = \alpha_2 = \dots = \alpha_k = 0$ using $T R^2$, where $T$ is the number of observations and $R^2$ is the $R^2$ of equation (1); under the null hypothesis of no ARCH effects, $T R^2$ has an asymptotic $\chi_k^2$ distribution
   - Common to use one lag
3. F-test for regression (1)
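
The Engle test with one lag can be sketched without any libraries, because the regression of $u_t^2$ on a constant and $u_{t-1}^2$ is a simple linear regression. The function name is hypothetical, and the statistic would be compared with a $\chi_1^2$ critical value (about 3.84 at the 5 percent level).

```python
def engle_arch_test_one_lag(residuals):
    """Return (T, R2, T*R2) for u_t^2 regressed on a constant and u_{t-1}^2."""
    sq = [u * u for u in residuals]
    y = sq[1:]    # u_t^2
    x = sq[:-1]   # u_{t-1}^2
    T = len(y)
    if T < 3:
        raise ValueError("need at least four residuals")
    mx = sum(x) / T
    my = sum(y) / T
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("squared residuals have no variation")
    # With one regressor, R^2 is the squared sample correlation.
    r2 = sxy * sxy / (sxx * syy)
    return T, r2, T * r2


if __name__ == "__main__":
    T, r2, stat = engle_arch_test_one_lag([0.5, -1.2, 2.0, -0.3, 0.1, 1.5, -2.2])
    print(T, r2, stat)
```

With more than one lag the same idea applies, but the regression needs a multiple-regression routine.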
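
Properties of ARCH models: Mean
- Simple model
$$ r_t = u_t = \sigma_t \varepsilon_t, \qquad E\,\varepsilon_t = 0, \quad E\,\varepsilon_t^2 = 1, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 $$
where $\alpha_0 > 0$ and $\alpha_1 \ge 0$. Why?
- Let $F_{t-1}$ denote the set of all information available in $t-1$ and earlier, especially $r_{t-1}, r_{t-2}, \dots, u_{t-1}, u_{t-2}, \dots$
$$ \begin{aligned} E[u_t] &= E\left[E(u_t \mid F_{t-1})\right] \quad \text{(application of law of iterated expectations)} \\ &= E\left[E(\sigma_t \varepsilon_t \mid F_{t-1})\right] \\ &= E\left[E(\sigma_t \mid F_{t-1})\, E(\varepsilon_t \mid F_{t-1})\right] \\ &= E\left[E(\sigma_t \mid F_{t-1}) \cdot 0\right] = 0 \end{aligned} $$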
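
Properties of ARCH models: Variance
- Simple model
$$ r_t = u_t = \sigma_t \varepsilon_t, \qquad E\,\varepsilon_t = 0, \quad E\,\varepsilon_t^2 = 1, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2, \qquad \alpha_0 > 0, \ \alpha_1 \ge 0 $$
- $E[u_t] = 0$
$$ \begin{aligned} \mathrm{Var}[u_t] = E[u_t^2] &= E\left[E(u_t^2 \mid F_{t-1})\right] \\ &= E\left[E(\alpha_0 + \alpha_1 u_{t-1}^2 \mid F_{t-1})\right] \\ &= E\left[\alpha_0 + \alpha_1 u_{t-1}^2\right] \\ &= \alpha_0 + \alpha_1 E\,u_{t-1}^2 = \alpha_0 + \alpha_1 E\,u_t^2 \end{aligned} $$
- Therefore, if $0 \le \alpha_1 < 1$,
$$ \mathrm{Var}[u_t] = \frac{\alpha_0}{1 - \alpha_1} $$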

Properties of ARCH models: Kurtosis, fourth moment, tail behavior
- Simple model
$$ r_t = u_t = \sigma_t \varepsilon_t, \qquad E\,\varepsilon_t = 0, \quad E\,\varepsilon_t^2 = 1, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2, \qquad \alpha_0 > 0, \ \alpha_1 \ge 0 $$
$$ E[u_t] = 0, \qquad \mathrm{Var}[u_t] = \frac{\alpha_0}{1 - \alpha_1} $$
- Assume $\varepsilon_t$ is normally distributed
- Do we get fatter tails than from the normal distribution?

Fourth moment, tail behavior
$$ E\,u_t^4 = \frac{3 \alpha_0^2 (1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)} $$
- $E\,u_t^4 > 0$ obviously must hold, so $\alpha_1$ must satisfy $1 - 3\alpha_1^2 > 0$ and therefore $0 \le \alpha_1^2 < 1/3$
- Already assumed $1 > \alpha_1$
- The unconditional kurtosis of $u_t$ with normally distributed $\varepsilon_t$ is
$$ \frac{E\,u_t^4}{\mathrm{Var}[u_t]^2} = 3\,\frac{1 - \alpha_1^2}{1 - 3\alpha_1^2} $$
- Therefore,
$$ \frac{E\,u_t^4}{\mathrm{Var}[u_t]^2} > 3 $$
- This implies fatter tails than for a normal distribution
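
The unconditional variance and kurtosis of an ARCH(1) process with normal $\varepsilon_t$ can be evaluated directly from the formulas, and a simulated path gives a rough check. The sketch below is illustrative only: the function names, the zero starting value, and the parameter values are assumptions.

```python
import math
import random


def arch1_unconditional_variance(alpha0, alpha1):
    """Var[u_t] = alpha0 / (1 - alpha1), valid for 0 <= alpha1 < 1."""
    if not 0.0 <= alpha1 < 1.0:
        raise ValueError("need 0 <= alpha1 < 1")
    return alpha0 / (1.0 - alpha1)


def arch1_kurtosis(alpha1):
    """Kurtosis 3 (1 - alpha1^2) / (1 - 3 alpha1^2), valid for 3 alpha1^2 < 1."""
    if alpha1 < 0.0 or 3.0 * alpha1 ** 2 >= 1.0:
        raise ValueError("need 0 <= alpha1^2 < 1/3")
    return 3.0 * (1.0 - alpha1 ** 2) / (1.0 - 3.0 * alpha1 ** 2)


def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate n innovations u_t = sigma_t * eps_t with normal eps_t."""
    rng = random.Random(seed)
    u_prev = 0.0  # Assumption: start the recursion from u_0 = 0.
    path = []
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * u_prev ** 2
        u_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        path.append(u_prev)
    return path


if __name__ == "__main__":
    u = simulate_arch1(0.2, 0.5, 50000, seed=1)
    sample_var = sum(x * x for x in u) / len(u)
    print("theoretical variance:", arch1_unconditional_variance(0.2, 0.5))
    print("sample variance:     ", sample_var)
    print("theoretical kurtosis:", arch1_kurtosis(0.5))
```

Because the fourth moment is large when $\alpha_1^2$ is close to 1/3, sample kurtosis from a simulation converges slowly, so only the variance is compared here.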

Properties of ARCH models: Restrictions on estimated variance equation in practice
- Simple model
$$ r_t = u_t = \sigma_t \varepsilon_t, \qquad E\,\varepsilon_t = 0, \quad E\,\varepsilon_t^2 = 1, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2, \qquad \alpha_0 > 0, \ \alpha_1 \ge 0 $$
$$ E[u_t] = 0, \qquad \mathrm{Var}[u_t] = \frac{\alpha_0}{1 - \alpha_1}, \qquad \frac{E\,u_t^4}{\mathrm{Var}[u_t]^2} = 3\,\frac{1 - \alpha_1^2}{1 - 3\alpha_1^2} > 3, \qquad 0 \le \alpha_1^2 < \tfrac{1}{3} $$
- The $\alpha_i$ need not all be positive when there is more than one lag in the variance equation
- Sufficient to make sure all the estimated conditional variances satisfy $\hat{\sigma}_t^2 > 0$
- If one $\hat{\sigma}_t^2$ is negative, the estimated results make no sense

Limitations of ARCH models
1. Symmetric effects of shocks. This is too restrictive for stock returns, where negative shocks have a larger effect on future variance than positive shocks
2. Returns tend to have some clusters of high and low variance, whereas ARCH models tend to predict slow decay to the mean from any current variance
3. Restrictive parameterization, e.g. $0 \le \alpha_1^2 < 1/3$ for kurtosis to be well defined for ARCH(1)
4. Deterministic equation for variance; no error term in $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$
5. Provides no evidence on the source of changes in variance

Estimation of ARCH model
1. Use the partial autocorrelation function of $u_t^2$ to determine the order of the ARCH model specified
2. Maximize the likelihood for the assumed distribution of $\varepsilon_t$
   - Distributions
     - Normal distribution
     - t-distribution with degrees of freedom $\upsilon$
     - Generalized error distribution
   - Quasi-maximum likelihood estimation
     - Consistent estimates of parameters
     - Issue of correct standard errors of coefficients
3. $u_t / \sigma_t$ is a sequence of IID variables if the model is correctly specified
   - Use $\hat{u}_t / \hat{\sigma}_t$ to examine whether the serial correlation is adequately estimated
   - Autocorrelation functions of $\hat{u}_t / \hat{\sigma}_t$ and $(\hat{u}_t / \hat{\sigma}_t)^2$, with the Engle regression test on $(\hat{u}_t / \hat{\sigma}_t)^2$
   - Compare the distribution to the one assumed using Kolmogorov-Smirnov tests
   - These tests are asymptotically correct but have nontrivial estimation ...

GARCH
- ARCH models can require many lags
- Reduce lags in mean equations by using ARMA models
  - MA terms can substitute for several AR terms
- Maybe including something like MA terms in the ARCH equation can reduce the number of lags
- GARCH (Generalized ARCH) model
  - ARCH model
$$ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \dots + \alpha_m u_{t-m}^2 $$
  - Instead, try GARCH, here a GARCH(m,s) (the order of the lags is often not written consistently)
$$ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \dots + \alpha_m u_{t-m}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_s \sigma_{t-s}^2 $$
- Lag lengths are $m$ for the part analogous to the moving average and $s$ for the part analogous to an autoregression
- May be able to reduce lag length substantially by having both sets of terms

Properties of GARCH models
$$ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \dots + \alpha_m u_{t-m}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_s \sigma_{t-s}^2 $$
- Restrictions on parameters
$$ \alpha_i > 0, \quad \beta_i > 0, \quad \sum_{i=1}^{\max(m,s)} (\alpha_i + \beta_i) < 1 $$
- Properties of estimates and relation to parameters
$$ E[u_t^2] = \frac{\alpha_0}{1 - \sum_{i=1}^{\max(m,s)} (\alpha_i + \beta_i)} $$
- For GARCH(1,1), $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, with $1 - (\alpha_1 + \beta_1)^2 - 2\alpha_1^2 > 0$
$$ \frac{E[u_t^4]}{\left(E[u_t^2]\right)^2} = \frac{3\left[1 - (\alpha_1 + \beta_1)^2\right]}{1 - (\alpha_1 + \beta_1)^2 - 2\alpha_1^2} > 3 $$
- Generally speaking, it is hard to estimate more than a few lags
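
For GARCH(1,1), the variance recursion and the kurtosis formula are easy to evaluate by hand or in code. The sketch below assumes the innovations $u_t$ are already available (for example, residuals from a mean equation) and starts the recursion at the unconditional variance; both the function names and that starting value are assumptions.

```python
def garch11_conditional_variances(u, alpha0, alpha1, beta1, init_var=None):
    """Filter sigma_t^2 = alpha0 + alpha1 u_{t-1}^2 + beta1 sigma_{t-1}^2."""
    if not u:
        return []
    if init_var is None:
        if alpha1 + beta1 >= 1.0:
            raise ValueError("unconditional variance needs alpha1 + beta1 < 1")
        # Assumption: start at the unconditional variance.
        init_var = alpha0 / (1.0 - alpha1 - beta1)
    sigma2 = [init_var]
    for t in range(1, len(u)):
        sigma2.append(alpha0 + alpha1 * u[t - 1] ** 2 + beta1 * sigma2[t - 1])
    return sigma2


def garch11_kurtosis(alpha1, beta1):
    """Unconditional kurtosis of u_t for GARCH(1,1) with normal eps_t."""
    persistence_sq = (alpha1 + beta1) ** 2
    denom = 1.0 - persistence_sq - 2.0 * alpha1 ** 2
    if denom <= 0.0:
        raise ValueError("fourth moment does not exist for these parameters")
    return 3.0 * (1.0 - persistence_sq) / denom


if __name__ == "__main__":
    u = [0.5, -1.5, 0.2, 2.0, -0.1]
    print(garch11_conditional_variances(u, 0.1, 0.1, 0.8))
    print(garch11_kurtosis(0.1, 0.8))
```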

IGARCH
- What if the variance is very persistent?
$$ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \dots + \alpha_m u_{t-m}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_s \sigma_{t-s}^2 $$
with $\sum_{i=1}^{\max(m,s)} (\alpha_i + \beta_i) = 1$, indicating a unit root in the variance process
- Actually pretty common with returns
- Change in logarithm of value-weighted CRSP index, 1/2/1926 to 12/31/2011
$$ \mathrm{dlnvwcrsp}_t = 0.000629 + \hat{u}_t, \qquad \hat{\sigma}_t^2 = 1.12 \times 10^{-6} + 0.09998\, \hat{u}_{t-1}^2 + 0.89061\, \hat{\sigma}_{t-1}^2 $$
  - Standard errors of the variance-equation coefficients are $4.91 \times 10^{-8}$, 0.0019 and 0.002
  - Sum of coefficients is 0.9906
- IGARCH model, IGARCH(1,1)
$$ u_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \beta_1 \sigma_{t-1}^2 + (1 - \beta_1)\, u_{t-1}^2, \qquad 0 < \beta_1 < 1 $$

Properties of IGARCH model
- IGARCH(1,1)
$$ u_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \beta_1 \sigma_{t-1}^2 + (1 - \beta_1)\, u_{t-1}^2, \qquad 0 < \beta_1 < 1 $$
- Unconditional variance is undefined
- Constant term is similar to the constant term for a random walk: a trend
- A nonzero constant term suggests a trend in variance. Why?
- The one-step-ahead forecast at $h$ is the forecast of $\sigma_{h+1}^2$
- Suppose that estimates of $\sigma_h^2$ and $u_h^2$ are available and $\sigma_h^2(1)$ is the forecast of $\sigma_{h+1}^2$ made at $h$ for one step ahead
$$ \sigma_h^2(1) = \alpha_0 + \beta_1 \sigma_h^2 + (1 - \beta_1)\, u_h^2 $$
$$ \sigma_h^2(2) = \alpha_0 + \beta_1 \sigma_{h+1}^2 + (1 - \beta_1)\, u_{h+1}^2 $$
- The best forecast of $\sigma_{h+1}^2$ is $\sigma_h^2(1)$, and the best forecast of $u_{h+1}^2$ from $u_{h+1}^2 = \sigma_{h+1}^2 \varepsilon_{h+1}^2$ is also $\sigma_h^2(1)$, so
$$ \sigma_h^2(2) = \alpha_0 + \sigma_h^2(1) $$
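
Repeating the same substitution for longer horizons gives $\sigma_h^2(\ell) = \sigma_h^2(1) + (\ell - 1)\alpha_0$, so the forecasts follow a straight line with slope $\alpha_0$. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def igarch11_forecasts(alpha0, beta1, sigma2_h, u2_h, horizon):
    """Return IGARCH(1,1) variance forecasts for steps 1..horizon made at h."""
    if not 0.0 < beta1 < 1.0:
        raise ValueError("need 0 < beta1 < 1")
    # One step ahead uses the last variance and the last squared innovation.
    one_step = alpha0 + beta1 * sigma2_h + (1.0 - beta1) * u2_h
    # Each further step adds alpha0, because the weights on sigma^2 and u^2 sum to 1.
    return [one_step + (step - 1) * alpha0 for step in range(1, horizon + 1)]


if __name__ == "__main__":
    print(igarch11_forecasts(0.1, 0.9, sigma2_h=1.0, u2_h=2.0, horizon=5))
```

With $\alpha_0 = 0$ the forecasts are flat at $\sigma_h^2(1)$, which is the random-walk-without-drift case.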

GARCH-M
- GARCH-M is GARCH in mean; a simple one is
$$ r_t = \mu + c\,\sigma_t^2 + u_t, \qquad u_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2 $$
- $r_t$ is a return on an asset and $c$ is called a risk premium parameter
- Could use $\sigma_t$ or $\ln \sigma_t$ instead of $\sigma_t^2$
- Question: When will the variance of an asset's return reflect its risk?

Asymmetric GARCH
- Glosten, Jagannathan and Runkle (1993)
- Represents the asymmetry in returns that negative shocks create more future variance
$$ r_t = \mu + c\,\sigma_t^2 + u_t, \qquad u_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \gamma\, u_{t-1}^2 I_{t-1} $$
- $I_{t-1}$ is an indicator with $I_{t-1} = 1$ if $u_{t-1} < 0$ and $I_{t-1} = 0$ if $u_{t-1} \ge 0$
- Greater effect of negative shocks if the estimate of $\gamma$ is greater than 0
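
The asymmetric variance equation can be illustrated by filtering the same shock with opposite signs. The sketch below takes the innovations as given and requires an explicit starting variance; the function name and the parameter values in the example are assumptions.

```python
def gjr_garch11_variances(u, alpha0, alpha1, beta1, gamma, init_var):
    """Filter sigma_t^2 with an extra gamma * u_{t-1}^2 term after negative shocks."""
    if not u:
        return []
    sigma2 = [init_var]
    for t in range(1, len(u)):
        indicator = 1.0 if u[t - 1] < 0.0 else 0.0  # I_{t-1}
        sigma2.append(alpha0
                      + (alpha1 + gamma * indicator) * u[t - 1] ** 2
                      + beta1 * sigma2[t - 1])
    return sigma2


if __name__ == "__main__":
    after_positive = gjr_garch11_variances([1.0, 0.0], 0.1, 0.1, 0.8, 0.2, 1.0)
    after_negative = gjr_garch11_variances([-1.0, 0.0], 0.1, 0.1, 0.8, 0.2, 1.0)
    print(after_positive[1], after_negative[1])
```

With $\gamma > 0$, the variance following the negative shock is larger even though the two shocks have the same size.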
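
EGARCH
- Exponential GARCH allows for asymmetry
- An EGARCH(1,1) with $\varepsilon_t$ iid and normally distributed
$$ r_t = \mu + u_t, \qquad u_t = \sigma_t \varepsilon_t $$
$$ (1 - \alpha_1 L) \ln \sigma_t^2 = \begin{cases} \alpha_* + (\gamma + \theta)\, \dfrac{u_{t-1}}{\sigma_{t-1}} & \text{if } u_{t-1} \ge 0 \\[2ex] \alpha_* + (\gamma - \theta)\, \dfrac{|u_{t-1}|}{\sigma_{t-1}} & \text{if } u_{t-1} < 0 \end{cases} $$
$$ \alpha_* = (1 - \alpha_1)\, \alpha_0 - \sqrt{2/\pi}\; \gamma $$
where the lag operator $L$ is such that $L x_t = x_{t-1}$ and $L^i x_t = x_{t-i}$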
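
EGARCH explained
- An EGARCH model starts from the function $g(\varepsilon_t)$
$$ g(\varepsilon_t) = \theta \varepsilon_t + \gamma \left( |\varepsilon_t| - E|\varepsilon_t| \right), \qquad E\, g(\varepsilon_t) = 0 $$
which can be rewritten as
$$ g(\varepsilon_t) = \begin{cases} (\theta + \gamma)\, \varepsilon_t - \gamma E|\varepsilon_t| & \text{if } \varepsilon_t \ge 0 \\ (\theta - \gamma)\, \varepsilon_t - \gamma E|\varepsilon_t| & \text{if } \varepsilon_t < 0 \end{cases} $$
- If $\theta$ is negative, then $g(\varepsilon_t)$ is larger for $\varepsilon_t < 0$ than for $\varepsilon_t \ge 0$ of the same magnitude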
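
The asymmetry in $g(\varepsilon_t)$ is easy to see numerically. The sketch below assumes standard normal $\varepsilon_t$, for which $E|\varepsilon_t| = \sqrt{2/\pi}$; the function name and the parameter values are illustrative.

```python
import math


def egarch_g(eps, theta, gamma):
    """g(eps) = theta * eps + gamma * (|eps| - E|eps|) for standard normal eps."""
    expected_abs = math.sqrt(2.0 / math.pi)  # E|eps| under N(0, 1)
    return theta * eps + gamma * (abs(eps) - expected_abs)


if __name__ == "__main__":
    theta, gamma = -0.1, 0.2
    for eps in (-2.0, -1.0, 1.0, 2.0):
        print(eps, egarch_g(eps, theta, gamma))
```

For shocks of equal size, the difference $g(-\varepsilon) - g(\varepsilon)$ equals $-2\theta\varepsilon$, which is positive when $\theta < 0$ and $\varepsilon > 0$.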
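
EGARCH model
- Define the lag operator $L$ such that $L x_t = x_{t-1}$ and $L^i x_t = x_{t-i}$
- EGARCH model, EGARCH(m,s)
$$ r_t = \mu + u_t, \qquad u_t = \sigma_t \varepsilon_t, \qquad g(\varepsilon_t) = \theta \varepsilon_t + \gamma \left( |\varepsilon_t| - E|\varepsilon_t| \right) $$
$$ \ln \sigma_t^2 = \alpha_0 + \frac{1 + \beta_1 L + \dots + \beta_{s-1} L^{s-1}}{1 - \alpha_1 L - \dots - \alpha_m L^m}\; g(\varepsilon_{t-1}) $$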

Test for asymmetric volatility
- Estimate a mean equation and get residuals $\hat{u}_t$
- Define $I_{t-1} = 1$ if $\hat{u}_{t-1} < 0$ and $I_{t-1} = 0$ if $\hat{u}_{t-1} \ge 0$
- Run the regression
$$ \hat{u}_t^2 = \phi_0 + \phi_1 I_{t-1} + \upsilon_t $$
- If $\phi_1 > 0$, then there is an asymmetric effect of negative shocks
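
Because the only regressor is a 0/1 indicator, the least-squares coefficients in this regression are group means: $\hat{\phi}_0$ is the average of $\hat{u}_t^2$ after non-negative residuals and $\hat{\phi}_1$ is the difference between the averages after negative and non-negative residuals. A minimal sketch with a hypothetical function name (it returns point estimates only, with no standard errors):

```python
def asymmetry_regression(residuals):
    """Return (phi0, phi1) from regressing u_t^2 on a constant and I_{t-1}."""
    y = [u * u for u in residuals[1:]]                          # u_t^2
    d = [1.0 if u < 0.0 else 0.0 for u in residuals[:-1]]       # I_{t-1}
    after_neg = [yi for yi, di in zip(y, d) if di == 1.0]
    after_nonneg = [yi for yi, di in zip(y, d) if di == 0.0]
    if not after_neg or not after_nonneg:
        raise ValueError("need both negative and non-negative lagged residuals")
    mean_after_neg = sum(after_neg) / len(after_neg)
    mean_after_nonneg = sum(after_nonneg) / len(after_nonneg)
    phi0 = mean_after_nonneg
    phi1 = mean_after_neg - mean_after_nonneg
    return phi0, phi1


if __name__ == "__main__":
    print(asymmetry_regression([-1.0, 2.0, 1.0, 1.0, -1.0, 3.0]))
```

A formal test would also need the standard error of $\hat{\phi}_1$, which this sketch leaves out.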

TGARCH
- Threshold GARCH, TGARCH(m,s)
$$ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{s} (\alpha_i + \gamma_i I_{t-i})\, u_{t-i}^2 + \sum_{j=1}^{m} \beta_j \sigma_{t-j}^2 $$
$$ I_{t-i} = 1 \text{ if } u_{t-i} < 0, \qquad I_{t-i} = 0 \text{ if } u_{t-i} \ge 0, \qquad \alpha_i, \gamma_i, \beta_j \ge 0 $$
- This also allows for bigger effects of negative shocks

Testing restrictions in nonlinear models
- Usual t-ratios are estimated for maximum likelihood
  - Inverse of the information matrix provides the estimator of variance for these tests
- Likelihood ratio, Wald and Lagrange multiplier tests
- Consider estimating a parameter $\theta$
  - Maximum likelihood estimate is $\hat{\theta}$
  - Restricted estimate is $\tilde{\theta}$

Comparison of likelihood ratio, Wald and Lagrange multiplier tests
[Figure: likelihood function $L(\theta)$ plotted against $\theta$, with point A at $(\hat{\theta}, L(\hat{\theta}))$ and point B at $(\tilde{\theta}, L(\tilde{\theta}))$]
- Vertical distance from B to A is the basis of likelihood ratio tests
- Horizontal distance from B to A is the basis of Wald tests (e.g. t-tests)
- Slope of the likelihood function at B is the basis of the Lagrange multiplier test

Stochastic Volatility
- ARCH is restrictive
  - Evolution of variance is deterministic except for the influence of innovations in the mean equation
- Stochastic volatility
  - The evolution of volatility is not a deterministic function of only past volatility and innovations to the mean equation
  - Innovations to the variance affect the variance independently of the mean equation

Relatively simple example of stochastic volatility
$$ r_t = u_t = \sigma_t \varepsilon_t, \qquad \ln \sigma_t^2 = \alpha + \beta \ln \sigma_{t-1}^2 + \sigma_\eta \eta_t, \qquad \alpha > 0, \quad \varepsilon_t \sim N(0,1), \quad \eta_t \sim N(0,1) $$
- Two innovations for every observation $r_t$
- How can that be?
  - The innovations reflect different aspects of the series
  - Consider instead
$$ r_t = u_{1,t} + u_{2,t}, \qquad u_{1,t} \sim N(0, \sigma_1^2), \quad u_{2,t} \sim N(0, \sigma_2^2) $$
  - Never would be able to tell how much of the variance of $r_t$ is due to $u_{1,t}$ and how much is due to $u_{2,t}$
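
The stochastic volatility example can be simulated to see the two innovations at work. The sketch below assumes $|\beta| < 1$, independent $\varepsilon_t$ and $\eta_t$, and a starting value equal to the unconditional mean of $\ln \sigma_t^2$; the function name and parameter values are illustrative.

```python
import math
import random


def simulate_sv(alpha, beta, sigma_eta, n, seed=0):
    """Simulate returns and variances from the log-variance AR(1) model."""
    if abs(beta) >= 1.0:
        raise ValueError("need |beta| < 1 for a stationary log variance")
    rng = random.Random(seed)
    # Assumption: start at the unconditional mean of ln sigma_t^2.
    log_var = alpha / (1.0 - beta)
    returns, variances = [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)   # innovation to the variance
        eps = rng.gauss(0.0, 1.0)   # innovation to the return
        log_var = alpha + beta * log_var + sigma_eta * eta
        sigma = math.exp(0.5 * log_var)
        returns.append(sigma * eps)
        variances.append(sigma * sigma)
    return returns, variances


if __name__ == "__main__":
    r, v = simulate_sv(alpha=0.1, beta=0.9, sigma_eta=0.3, n=5, seed=42)
    print(r)
    print(v)
```

Setting `sigma_eta` to zero removes the second innovation, and the variance stays constant at $\exp(\alpha / (1 - \beta))$.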