Statistical Inference and Methods


Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 14th February 2006

Part VII Session 7: Volatility Modelling

Volatility Modelling: ARCH; GARCH; Stochastic Volatility; Multivariate Volatility; Methods of Inference

It has long been recognized that financial time series exhibit changes in volatility over time that tend to be serially correlated. In particular, financial returns demonstrate volatility clustering: large changes tend to be followed by large changes, and small changes by small changes. Volatility models divide usefully into observation-driven and parameter-driven models. Observation-driven models allow the variance of the observed series to depend on its lagged values. Parameter-driven models specify that the variance of the observations is a function of some unobserved or latent process.

The most popular examples of observation-driven models are the Autoregressive Conditional Heteroscedasticity (ARCH) and Generalized ARCH (GARCH) models. In particular, let $y_t$ be a realization, at time $t$, of the time series of interest. Typically, $y_t$ is taken to be the compounded return of the underlying asset, so that $y_t = 100 \log(x_t / x_{t-1})$, where $x_t$ denotes the price of the asset. ARCH-type models specify the distribution of the current observation as a one-step-ahead prediction density.

More precisely, for the observation-driven models, we assume

$$y_t \mid \Psi_{t-1} \sim N(0, \sigma_t^2),$$

where $\Psi_{t-1}$ contains all the information up to time $t-1$, so that $\Psi_t = \{y_t, y_{t-1}, \ldots\}$. Equivalently, $y_t = \sigma_t \varepsilon_t$, where $\{\varepsilon_t\}$ is a sequence of independent $N(0, 1)$ random variables. The ARCH(p) model allows the conditional variance $\sigma_t^2$ of $y_t$ to be a linear combination of past squared observations, so that

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2.$$
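The ARCH(p) recursion can be made concrete with a short simulation. The following is a minimal sketch rather than code from the lecture: the function name `simulate_arch`, the zero pre-sample values and the burn-in length are all assumptions made for illustration.

```python
import math
import random

def simulate_arch(alpha0, alphas, n, seed=0, burn_in=500):
    """Simulate n observations from an ARCH(p) process with N(0, 1) innovations.

    alphas = [alpha_1, ..., alpha_p]. Pre-sample y values are set to zero
    and a burn-in is discarded (both are assumptions of this sketch).
    """
    rng = random.Random(seed)
    p = len(alphas)
    y = [0.0] * p
    sigma2 = []
    for _ in range(n + burn_in):
        # sigma_t^2 = alpha_0 + sum_i alpha_i * y_{t-i}^2
        s2 = alpha0 + sum(a * y[-i] ** 2 for i, a in enumerate(alphas, start=1))
        y.append(math.sqrt(s2) * rng.gauss(0.0, 1.0))  # y_t = sigma_t * eps_t
        sigma2.append(s2)
    return y[p + burn_in:], sigma2[burn_in:]

# ARCH(1) with alpha_0 = 0.7, alpha_1 = 0.3: unconditional variance 0.7 / (1 - 0.3) = 1
y, sigma2 = simulate_arch(alpha0=0.7, alphas=[0.3], n=20000, seed=1)
sample_var = sum(v * v for v in y) / len(y)
print("sample variance:", round(sample_var, 3))
```

The sample variance of a long simulated path should lie close to $\alpha_0 / (1 - \alpha_1)$, which gives a quick sanity check on the recursion.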

Properties of the ARCH(1) model: the parameters $\alpha_0$ and $\alpha_1$ have to be non-negative, and the process is stationary if and only if $\alpha_1 < 1$, with $\mathrm{Var}(y_t) = E(y_t^2) = \alpha_0 / (1 - \alpha_1)$. All the odd moments of $y_t$ are zero by symmetry, while the fourth moment exists if and only if $3\alpha_1^2 < 1$ and is

$$E(y_t^4) = \frac{3\alpha_0^2 (1 - \alpha_1^2)}{(1 - \alpha_1)^2 (1 - 3\alpha_1^2)}.$$

The implied excess kurtosis, $E(y_t^4) / [E(y_t^2)]^2 - 3$, is greater than zero whenever $\alpha_1 > 0$, and hence $y_t$ is leptokurtic (fat tails).

The GARCH(p, q) model: The GARCH(p, q) process is an extension of the ARCH(p) model in which $\sigma_t^2$ also depends on its own lagged values:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{i=1}^{q} \beta_i \sigma_{t-i}^2.$$

The most widely used GARCH model is that of order (1, 1). Sufficient conditions for $\sigma_t^2 \geq 0$ are $\alpha_i \geq 0$, $i = 0, 1$, and $\beta_1 \geq 0$. The GARCH(1, 1) process $y_t$ is zero mean and second-order stationary if and only if $\alpha_1 + \beta_1 < 1$, with $\mathrm{Var}(y_t) = \alpha_0 / (1 - \alpha_1 - \beta_1)$ and all the odd moments equal to zero.

If, in addition, $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$, the fourth moment exists and is equal to

$$E(y_t^4) = \frac{3\alpha_0^2 (1 + \alpha_1 + \beta_1)}{(1 - \alpha_1 - \beta_1)(1 - 3\alpha_1^2 - 2\alpha_1\beta_1 - \beta_1^2)},$$

and $y_t$ exhibits leptokurtosis. A special case of the GARCH(1, 1) model has $\alpha_1 + \beta_1 = 1$, which is called the Integrated GARCH (IGARCH) model.

There exist many other versions of the ARCH-type models, including Exponential GARCH (EGARCH), ARCH-in-Mean (ARCH-M), TGARCH and MGARCH.

Observation-driven models are built out of one-step-ahead prediction densities. These densities allow the likelihood function to be constructed via the prediction error decomposition. Therefore, maximum likelihood estimation of the unknown parameters in the model is in principle straightforward. However, there are also a number of drawbacks to ARCH-type models. First, the parameter constraints, imposed so that the conditional variance $\sigma_t^2$ remains non-negative, are often violated when estimating these coefficients. Second, GARCH models rule out random oscillatory behavior of the conditional variance process.

The ARCH Model; Likelihood Function For ARCH(1) Model; GARCH models; A Constrained GARCH(1,1) Model; The Student-t GARCH(1, 1) Model

In this section, we study the likelihood function for the ARCH and GARCH models and illustrate the Bayesian approach for two univariate GARCH models. The first model is an ordinary GARCH(1,1) and the second is a Student-t GARCH(1,1). For both models, the quantities of interest are $\alpha_1$, $\beta_1$ and their sum $(\alpha_1 + \beta_1)$, which is recognized as a measure of persistence.

The ARCH(1) process is defined as

$$\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2,$$

where $\alpha_0 \geq 0$, $\alpha_1 \geq 0$ are the two parameters about which inference is required. The ARCH(p) process is defined as

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2,$$

where $\alpha_0 \geq 0$, $\alpha_i \geq 0$ are the parameters of the ARCH(p) model.

Summary: The moments of the ARCH(1) model are given as follows.

(i) $E(Y_t) = E(Y_t^3) = 0$.

(ii) The second moment of $Y_t$ is $E(Y_t^2) = \dfrac{\alpha_0}{1 - \alpha_1}$, for $0 \leq \alpha_1 < 1$.

(iii) The fourth moment of $Y_t$ is

$$E(Y_t^4) = 3E(\sigma_t^4) = \frac{3\alpha_0^2 (1 - \alpha_1^2)}{(1 - \alpha_1)^2 (1 - 3\alpha_1^2)}, \quad \text{for } 0 \leq \alpha_1^2 < \tfrac{1}{3}.$$

(iv) The kurtosis of $Y_t$ is given by

$$\kappa = \frac{3(1 - \alpha_1^2)}{1 - 3\alpha_1^2}, \quad \text{for } \alpha_1^2 < \tfrac{1}{3}.$$

If $\alpha_1 = 0$ then $\kappa = 3$, and the distribution is Normal. If $\alpha_1 > 0$ then $\kappa > 3$, and the distribution is heavy-tailed.

(v) The autocorrelation function (ACF) of $Y_t^2$ is given by $\rho_{Y_t^2}(s) = \alpha_1^s$, where $s = 0, 1, \ldots, n$ for all $n \geq 0$.

The variance characteristics are solely dependent on the nature of the parameter $\alpha_1$.
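The moment formulas (i) to (v) are easy to evaluate numerically. The sketch below is illustrative only; the function names `arch1_moments` and `arch1_acf_squares` are hypothetical, and `None` is returned where the fourth moment does not exist.

```python
def arch1_moments(alpha0, alpha1):
    """Unconditional variance, fourth moment and kurtosis of an ARCH(1) process."""
    if not (0 <= alpha1 < 1):
        raise ValueError("stationarity requires 0 <= alpha1 < 1")
    variance = alpha0 / (1 - alpha1)
    if 3 * alpha1 ** 2 < 1:
        kurtosis = 3 * (1 - alpha1 ** 2) / (1 - 3 * alpha1 ** 2)
        fourth = kurtosis * variance ** 2  # E(Y^4) = kappa * E(Y^2)^2
    else:
        kurtosis = fourth = None  # fourth moment does not exist
    return {"variance": variance, "fourth_moment": fourth, "kurtosis": kurtosis}

def arch1_acf_squares(alpha1, max_lag):
    """ACF of Y_t^2 at lags 0..max_lag: rho(s) = alpha1 ** s."""
    return [alpha1 ** s for s in range(max_lag + 1)]

print(arch1_moments(0.2, 0.5))
print(arch1_acf_squares(0.5, 3))
```

With $\alpha_0 = 0.2$ and $\alpha_1 = 0.5$ the kurtosis is $3(1 - 0.25)/(1 - 0.75) = 9$, well above the Normal value of 3, while $\alpha_1 = 0.6$ already violates $3\alpha_1^2 < 1$.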

Stationarity and Persistence in ARCH(1) Variance

Condition for stationarity: $\alpha_1 < 1$; sudden changes to the error variance have an impact that decreases at an exponential rate and eventually dies out in subsequent periods. The conditional variance, $\sigma_t^2$, varies over time and is dependent on past squared error terms. The sequence $Y_t$ is white noise and $Y_t^2$ is an autoregressive process, hence the existence of volatility clustering is partly controlled by $\alpha_1$. Note that $Y_t^2$ is not necessarily covariance stationary; its variance is finite only if $3\alpha_1^2 < 1$.

Persistence: When $\alpha_1 > 1$, shocks to the variance in one period have a more than proportionate impact in subsequent periods: effects in the previous period cause greater shocks in the next, leading to instability in the system. The unconditional variance is not finite, and the conditional variance grows at a more than proportionate rate (dependent on $\alpha_1$) in every subsequent period. The conditional time-varying error variance should always be positive; in the ARCH(1) case we may ensure this by using $\alpha_0^2$ instead of $\alpha_0$, and $\alpha_1^2$ instead of $\alpha_1$, as starting values in the Maximum Likelihood (ML) calculations if these parameters would otherwise be negative. Doing so imposes positive parameter values in the new ML results.

The likelihood for ARCH(1) can be written as

$$f(y \mid y_0, \alpha_0, \alpha_1) = \prod_{t=1}^{n} \left(\frac{1}{2\pi\sigma_t^2}\right)^{1/2} \exp\left(-\frac{y_t^2}{2\sigma_t^2}\right),$$

where $y = (y_1, y_2, \ldots, y_n)$. Thus the log likelihood is

$$\log f(y \mid y_0, \alpha_0, \alpha_1) = \sum_{t=1}^{n} \log f(y_t \mid y_{t-1}, \alpha_0, \alpha_1) = \text{const.} - \frac{1}{2} \sum_{t=1}^{n} \left[\log \sigma_t^2 + \frac{y_t^2}{\sigma_t^2}\right].$$
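The prediction error decomposition translates directly into a loop over observations. The following is a minimal sketch, with the hypothetical name `arch1_loglik`; it conditions on the initial value $y_0$ and includes the constant $-\tfrac{n}{2}\log 2\pi$.

```python
import math

def arch1_loglik(y, y0, alpha0, alpha1):
    """Gaussian ARCH(1) log likelihood, conditional on the initial value y0."""
    loglik = 0.0
    prev = y0
    for yt in y:
        s2 = alpha0 + alpha1 * prev ** 2  # sigma_t^2 depends on y_{t-1}
        loglik += -0.5 * (math.log(2 * math.pi) + math.log(s2) + yt * yt / s2)
        prev = yt
    return loglik

# With alpha1 = 0 the model reduces to i.i.d. N(0, alpha0) observations.
print(arch1_loglik([0.0, 0.0], y0=0.0, alpha0=1.0, alpha1=0.0))
```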

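To obtain the ML estimates we differentiate with respect to $\alpha_0$ and $\alpha_1$ respectively to obtain the score equations:

$$\frac{\partial \log f}{\partial \alpha_0} = \frac{1}{2} \sum_{t=1}^{n} \left(\frac{\partial \sigma_t^2}{\partial \alpha_0}\right) \frac{1}{\sigma_t^2} \left(\frac{y_t^2}{\sigma_t^2} - 1\right), \qquad \frac{\partial \log f}{\partial \alpha_1} = \frac{1}{2} \sum_{t=1}^{n} \left(\frac{\partial \sigma_t^2}{\partial \alpha_1}\right) \frac{1}{\sigma_t^2} \left(\frac{y_t^2}{\sigma_t^2} - 1\right),$$

where

$$\frac{\partial \sigma_t^2}{\partial \alpha_0} = 1, \qquad \frac{\partial \sigma_t^2}{\partial \alpha_1} = y_{t-1}^2.$$

For ARCH(p), $(\alpha_0, \alpha_1)^T$ becomes $(\alpha_0, \alpha_1, \ldots, \alpha_p)^T$, so

$$\left(\frac{\partial \sigma_t^2}{\partial \alpha_0}, \ldots, \frac{\partial \sigma_t^2}{\partial \alpha_p}\right)^T = \left(1, y_{t-1}^2, \ldots, y_{t-p}^2\right)^T.$$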
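One way to check the score equations is to compare them with a central finite difference of the log likelihood. The sketch below does this for ARCH(1); the function names and the small data vector are made up for illustration, and the additive constant is dropped from the log likelihood since it does not affect derivatives.

```python
import math

def arch1_loglik(y, y0, alpha0, alpha1):
    """ARCH(1) log likelihood up to an additive constant."""
    loglik, prev = 0.0, y0
    for yt in y:
        s2 = alpha0 + alpha1 * prev ** 2
        loglik -= 0.5 * (math.log(s2) + yt * yt / s2)
        prev = yt
    return loglik

def arch1_score(y, y0, alpha0, alpha1):
    """Analytic scores (d/d alpha0, d/d alpha1) of the ARCH(1) log likelihood."""
    g0 = g1 = 0.0
    prev = y0
    for yt in y:
        s2 = alpha0 + alpha1 * prev ** 2
        w = 0.5 * (yt * yt / s2 - 1.0) / s2
        g0 += w * 1.0        # d sigma_t^2 / d alpha0 = 1
        g1 += w * prev ** 2  # d sigma_t^2 / d alpha1 = y_{t-1}^2
        prev = yt
    return g0, g1

y_toy = [0.3, -1.2, 0.8, 0.1, -0.5]  # illustrative data only
a0, a1, h = 0.5, 0.3, 1e-6
g0, g1 = arch1_score(y_toy, 0.4, a0, a1)
fd0 = (arch1_loglik(y_toy, 0.4, a0 + h, a1) - arch1_loglik(y_toy, 0.4, a0 - h, a1)) / (2 * h)
fd1 = (arch1_loglik(y_toy, 0.4, a0, a1 + h) - arch1_loglik(y_toy, 0.4, a0, a1 - h)) / (2 * h)
print(abs(g0 - fd0) < 1e-5, abs(g1 - fd1) < 1e-5)
```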

The GARCH(1, 1) model

The GARCH(1, 1) process is defined by

$$\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \beta_1 \sigma_{t-1}^2,$$

with parameters $\alpha_0 \geq 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$. The GARCH(p, q) process is defined by

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2,$$

where $\alpha_0 \geq 0$, $\alpha_i \geq 0$, $\beta_j \geq 0$ are the parameters of the GARCH(p, q) model.

The moments of the GARCH(1,1) model take the following values:

(i) $E(Y_t) = E(Y_t^3) = 0$.

(ii) If $0 \leq \alpha_1 + \beta_1 < 1$, $E(Y_t^2) = \dfrac{\alpha_0}{1 - \alpha_1 - \beta_1}$.

(iii) If $0 \leq \alpha_1 + \beta_1 < 1$ and $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$,

$$E(Y_t^4) = \frac{3\alpha_0^2 (1 + \alpha_1 + \beta_1)}{(1 - \alpha_1 - \beta_1)(1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2)}.$$

The fourth moment does not exist when the sum $\alpha_1 + \beta_1$ is close to one and the value of $\alpha_1$ is not close to zero.

(iv) If $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$, the kurtosis is

$$\kappa = \frac{3(1 + \alpha_1 + \beta_1)(1 - \alpha_1 - \beta_1)}{1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2}.$$

When $\beta_1 = 0$, this condition is the same as for the ARCH(1) model, but when $\beta_1 > 0$, $\alpha_1$ has to be lower than $1/\sqrt{3}$. For example, in the typical case where $\alpha_1$ is not close to zero and $\beta_1$ is near one, $\kappa$ does not exist.

(v) The ACF of $Y_t^2$ is given by

$$\rho_1 = \frac{\alpha_1 (1 - \beta_1^2 - \alpha_1\beta_1)}{1 - \beta_1^2 - 2\alpha_1\beta_1}, \qquad \rho_s = (\alpha_1 + \beta_1)\rho_{s-1}, \quad \text{for } s \geq 2.$$

Clearly $\rho_s$ depends on the values of $\alpha_1$ and $\beta_1$. The ACF declines geometrically at the rate $\alpha_1 + \beta_1$. If $\alpha_1$ is sufficiently small and the sum $\alpha_1 + \beta_1$ is close to one, then there exists a slowly decreasing autocorrelation function with finite kurtosis.
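The GARCH(1,1) moment results (ii) to (v) can be collected into one small helper. This is a sketch for illustration; the name `garch11_summary` is hypothetical, and the kurtosis and ACF are reported as `None` when the fourth-moment condition fails.

```python
def garch11_summary(alpha0, alpha1, beta1, max_lag=5):
    """Variance, kurtosis and ACF of Y_t^2 (lags 1..max_lag) for GARCH(1,1)."""
    persistence = alpha1 + beta1
    if not (alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and persistence < 1):
        raise ValueError("need alpha0 > 0, alpha1, beta1 >= 0, alpha1 + beta1 < 1")
    variance = alpha0 / (1 - persistence)
    denom = 1 - beta1 ** 2 - 2 * alpha1 * beta1 - 3 * alpha1 ** 2
    if denom <= 0:
        # 3 a1^2 + 2 a1 b1 + b1^2 >= 1: fourth moment does not exist
        return {"variance": variance, "kurtosis": None, "acf": None}
    kurtosis = 3 * (1 + persistence) * (1 - persistence) / denom
    rho = alpha1 * (1 - beta1 ** 2 - alpha1 * beta1) / (1 - beta1 ** 2 - 2 * alpha1 * beta1)
    acf = [rho]
    for _ in range(max_lag - 1):
        acf.append(persistence * acf[-1])  # rho_s = (alpha1 + beta1) * rho_{s-1}
    return {"variance": variance, "kurtosis": kurtosis, "acf": acf}

print(garch11_summary(0.1, 0.1, 0.8))    # small alpha1: finite kurtosis, slow ACF decay
print(garch11_summary(0.1, 0.3, 0.69))   # alpha1 not small, sum near one: no kurtosis
```

Setting `beta1 = 0` recovers the ARCH(1) values, for example kurtosis 9 and $\rho_1 = 0.5$ at $\alpha_1 = 0.5$.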

Stationarity and Persistence in GARCH(1,1) Volatility

The stationarity of the GARCH(p, q) model is ensured if the coefficients in the conditional variance equation sum to less than one (i.e. $\alpha_1 + \ldots + \alpha_p + \beta_1 + \ldots + \beta_q < 1$), in which case the unconditional variance of $Y_t$,

$$\frac{\alpha_0}{1 - (\alpha_1 + \ldots + \alpha_p + \beta_1 + \ldots + \beta_q)},$$

is a finite constant. In this case, shocks to the variance term do not have a permanent effect, but fade over time.

For the GARCH(1, 1) model, the following is known:

(i) If $\alpha_1 + \beta_1 < 1$, under normality of the residual errors,

$$\mathrm{Var}(Y_t) = \frac{\alpha_0}{1 - (\alpha_1 + \beta_1)} = \text{const.}$$

and $\mathrm{Cov}(Y_t, Y_s) = 0$ for $t \neq s$.

(ii) If $\alpha_1 + \beta_1 \approx 1$, the ACF will decay quite slowly, indicating a relatively slow change in conditional variance. This has often been observed in practice, especially with high-frequency data. It indicates that a shock at time $t$ will persist for many future periods.

(iii) If $\alpha_1 + \beta_1 = 1$, then a shock at time $t$ will lead to a permanent change in all future periods; this is the Integrated GARCH (IGARCH) model, where the conditional variance is non-stationary and the unconditional variance does not exist.

(iv) If $\alpha_1 + \beta_1 > 1$, then a shock at time $t$ will have a destabilizing effect, not only leading to a permanent change in future periods, but reinforcing itself over time.

It is widely thought that the GARCH(1, 1) is broadly an adequate model that has been successfully used in a wide range of volatility modelling situations; it is a simple model, and thus avoids the problems of overfitting, and yet has been found to have the main features present in more complex models.
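Cases (i) to (iv) can be illustrated through the multi-step variance forecast, which for GARCH(1,1) satisfies the standard recursion $E(\sigma_{t+h}^2 \mid \Psi_t) = \alpha_0 + (\alpha_1 + \beta_1) E(\sigma_{t+h-1}^2 \mid \Psi_t)$ for $h \geq 2$. The sketch below, with a hypothetical function name and made-up parameter values, shows how a shocked variance level evolves under different persistence values.

```python
def variance_forecasts(sigma2_next, alpha0, alpha1, beta1, horizon):
    """Forecasts of sigma^2 at horizons 1..horizon, given sigma^2_{t+1}."""
    forecasts = [sigma2_next]
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta1) * forecasts[-1])
    return forecasts

# Start each case from a shocked variance level of 2.0.
decaying = variance_forecasts(2.0, 0.1, 0.1, 0.8, 50)    # alpha1 + beta1 = 0.9 < 1
permanent = variance_forecasts(2.0, 0.0, 0.1, 0.9, 50)   # IGARCH with alpha0 = 0
explosive = variance_forecasts(2.0, 0.0, 0.2, 0.9, 50)   # alpha1 + beta1 > 1
print(round(decaying[-1], 4), permanent[-1], round(explosive[-1], 1))
```

With persistence 0.9 the forecast reverts towards the unconditional variance $\alpha_0 / (1 - \alpha_1 - \beta_1) = 1$; with persistence exactly one the shock never fades; above one it grows geometrically.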

Likelihood Function For GARCH(1,1) Model

An ML estimation structure can be constructed for all GARCH-type models; it is identical to that for the ARCH model, with the addition of a score equation for $\beta_1$:

$$\frac{\partial \log f}{\partial \beta_1} = \frac{1}{2} \sum_{t=1}^{n} \left(\frac{\partial \sigma_t^2}{\partial \beta_1}\right) \frac{1}{\sigma_t^2} \left(\frac{y_t^2}{\sigma_t^2} - 1\right), \quad \text{where} \quad \frac{\partial \sigma_t^2}{\partial \beta_1} = \sigma_{t-1}^2 + \beta_1 \frac{\partial \sigma_{t-1}^2}{\partial \beta_1}.$$
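The recursive derivative can be carried along with the variance recursion in a single pass. The following sketch assumes that the initial variance $\sigma_0^2$ is a fixed number that does not depend on $\beta_1$ (so its derivative is zero); that initialisation, the function names and the sample data are assumptions made for illustration.

```python
import math

def garch11_loglik(y, y0, sigma2_0, alpha0, alpha1, beta1):
    """GARCH(1,1) log likelihood up to an additive constant."""
    prev_y, prev_s2, loglik = y0, sigma2_0, 0.0
    for yt in y:
        s2 = alpha0 + alpha1 * prev_y ** 2 + beta1 * prev_s2
        loglik -= 0.5 * (math.log(s2) + yt * yt / s2)
        prev_y, prev_s2 = yt, s2
    return loglik

def garch11_beta_score(y, y0, sigma2_0, alpha0, alpha1, beta1):
    """Score with respect to beta1, using the recursive derivative of sigma_t^2."""
    prev_y, prev_s2, prev_d = y0, sigma2_0, 0.0  # d sigma_0^2 / d beta1 = 0 (assumed)
    score = 0.0
    for yt in y:
        s2 = alpha0 + alpha1 * prev_y ** 2 + beta1 * prev_s2
        d = prev_s2 + beta1 * prev_d  # d sigma_t^2 / d beta1
        score += 0.5 * (d / s2) * (yt * yt / s2 - 1.0)
        prev_y, prev_s2, prev_d = yt, s2, d
    return score

y_toy = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4]  # illustrative data only
args = (y_toy, 0.4, 1.0, 0.2, 0.1)
h = 1e-6
fd = (garch11_loglik(*args, 0.7 + h) - garch11_loglik(*args, 0.7 - h)) / (2 * h)
print(abs(garch11_beta_score(*args, 0.7) - fd) < 1e-5)
```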

To obtain the ML estimates, we need to compute the partial derivatives numerically and recursively for $t = 1, \ldots, n$. Unlike ARCH(p), ML for GARCH(1,1) is more complicated than just implementing the previous procedure, because of the recursive term in the score equation for $\beta_1$. The resulting estimators are consistent and asymptotically normal. Quasi Maximum Likelihood (QML) estimation may also be used; the QML estimates are asymptotically normally distributed and are in practice close to the ML estimates.

Bayesian inference For GARCH(1,1) Model

The Bayesian posterior distribution is

$$p(\alpha_0, \alpha_1, \beta_1 \mid Y) \propto \prod_{t=1}^{n} \left(\frac{1}{2\pi\sigma_t^2}\right)^{1/2} \exp\left(-\frac{y_t^2}{2\sigma_t^2}\right) \times \alpha_0^{-1} \exp\left(-\frac{(\log \alpha_0)^2}{2\sigma_{\alpha_0}^2}\right) \alpha_1^{\gamma_1 - 1} \beta_1^{\gamma_2 - 1} (1 - \alpha_1 - \beta_1)^{\gamma_3 - 1}$$

$$\propto \exp\left\{-\frac{1}{2} \sum_{t=1}^{n} \left(\log \sigma_t^2 + \frac{y_t^2}{\sigma_t^2}\right) - \frac{(\log \alpha_0)^2}{2\sigma_{\alpha_0}^2}\right\} \alpha_0^{-1} \alpha_1^{\gamma_1 - 1} \beta_1^{\gamma_2 - 1} (1 - \alpha_1 - \beta_1)^{\gamma_3 - 1}.$$
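For simulation-based inference it is usually the log of this unnormalised posterior that is evaluated. The sketch below codes the expression above term by term; the prior factors have the form of a log-normal kernel for $\alpha_0$ and a Dirichlet kernel for $(\alpha_1, \beta_1, 1 - \alpha_1 - \beta_1)$. The default hyperparameter values, the fixed initial variance `sigma2_0` and the function name are assumptions, since the slides do not state them.

```python
import math

def garch11_log_posterior(params, y, y0, sigma2_0,
                          s_alpha0=1.0, gammas=(1.0, 1.0, 1.0)):
    """Unnormalised log posterior for (alpha0, alpha1, beta1).

    s_alpha0 and gammas are illustrative hyperparameter values (assumed).
    """
    alpha0, alpha1, beta1 = params
    if alpha0 <= 0 or alpha1 <= 0 or beta1 <= 0 or alpha1 + beta1 >= 1:
        return -math.inf  # outside the support of the prior
    g1, g2, g3 = gammas
    # Log likelihood, dropping the constant -n/2 * log(2*pi).
    prev_y, prev_s2, loglik = y0, sigma2_0, 0.0
    for yt in y:
        s2 = alpha0 + alpha1 * prev_y ** 2 + beta1 * prev_s2
        loglik -= 0.5 * (math.log(s2) + yt * yt / s2)
        prev_y, prev_s2 = yt, s2
    log_prior = (-math.log(alpha0)
                 - math.log(alpha0) ** 2 / (2 * s_alpha0 ** 2)
                 + (g1 - 1) * math.log(alpha1)
                 + (g2 - 1) * math.log(beta1)
                 + (g3 - 1) * math.log(1 - alpha1 - beta1))
    return loglik + log_prior

print(garch11_log_posterior((1.0, 0.1, 0.5), [1.0], y0=0.0, sigma2_0=1.0))
```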

Example: FX returns for twelve series in Far Eastern and other currencies, comprising daily data and hourly data taken around the time of the market crash in the late nineties. The following results come from a Bayesian analysis via Markov chain Monte Carlo (MCMC). The MCMC algorithm was run for 2,500,000 iterations, and parameters were recorded at every 500th iteration.

[Figure: Daily JPY GARCH(1,1). Panels for $\alpha_1$, $\beta_1$ and $\alpha_1 + \beta_1$: sampled values plotted against Runs (0 to 5000), and the corresponding histograms.]

Posterior statistics of GARCH(1,1) model for 12 FX series (D- daily, H- hourly). Each cell gives Mean, Median, Std.

Series   $\alpha_1$                  $\beta_1$                   $(\alpha_1 + \beta_1)$
D-THB    0.9036, 0.9039, 0.0091      0.0961, 0.0957, 0.0091      0.9997, 0.9998, 0.0004
D-SGD    0.0347, 0.0199, 0.0409      0.6248, 0.6403, 0.2144      0.6594, 0.6743, 0.2097
D-JPY    0.9159, 0.9181, 0.0160      0.0714, 0.0701, 0.0138      0.9872, 0.9883, 0.0079
D-HKD    0.0355, 0.0204, 0.0421      0.5823, 0.5804, 0.2208      0.6177, 0.6147, 0.2166
D-GBP    0.0441, 0.0289, 0.0469      0.2045, 0.1925, 0.0740      0.2486, 0.2381, 0.0810
D-CHF    0.2244, 0.1365, 0.2402      0.1127, 0.1130, 0.0419      0.3371, 0.2622, 0.2155
D-CAD    0.9223, 0.9233, 0.0096      0.0707, 0.0702, 0.0095      0.9930, 0.9940, 0.0052
D-AUD    0.0394, 0.0239, 0.0441      0.4568, 0.4224, 0.2047      0.4963, 0.4641, 0.2003
H-THB    0.0269, 0.0174, 0.0284      0.6468, 0.6416, 0.0981      0.7295, 0.6684, 0.0973
H-SGD    0.2945, 0.2932, 0.0363      0.6845, 0.6866, 0.0414      0.9790, 0.9849, 0.0192
H-JPY    0.9193, 0.9200, 0.0103      0.0797, 0.0789, 0.0102      0.9990, 0.9993, 0.0010
H-HKD    0.6618, 0.6621, 0.0350      0.3182, 0.3183, 0.0368      0.9800, 0.9838, 0.0162

The table above contains posterior summaries for the three parameters in the GARCH(1, 1) model for all FX series. To explore the stability and persistence of the GARCH(p, q) model, the sum $\alpha_1 + \beta_1$ should be examined. From the table, five data series (D-THB, D-JPY, D-CAD, H-JPY, H-SGD) yield values of $(\alpha_1 + \beta_1)$ very close to one. In addition, the estimated values of $\alpha_1$ are close to one and those of $\beta_1$ are close to zero. Thus there exists considerable persistence in volatility, moving towards non-stationarity.

We introduce a constrained model to ensure the existence of higher-order moments; the kurtosis exists for the observable GARCH(1, 1) process only when the inequality $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$ holds; further, the fourth moment only exists for a certain range of values of $\alpha_1$, $\beta_1$. The additional constraint can be explicitly incorporated into the MCMC simulation scheme: we reject points generated by the proposal mechanism that violate the constraint. Note that such constraints are typically problematic in conventional (non-simulation-based) classical and Bayesian inference.
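The rejection idea can be sketched with a random-walk Metropolis sampler. The slides do not specify the proposal mechanism or the sampler actually used, so everything below is an assumption for illustration: the Gaussian random-walk proposal, the flat prior on the admissible region, the start of the variance recursion at the unconditional variance, the toy data and all function names.

```python
import math
import random

def kurtosis_exists(alpha1, beta1):
    """Fourth-moment condition 3*a1^2 + 2*a1*b1 + b1^2 < 1."""
    return 3 * alpha1 ** 2 + 2 * alpha1 * beta1 + beta1 ** 2 < 1

def log_target(theta, y, constrained):
    """Log posterior up to a constant, under a flat prior (an assumption)."""
    alpha0, alpha1, beta1 = theta
    if alpha0 <= 0 or alpha1 < 0 or beta1 < 0 or alpha1 + beta1 >= 1:
        return -math.inf
    if constrained and not kurtosis_exists(alpha1, beta1):
        return -math.inf
    s2 = alpha0 / (1 - alpha1 - beta1)  # start at the unconditional variance
    loglik = 0.0
    for yt in y:
        loglik -= 0.5 * (math.log(s2) + yt * yt / s2)
        s2 = alpha0 + alpha1 * yt * yt + beta1 * s2
    return loglik

def metropolis(y, start, n_iter, thin, step=0.02, constrained=True, seed=0):
    """Random-walk Metropolis; records the state at every `thin`-th iteration."""
    rng = random.Random(seed)
    theta = tuple(start)
    lp = log_target(theta, y, constrained)
    if lp == -math.inf:
        raise ValueError("start must satisfy the constraints")
    draws = []
    for it in range(1, n_iter + 1):
        prop = tuple(v + rng.gauss(0.0, step) for v in theta)
        lp_prop = log_target(prop, y, constrained)
        # Proposals violating a constraint have lp_prop = -inf and are rejected.
        if lp_prop > -math.inf and math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if it % thin == 0:
            draws.append(theta)
    return draws

data_rng = random.Random(42)
y = [data_rng.gauss(0.0, 1.0) for _ in range(200)]  # toy data, not FX returns
draws = metropolis(y, start=(0.5, 0.2, 0.3), n_iter=5000, thin=50)
print(len(draws), all(kurtosis_exists(a1, b1) for _, a1, b1 in draws))
```

Because the proposal is symmetric, rejecting constraint-violating points is equivalent to giving them zero posterior density, so every recorded draw satisfies the constraint. With 2,500,000 iterations and `thin=500`, the same scheme would record 5,000 draws.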

[Figure: Daily JPY GARCH(1,1): Constrained Model. Panels for $\alpha_1$, $\beta_1$ and $\alpha_1 + \beta_1$: sampled values plotted against Runs (0 to 5000), and the corresponding histograms.]

Posterior statistics for the constrained model of GARCH(1,1) for 12 FX series. Each cell gives Mean, Median, Std.

Series   $\alpha_1$                  $\beta_1$                     $(\alpha_1 + \beta_1)$
D-THB    0.3838, 0.3837, 0.0086      0.4551, 0.4558, 0.0164        0.8389, 0.8394, 0.0079
D-SGD    0.0358, 0.0214, 0.0214      0.6170, 0.622788, 0.6228      0.6528, 0.6582, 0.2078
D-JPY    0.4200, 0.4481, 0.0839      0.2479, 0.2450, 0.0435        0.6679, 0.6878, 0.0715
D-HKD    0.0356, 0.0210, 0.0413      0.5841, 0.5859, 0.2207        0.6197, 0.6182, 0.2140
D-GBP    0.0418, 0.0251, 0.0472      0.2025, 0.1924, 0.0732        0.2443, 0.2344, 0.0820
D-CHF    0.1452, 0.1120, 0.1216      0.1213, 0.1182, 0.0383        0.2665, 0.2431, 0.1111
D-CAD    0.4803, 0.4974, 0.0548      0.1517, 0.1503, 0.0288        0.6319, 0.6444, 0.0498
D-AUD    0.0405, 0.0254, 0.0449      0.4561, 0.4277, 0.1990        0.4966, 0.4721, 0.1950
H-THB    0.2566, 0.2561, 0.0280      0.6586, 0.6603, 0.0450        0.9152, 0.9173, 0.0205
H-SGD    0.0285, 0.0193, 0.0293      0.6480, 0.644049, 0.0994      0.6765, 0.6737, 0.0987
H-JPY    0.4227, 0.4235, 0.0186      0.3708, 0.369196, 0.0378      0.7934, 0.7930, 0.0199
H-HKD    0.4315, 0.4326, 0.0192      0.3484, 0.3475, 0.0380        0.7800, 0.7797, 0.0205

Results: The posterior statistics for this constrained GARCH(1, 1) model are displayed above. For no currency is the estimated value of $(\alpha_1 + \beta_1)$ very close to 1, although H-THB has the highest estimated posterior mean, 0.9152. We conclude that this constrained model, in which the existence of the kurtosis is required, produces very different parameter estimates; this may have serious consequences for prediction.

The Student-t GARCH(1, 1) Model

The leptokurtosis of the observed returns series can be modelled explicitly. The Student-t GARCH(1, 1) model can be formulated as

$$Y_t = \varepsilon_t \sqrt{\sigma_t^2}, \qquad \varepsilon_t \sim N(0, k\lambda_t), \qquad \lambda_t \sim \mathrm{IGamma}\left(\frac{\nu}{2}, \frac{\nu}{2}\right),$$

and for stationarity, $0 < \alpha_1 + \beta_1 < 1$.

The parameters $\lambda_t$, $t = 1, \ldots, n$, modify the model so that

$$Y_t \mid \sigma_t^2 \sim \mathrm{St}(0, k\sigma_t^2, \nu),$$

where $\nu$ takes some positive value, and $k$ is a constant term. For the conditional variance of $Y_t$ to be finite, we require $\nu > 2$. Again, choosing the constant term $k = (\nu - 2)/\nu$ ensures that the conditional variance of $y_t$ remains $\sigma_t^2$, and setting each $\lambda_t = 1$ recovers the original GARCH(1, 1) model.
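The scale-mixture construction can be checked by simulation. The sketch below assumes that IGamma$(\nu/2, \nu/2)$ is parameterised by shape and scale, so that $1/\lambda_t$ is Gamma with shape $\nu/2$ and rate $\nu/2$; under that assumption $E(\lambda_t) = \nu/(\nu - 2)$, and $k = (\nu - 2)/\nu$ gives the innovation unit variance. The function name is hypothetical.

```python
import math
import random

def t_garch_innovation(nu, rng):
    """Draw eps ~ N(0, k * lam) with lam ~ IGamma(nu/2, nu/2) and k = (nu - 2) / nu."""
    if nu <= 2:
        raise ValueError("a finite variance requires nu > 2")
    k = (nu - 2) / nu
    # random.gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
    lam = 1.0 / rng.gammavariate(nu / 2, 2 / nu)
    return rng.gauss(0.0, math.sqrt(k * lam))

rng = random.Random(7)
eps = [t_garch_innovation(10, rng) for _ in range(40000)]
eps_var = sum(e * e for e in eps) / len(eps)
print("sample variance of innovations:", round(eps_var, 3))
```

The sample variance should lie close to one, so that multiplying by $\sqrt{\sigma_t^2}$ leaves the conditional variance of $Y_t$ equal to $\sigma_t^2$.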

For $4 < \nu < \infty$, the conditional kurtosis for the t-GARCH(1,1) model is $3(\nu - 2)/(\nu - 4)$, which is greater than that of a normal. The kurtosis for the Student-t GARCH(1,1) only exists if $\nu > 4$. As $\nu \to \infty$, the Student density tends to a normal. All odd moments are zero.
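As a worked check of the kurtosis formula, the values at $\nu = 5$ and, for comparison, $\nu = 10$ are:

```latex
\kappa(\nu) = \frac{3(\nu - 2)}{\nu - 4}, \qquad
\kappa(5) = \frac{3 \cdot 3}{1} = 9, \qquad
\kappa(10) = \frac{3 \cdot 8}{6} = 4, \qquad
\lim_{\nu \to \infty} \kappa(\nu) = 3 .
```

So a small number of degrees of freedom implies a conditional kurtosis far above the Normal value of 3.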

Results for the t-GARCH(1,1) Model.

Posterior statistics of t-GARCH(1,1) model with $\nu = 5$ for 12 FX series. Each cell gives Mean, Median, Std.

Series   $\alpha_1$                  $\beta_1$                   $(\alpha_1 + \beta_1)$
D-THB    0.8672, 0.8677, 0.0148      0.1252, 0.1247, 0.0157      0.9924, 0.9924, 0.0048
D-SGD    0.5928, 0.5948, 0.0361      0.2628, 0.2612, 0.0310      0.8556, 0.8561, 0.0289
D-JPY    0.9507, 0.9521, 0.0107      0.0250, 0.0244, 0.0057      0.9757, 0.9769, 0.0072
D-HKD    0.3955, 0.3947, 0.0275      0.5754, 0.5776, 0.0347      0.9709, 0.9760, 0.0224
D-GBP    0.0118, 0.0057, 0.0165      0.1150, 0.1124, 0.0326      0.1268, 0.1246, 0.0348
D-CHF    0.6865, 0.9027, 0.3607      0.0382, 0.0282, 0.0258      0.7247, 0.9291, 0.3399
D-CAD    0.9337, 0.9352, 0.0140      0.0380, 0.0373, 0.0080      0.9716, 0.9729, 0.0099
D-AUD    0.1207, 0.0097, 0.0347      0.0264, 0.1189, 0.0320      0.1471, 0.1432, 0.0433
H-THB    0.2597, 0.2589, 0.0578      0.4477, 0.4441, 0.0673      0.7074, 0.7062, 0.0670
H-SGD    0.2352, 0.2340, 0.0499      0.4140, 0.4097, 0.0625      0.6493, 0.6488, 0.0579
H-JPY    0.9086, 0.9103, 0.0181      0.0683, 0.0668, 0.0154      0.9769, 0.9775, 0.0084
H-HKD    0.2308, 0.2259, 0.0767      0.4339, 0.4308, 0.0629      0.6646, 0.6644, 0.0732

The Student-t GARCH(1,1) Model with $\nu$ unknown

For the Bayesian t-GARCH(1,1) model, if $\nu$ is also included as an unknown parameter, inference can be made about it as well.

Posterior statistics for $\nu$ in t-GARCH(1,1) model ($\nu$ unknown) for 12 FX series.

Series   $\nu$ (Mean, Median, Std)
D-THB    6.8834, 6.8834, 0.3640
D-SGD    6.9735, 6.9735, 0.3619
D-JPY    8.3251, 8.3251, 0.5570
D-HKD    5.0460, 5.0460, 0.1667
D-GBP    7.5149, 7.5149, 0.4321
D-CHF    8.7967, 8.7967, 0.6852
D-CAD    9.7992, 9.7992, 0.8881
D-AUD    7.2669, 7.2669, 0.3898
H-THB    6.5218, 6.5218, 0.3756
H-SGD    6.3907, 6.3907, 0.3614
H-JPY    8.1712, 8.1712, 0.6470
H-HKD    6.7072, 6.7071, 0.4109

Properties of the Stochastic Volatility Model; Inference for the Stochastic Volatility Model; Quasi-Maximum Likelihood (QML) estimation

The main alternative to ARCH-type models is the stochastic volatility (SV) model, a class of parameter-driven models that allows the variance of the observations to be an unobserved random process. SV models overcome the drawbacks encountered with GARCH models and fit more naturally into the theoretical framework within which much of modern finance theory has been developed. In particular, SV models can easily be seen to have simple continuous-time analogues used for option pricing.

The most popular SV model is

$$y_t = \exp(h_t/2)\,\varepsilon_t, \qquad h_t = \gamma + \phi h_{t-1} + \eta_t,$$

where $y_t$ is, as usual, the observation at time $t$, the $\varepsilon_t$ are independent and identically distributed (i.i.d.) $N(0, 1)$ random variables, and the $\eta_t$ are i.i.d. $N(0, \sigma_\eta^2)$ random variables. The latent process $h_t$ can be interpreted as the random and uneven flow of new information into the market, and $\phi$ measures the persistence in the volatility.

Leverage: The advantage of using SV models lies in the greater flexibility they provide in describing stylized facts such as leverage, which causes the conditional variance to respond asymmetrically to rises and falls in $y_t$. More precisely, falling stock prices cause the debt-to-equity ratio of firms to increase, which entails more uncertainty and in turn increased volatility, whereas rising stock prices decrease a firm's debt-to-equity ratio and increase investors' confidence, leading to lower levels of volatility. The leverage effect cannot be described by the ARCH or GARCH model, because the conditional variance depends only on the size of the lagged $y_t$ and not on their sign; however, it can be captured by the EGARCH model.

Persistence in Volatility: Measures of asset-return volatility, such as squared returns, have been found to have quite high autocorrelations at long lags. The SV model can capture this phenomenon very easily. As has already been mentioned, the parameter $\phi$ in the AR process is interpreted as the persistence in the volatility, and the restriction $|\phi| < 1$ is typically imposed to ensure that the series $h_t$ of log-volatilities is stationary. Most studies in the SV literature have found evidence of near unit root behavior of the process $h_t$, with values of $\phi$ ranging from 0.8 to 0.995, demonstrating that the volatility of asset returns is indeed highly persistent. However, $h_t$ can also be allowed to follow a random walk by setting $\phi = 1$.

Properties of the Stochastic Volatility Model

For simplicity, the error processes $\varepsilon_t$ and $\eta_t$ in the SV model are initially presumed independent. If $|\phi| < 1$, the process $\{h_t\}$ is strictly stationary with unconditional mean and variance given respectively by

$$\mu_h = E(h_t) = \frac{\gamma}{1 - \phi}, \qquad \sigma_h^2 = \operatorname{Var}(h_t) = \frac{\sigma_\eta^2}{1 - \phi^2}.$$

Since $y_t$ is the product of two processes, $\varepsilon_t$ and $\exp(h_t/2)$, and $\varepsilon_t$ is always stationary, $y_t$ is stationary if and only if $h_t$ is stationary.
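The model and its stationary moments can be illustrated with a short simulation. The following is a minimal sketch using only the Python standard library; the function name `simulate_sv`, the `seed` argument, and the choice to draw the initial state from the stationary distribution are assumptions made for illustration, not part of the lecture material.

```python
import math
import random


def simulate_sv(T, gamma, phi, sigma_eta, seed=0):
    """Simulate y_t = exp(h_t / 2) * eps_t with h_t = gamma + phi * h_{t-1} + eta_t.

    Assumes |phi| < 1 so that the initial state can be drawn from the
    stationary N(mu_h, sigma_h^2) distribution (an illustrative choice).
    """
    if not abs(phi) < 1.0:
        raise ValueError("this sketch assumes |phi| < 1")
    rng = random.Random(seed)
    mu_h = gamma / (1.0 - phi)                       # unconditional mean of h_t
    sd_h = sigma_eta / math.sqrt(1.0 - phi ** 2)     # unconditional std of h_t
    h_prev = rng.gauss(mu_h, sd_h)                   # h_0 from the stationary law
    y, h = [], []
    for _ in range(T):
        h_t = gamma + phi * h_prev + rng.gauss(0.0, sigma_eta)
        y.append(math.exp(h_t / 2.0) * rng.gauss(0.0, 1.0))
        h.append(h_t)
        h_prev = h_t
    return y, h


if __name__ == "__main__":
    y, h = simulate_sv(T=1000, gamma=0.0, phi=0.9, sigma_eta=math.sqrt(0.1))
    print("sample mean of h:", sum(h) / len(h))
```

With $\gamma = 0$ the sample mean of the simulated $h_t$ should lie close to $\mu_h = 0$, although with a persistent process the sampling error can be noticeable in short samples.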

Then $E(y_t) = E(y_t \mid \Psi_{t-1}) = 0$, so that $y_t$ has zero mean, and for any lag $\tau \geq 1$ the autocovariance of $y_t$ is

$$E(y_t y_{t-\tau}) = E\left[\exp\left(\frac{h_t}{2} + \frac{h_{t-\tau}}{2}\right)\right] E(\varepsilon_t \varepsilon_{t-\tau}) = 0,$$

so the autocorrelation function (ACF) $\rho_{y_t}(\tau)$ is zero at all non-zero lags. Thus the series $y_t$ is a martingale difference. Furthermore, if the distribution of $\varepsilon_t$ is symmetric, it follows that all the odd moments of $y_t$ are zero.

By assumption, $\exp(h_t)$ is log-normally distributed, so from standard properties of the log-normal distribution we have

$$E\left[\exp(h_t)^j\right] = \exp\left\{ j\mu_h + \tfrac{1}{2} j^2 \sigma_h^2 \right\},$$

so that, if $r$ is even and $h_t$ is stationary, all the even moments of $y_t$ exist and are given by

$$E(y_t^r) = E\left[\exp(h_t)^{r/2}\right] E(\varepsilon_t^r) = \exp\left\{ \frac{r}{2}\mu_h + \frac{r^2}{8}\sigma_h^2 \right\} \frac{r!}{2^{r/2}\,(r/2)!}.$$

In particular,

$$\operatorname{Var}(y_t) = E(y_t^2) = \exp\left\{ \mu_h + \tfrac{1}{2}\sigma_h^2 \right\},$$

and hence, if $h_t$ is stationary, $y_t$ is a white-noise process. The fourth moment is

$$E(y_t^4) = 3 \exp\left\{ 2\mu_h + 2\sigma_h^2 \right\},$$

and so the excess kurtosis of $y_t$ is

$$\frac{E(y_t^4)}{\left[E(y_t^2)\right]^2} - 3 = 3\left( \exp(\sigma_h^2) - 1 \right),$$

which is greater than 0 if $\sigma_h^2$ is positive. Thus, $y_t$ has a leptokurtic, symmetric distribution.
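The moment formulas can be checked numerically. The sketch below evaluates the even moments $E(y_t^r) = \exp\{(r/2)\mu_h + (r^2/8)\sigma_h^2\}\, r!/(2^{r/2}(r/2)!)$ for Gaussian $\varepsilon_t$ and confirms that the ratio $E(y_t^4)/[E(y_t^2)]^2 - 3$ equals $3(\exp(\sigma_h^2) - 1)$. The function names are hypothetical and chosen only for this illustration.

```python
import math


def sv_even_moment(r, mu_h, sigma2_h):
    """E(y_t^r) for even r in the Gaussian SV model with stationary h_t."""
    if r < 0 or r % 2 != 0:
        raise ValueError("r must be a non-negative even integer")
    # E(eps^r) for eps ~ N(0, 1): r! / (2^(r/2) * (r/2)!)
    normal_moment = math.factorial(r) / (2 ** (r // 2) * math.factorial(r // 2))
    # E(exp(h_t)^(r/2)) from the log-normal moment formula with j = r/2
    lognormal_moment = math.exp(0.5 * r * mu_h + r * r * sigma2_h / 8.0)
    return lognormal_moment * normal_moment


def sv_excess_kurtosis(sigma2_h):
    """Excess kurtosis 3 * (exp(sigma_h^2) - 1) of y_t."""
    return 3.0 * (math.exp(sigma2_h) - 1.0)


if __name__ == "__main__":
    mu_h, sigma2_h = 0.0, 0.1 / (1.0 - 0.9 ** 2)
    m2 = sv_even_moment(2, mu_h, sigma2_h)
    m4 = sv_even_moment(4, mu_h, sigma2_h)
    print(m4 / m2 ** 2 - 3.0, sv_excess_kurtosis(sigma2_h))
```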

The dynamic properties of the SV model appear most clearly if we square $y_t$ and take logarithms, so that

$$\log y_t^2 = h_t + \log \varepsilon_t^2.$$

If $\varepsilon_t$ has a standard normal distribution, then $\log \varepsilon_t^2$ has a log-chi-square distribution with mean $\psi(1/2) + \log 2 \approx -1.2704$ and variance $\pi^2/2 \approx 4.9348$, where $\psi(\cdot)$ is the digamma function. Thus, if we define $\xi_t = \log \varepsilon_t^2 + 1.2704$, then $\xi_t$ is i.i.d. with mean zero and variance $\pi^2/2$, and we may rewrite the model as

$$\log y_t^2 = -1.2704 + h_t + \xi_t.$$
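The two constants can be verified with a few lines of code. This sketch uses the identity $\psi(1/2) = -\gamma_E - 2\log 2$, where $\gamma_E$ is the Euler-Mascheroni constant (hard-coded below because the Python standard library has no digamma function), and then compares the closed-form values with a small Monte Carlo experiment. The function names and the sample size are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded


def log_chi2_1_moments():
    """Closed-form mean and variance of log(eps^2) for eps ~ N(0, 1)."""
    # psi(1/2) + log 2, using psi(1/2) = -EULER_GAMMA - 2 log 2
    mean = -EULER_GAMMA - math.log(2.0)
    var = math.pi ** 2 / 2.0  # trigamma function evaluated at 1/2
    return mean, var


def monte_carlo_log_chi2_1(n=100_000, seed=1):
    """Sample mean and variance of log(eps^2) from n standard normal draws."""
    rng = random.Random(seed)
    draws = [math.log(rng.gauss(0.0, 1.0) ** 2) for _ in range(n)]
    m = sum(draws) / n
    v = sum((d - m) ** 2 for d in draws) / (n - 1)
    return m, v


if __name__ == "__main__":
    print("closed form:", log_chi2_1_moments())
    print("Monte Carlo:", monte_carlo_log_chi2_1())
```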

Therefore, it follows that $\log y_t^2$ is a linear process which is the sum of the AR(1) process $h_t$ and white noise. Hence, $\log y_t^2$ behaves approximately as an ARMA(1, 1) process, with its ACF being equivalent to that of an ARMA(1, 1) process and given by

$$\rho_{\log y_t^2}(\tau) = \frac{\phi^\tau}{1 + \pi^2/(2\sigma_h^2)}, \qquad \tau = 1, 2, \ldots$$

The ACF of the powers of the absolute values of $y_t$ is also available; the ACF of $y_t^2$ is approximately proportional to that of the AR(1) process $h_t$.
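As a small numerical illustration, the ACF of $\log y_t^2$ can be evaluated directly from $\phi$ and $\sigma_\eta^2$. In the sketch below the function name and the optional `sigma2_xi` argument are assumptions made for illustration; the default $\pi^2/2$ is the variance of $\xi_t$ under Gaussian $\varepsilon_t$, and passing a different value would cover other measurement-noise variances.

```python
import math


def acf_log_y2(tau, phi, sigma2_eta, sigma2_xi=math.pi ** 2 / 2.0):
    """ACF of log(y_t^2) at lag tau >= 1: phi^tau / (1 + sigma_xi^2 / sigma_h^2)."""
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    sigma2_h = sigma2_eta / (1.0 - phi ** 2)  # stationary variance of h_t
    return phi ** tau / (1.0 + sigma2_xi / sigma2_h)


if __name__ == "__main__":
    for lag in (1, 2, 5, 10):
        print(lag, round(acf_log_y2(lag, phi=0.9, sigma2_eta=0.1), 4))
```

For $\phi = 0.9$ and $\sigma_\eta^2 = 0.1$ the autocorrelations are small in level because the noise variance $\pi^2/2$ is large relative to $\sigma_h^2$, but they decay slowly at the geometric rate $\phi$.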

When the errors $\varepsilon_t$ have a Student t-distribution with $\nu$ degrees of freedom, $y_t$ is again white noise if and only if the process $h_t$ is stationary. If $\varepsilon_t \sim t_\nu$, then

$$\operatorname{Var}(\varepsilon_t) = E(\varepsilon_t^2) = \frac{\nu}{\nu - 2}, \quad \nu > 2, \qquad E(\varepsilon_t^4) = \frac{3\nu^2}{(\nu - 2)(\nu - 4)}, \quad \nu > 4.$$

Hence, it follows immediately that the unconditional variance of $y_t$ generalizes in this case to

$$\operatorname{Var}(y_t) = E(y_t^2) = \frac{\nu}{\nu - 2} \exp\left\{ \mu_h + \tfrac{1}{2}\sigma_h^2 \right\}, \quad \nu > 2.$$

For $\nu > 4$ the kurtosis is $3(\nu - 2)\exp(\sigma_h^2)/(\nu - 4)$, so the excess kurtosis is $3\left[ (\nu - 2)\exp(\sigma_h^2)/(\nu - 4) - 1 \right]$.

The SV model with $\varepsilon_t \sim t_\nu$ can also be transformed into a linear form. Let $\varepsilon_t = \zeta_t \kappa_t^{-1/2}$, where $\zeta_t \sim N(0, 1)$ and $\nu\kappa_t$ is independent of $\zeta_t$ and has a chi-square distribution with $\nu$ degrees of freedom. Therefore,

$$\log \varepsilon_t^2 = \log \zeta_t^2 - \log \kappa_t,$$

and it follows that $E(\log \kappa_t) = \psi(\nu/2) - \log(\nu/2)$ and $\operatorname{Var}(\log \kappa_t) = \psi'(\nu/2)$, with $\psi(\cdot)$ and $\psi'(\cdot)$ the digamma and trigamma functions, respectively.

Therefore, if we now define

$$\xi_t = \left( \log \zeta_t^2 + 1.2704 \right) - \left( \log \kappa_t - \psi(\nu/2) + \log(\nu/2) \right),$$

then $\xi_t$ is i.i.d. with mean zero and variance $\pi^2/2 + \psi'(\nu/2)$. Squaring $y_t$ and taking logarithms gives

$$\log y_t^2 = -1.2704 - \psi(\nu/2) + \log(\nu/2) + h_t + \xi_t,$$

which is again a linear process that adds the i.i.d. $\xi_t$ to the AR(1) process $h_t$. The ACF is

$$\rho_{\log y_t^2}(\tau) = \frac{\phi^\tau}{1 + \left[ \psi'(\nu/2) + \pi^2/2 \right]/\sigma_h^2}, \qquad \tau = 1, 2, \ldots$$

Inference for the Stochastic Volatility Model

The distribution of $y_t$ conditional on past information $\Psi_{t-1}$ does not possess an analytic expression, so no closed form exists for the densities $p(y_t \mid \Psi_{t-1})$ and the likelihood function is hard to evaluate. One way of deriving the likelihood is by integrating the latent log-volatilities out of the joint probability distribution. In particular, denote by $y = (y_1, \ldots, y_T)^\top$ the vector of observations for $T$ consecutive periods, by $h = (h_1, \ldots, h_T)^\top$ the vector of the corresponding log-volatilities, and let $\theta = (\gamma, \phi, \sigma_\eta^2)$.

Then, the likelihood is given by

$$L(y; \theta) = \int p(y, h \mid \theta)\, dh = \int p(y \mid h, \theta)\, p(h \mid \theta)\, dh.$$

This last integral has dimension equal to the sample size $T$. Its evaluation requires numerical procedures, which makes estimation of the hyperparameters $\theta$ by maximum likelihood quite involved.

Generalized Method-of-Moments (GMM): The simplest estimation procedure for SV models is the method of moments. The key advantage of GMM is that it does not require the specification of the likelihood function; only certain moment conditions are needed. Given a sample $y$ of size $T$, the GMM procedure requires the construction of a vector $g$ whose elements are the differences between the sample moments and the corresponding unconditional expectations. For the SV model there are three parameters to estimate, namely $\theta = (\gamma, \phi, \sigma_\eta^2)$, and a large number of moments to use.

For example, we might define the estimating function $g$ with components

$$g = \begin{pmatrix} \frac{1}{T}\sum_t y_t^2 - E(y_t^2) \\ \frac{1}{T}\sum_t y_t^4 - E(y_t^4) \\ \frac{1}{T}\sum_t y_t^2 y_{t-1}^2 - E(y_t^2 y_{t-1}^2) \\ \vdots \\ \frac{1}{T}\sum_t y_t^2 y_{t-\tau}^2 - E(y_t^2 y_{t-\tau}^2) \end{pmatrix},$$

where the theoretical values of $E(y_t^2 y_{t-\tau}^2)$, for $\tau \geq 1$, can be found analytically.

The objective function to be minimized is then

$$Q = g^\top W g,$$

where $W$ is a $(\tau + 2) \times (\tau + 2)$ positive definite, symmetric weighting matrix. The great advantage of the GMM method is simplicity; the main disadvantage is that it is typically inefficient in small samples, although GMM estimators are consistent and asymptotically normal even when the residual errors are non-Gaussian. Furthermore, GMM is consistent if the observations $y_t$ are stationary. When the persistence in the latent process $h_t$ is high, i.e. $\phi$ is close to unity, as is usually the case in practice, the GMM estimator works poorly.
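A GMM objective of this kind can be sketched in a few lines. The code below is only an illustration under stated assumptions: it uses the moments $E(y_t^2)$, $E(y_t^4)$ and $E(y_t^2 y_{t-k}^2)$ for $k = 1, \ldots, \tau$; the cross moment $E(y_t^2 y_{t-k}^2) = \exp\{2\mu_h + \sigma_h^2(1 + \phi^k)\}$ is not stated in the slides and is derived here from the log-normal property of $\exp(h_t + h_{t-k})$; and the identity matrix is used as a default weighting matrix purely for convenience. All function names are hypothetical.

```python
import math


def sv_theoretical_moments(gamma, phi, sigma2_eta, max_lag):
    """[E(y^2), E(y^4), E(y_t^2 y_{t-1}^2), ..., E(y_t^2 y_{t-max_lag}^2)]."""
    mu_h = gamma / (1.0 - phi)
    s2_h = sigma2_eta / (1.0 - phi ** 2)
    moments = [math.exp(mu_h + 0.5 * s2_h), 3.0 * math.exp(2.0 * mu_h + 2.0 * s2_h)]
    # Cross moments: assumed formula derived from the log-normal distribution
    moments += [math.exp(2.0 * mu_h + s2_h * (1.0 + phi ** k)) for k in range(1, max_lag + 1)]
    return moments


def sv_sample_moments(y, max_lag):
    """Sample counterparts of the moments above, each divided by T."""
    T = len(y)
    y2 = [v * v for v in y]
    moments = [sum(y2) / T, sum(v * v for v in y2) / T]
    for k in range(1, max_lag + 1):
        moments.append(sum(y2[t] * y2[t - k] for t in range(k, T)) / T)
    return moments


def gmm_objective(theta, y, max_lag, W=None):
    """Q = g' W g; W defaults to the identity (an illustrative choice)."""
    gamma, phi, sigma2_eta = theta
    theory = sv_theoretical_moments(gamma, phi, sigma2_eta, max_lag)
    sample = sv_sample_moments(y, max_lag)
    g = [s - m for s, m in zip(sample, theory)]
    if W is None:
        return sum(gi * gi for gi in g)
    n = len(g)
    return sum(g[i] * W[i][j] * g[j] for i in range(n) for j in range(n))
```

In practice `gmm_objective` would be passed to a numerical optimizer over $\theta$, and a data-dependent weighting matrix would usually replace the identity.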

There are disadvantages:

- Estimates can be substantially biased, especially for $\sigma_\eta^2$, and have large mean squared errors (MSE) when there is high persistence and a low coefficient of variation, $\text{C.V.} = \operatorname{Var}(\exp(h_t)) / \{E(\exp(h_t))\}^2 = \exp(\sigma_h^2) - 1$.
- GMM parameter estimates are not invariant to reparameterization: if $\psi = f(\theta)$, then in general $\hat\psi \neq f(\hat\theta)$.
- GMM estimation does not deliver filtered or smoothed estimates of $h_t$.

Quasi-Maximum Likelihood (QML): The QML method is based on the linearization of the SV model obtained by squaring $y_t$ and taking logarithms. Assuming that the errors $\varepsilon_t \sim N(0, 1)$ and denoting $w_t = \log y_t^2$, as has already been seen, the SV model can be written as

$$w_t = -1.2704 + h_t + \xi_t, \qquad h_t = \gamma + \phi h_{t-1} + \eta_t,$$

where $\xi_t = \log \varepsilon_t^2 - E(\log \varepsilon_t^2)$, with $\sigma_\xi^2 = \operatorname{Var}(\xi_t) = \pi^2/2$. This is a linear but non-Gaussian state-space model. The QML approach treats the observation errors $\xi_t$ as though they were i.i.d. $N(0, \pi^2/2)$ and applies the standard Kalman filter.

The Kalman filter produces one-step-ahead forecasts of the observations $w_t$ and of the log-volatilities $h_t$, as well as filtered estimates of the latter. Given a set of observations $\{y_1, \ldots, y_T\}$, or equivalently $\{\log y_1^2, \ldots, \log y_T^2\}$, the recursions can also be used to construct the Gaussian likelihood of the data via the prediction error decomposition. If this Gaussian form of the likelihood is then maximized with respect to the hyperparameters of the model, typically using numerical procedures, it yields QML estimates of the unknown parameters. Before describing the method in more detail, we make one more simplifying transformation of the model.

Assume that $|\phi| < 1$, so that $h_t$ is stationary. Taking expectations on both sides of the observation equation, we obtain

$$E(w_t) = \gamma^* = -1.2704 + \mu_h = -1.2704 + \gamma/(1 - \phi).$$

Moreover, if we let $w_t^* = w_t - \gamma^*$ be the new observations centered around their unconditional mean and $\alpha_t = h_t - \mu_h$ be the mean-centered states, then the model can be rewritten as

$$w_t^* = \alpha_t + \xi_t, \qquad \alpha_t = \phi\alpha_{t-1} + \eta_t.$$

The latter state-space model does not explicitly contain the constant term $\gamma$ of the state-transition equation. A consistent estimator of $\gamma^*$ is given by the sample mean of $w_t$, or equivalently of $\log y_t^2$, and this is also the QML estimator of $\gamma^*$. Therefore, by applying this last transformation to the SV model, we have managed to concentrate the parameter $\gamma$ out of the likelihood; we can apply the Kalman filter to the model with the mean-centered observations and obtain the QML estimates of $\theta = (\phi, \sigma_\eta^2)^\top$. Once the estimates $\hat\phi$ and $\hat\sigma_\eta^2$ are available, the QML estimator of $\gamma$ is given by

$$\hat\gamma = (1 - \hat\phi)\left( 1.2704 + \frac{1}{T}\sum_{t=1}^{T} \log y_t^2 \right).$$
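The centering step and the recovery of $\gamma$ can be sketched as follows. The function names are hypothetical, and the sketch assumes that no observation is exactly zero, since $\log y_t^2$ is then undefined.

```python
import math

LOG_CHI2_MEAN = -1.2704  # E[log eps_t^2] for eps_t ~ N(0, 1), as used in the text


def centre_log_squares(y):
    """Return (w_star, gamma_star_hat): centred log-squares and their sample mean.

    Assumes every y_t is non-zero; math.log raises ValueError otherwise.
    """
    w = [math.log(v * v) for v in y]
    gamma_star_hat = sum(w) / len(w)          # estimator of gamma* = E(w_t)
    w_star = [x - gamma_star_hat for x in w]  # mean-centred observations
    return w_star, gamma_star_hat


def gamma_hat(phi_hat, gamma_star_hat):
    """gamma_hat = (1 - phi_hat) * (1.2704 + sample mean of log y_t^2)."""
    return (1.0 - phi_hat) * (gamma_star_hat - LOG_CHI2_MEAN)
```

The list `w_star` is what a Kalman filter for the centered model would take as its observations, and `gamma_hat` would be evaluated once an estimate of $\phi$ is available.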

The Kalman filter recursions then compute the one-step-ahead predictions, $a_{t|t-1}$, and the filtered estimates, $a_{t|t}$, of the unobserved states $\alpha_t$, assuming that the observations become available sequentially in the usual way. Initializing with

$$a_{0|0} = E(\alpha_t) = 0, \qquad P_{0|0} = \operatorname{Var}(\alpha_t) = \sigma_\eta^2/(1 - \phi^2),$$

the one-step-ahead prediction estimates of $\alpha_t$ and their mean square errors (MSEs) are respectively given by

$$a_{t|t-1} = \phi a_{t-1|t-1}, \qquad P_{t|t-1} = \phi^2 P_{t-1|t-1} + \sigma_\eta^2, \qquad t = 1, \ldots, T,$$

while the filtered estimates, $a_{t|t}$, and their MSEs, $P_{t|t}$, are respectively given by

$$a_{t|t} = a_{t|t-1} + P_{t|t-1} f_t^{-1}\left( w_t^* - a_{t|t-1} \right), \qquad P_{t|t} = P_{t|t-1} - P_{t|t-1}^2 f_t^{-1}, \qquad t = 1, \ldots, T,$$

where the terms $w_t^* - a_{t|t-1}$ are the innovations in predicting $w_t^*$ given past observations $\{w_{t-1}^*, \ldots, w_1^*\}$ and $f_t = P_{t|t-1} + \sigma_\xi^2$ are the MSEs of the one-step-ahead prediction estimates of $w_t^*$. Due to non-Gaussianity, the predicted and filtered estimators $a_{t|t-1}$ and $a_{t|t}$ are only minimum mean square linear estimators (MMSLEs) of the unobserved variable $\alpha_t$, given the observations up to time $t - 1$ and $t$ respectively; they are optimal in the class of linear estimators.

A Gaussian (quasi) log-likelihood can be constructed as

$$l_q(\theta; w^*) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T} \log f_t - \frac{1}{2}\sum_{t=1}^{T} \frac{\left( w_t^* - a_{t|t-1} \right)^2}{f_t}.$$

The resulting QML estimators of $\theta$ are consistent and asymptotically normally distributed. The backward recursions produce the smoothed estimates $a_{t|T}$ of $\alpha_t$ along with their MSEs $P_{t|T}$:

$$a_{t|T} = a_{t|t} + \phi P_{t|t} P_{t+1|t}^{-1}\left( a_{t+1|T} - \phi a_{t|t} \right),$$

$$P_{t|T} = P_{t|t} + \phi^2 P_{t|t}^2 P_{t+1|t}^{-2}\left( P_{t+1|T} - P_{t+1|t} \right), \qquad t = T - 1, \ldots, 1.$$
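The forward recursions and the quasi log-likelihood can be sketched in plain Python as follows. This is a minimal illustration assuming $|\phi| < 1$ and mean-centered observations $w_t^*$; the function name `sv_qml_loglik` and its return format are choices made for this example, and the backward smoothing pass is omitted for brevity.

```python
import math


def sv_qml_loglik(w_star, phi, sigma2_eta, sigma2_xi=math.pi ** 2 / 2.0):
    """Gaussian quasi log-likelihood of the centred model via the Kalman filter.

    Returns (loglik, filtered), where filtered[t] is a_{t|t}.
    Assumes |phi| < 1 so that the stationary initialisation exists.
    """
    a = 0.0                                  # a_{0|0}
    P = sigma2_eta / (1.0 - phi ** 2)        # P_{0|0}
    loglik = 0.0
    filtered = []
    for w in w_star:
        a_pred = phi * a                     # a_{t|t-1}
        P_pred = phi * phi * P + sigma2_eta  # P_{t|t-1}
        f = P_pred + sigma2_xi               # MSE of the one-step forecast of w_t*
        v = w - a_pred                       # innovation
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        a = a_pred + P_pred * v / f          # a_{t|t}
        P = P_pred - P_pred * P_pred / f     # P_{t|t}
        filtered.append(a)
    return loglik, filtered


if __name__ == "__main__":
    ll, a_filt = sv_qml_loglik([0.5, -1.0, 2.0, 0.3], phi=0.9, sigma2_eta=0.1)
    print(ll, a_filt)
```

QML estimates of $(\phi, \sigma_\eta^2)$ would then be obtained by maximizing the returned log-likelihood numerically; the `sigma2_xi` argument allows the same sketch to be reused with a different measurement-noise variance.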

The QML procedure can also be applied to the SV model when $\phi$ is set equal to one and the log-volatilities are allowed to follow a random walk. When $\phi = 1$, the state-transition equation becomes $\alpha_t = \alpha_{t-1} + \eta_t$, and the linearized SV model becomes a random walk plus noise model for $w_t$ in which the only unknown parameter is $\sigma_\eta^2$. The Kalman filter recursions then need to be initialized with a diffuse prior for $\alpha_1$, by setting $P_{1|0} = \kappa$, where $\kappa$ is some large positive constant, and $a_{1|0}$ to an arbitrary constant.

QML estimation is not restricted to the case $\varepsilon_t \sim N(0, 1)$; with minor modifications it can also be used to estimate an SV model with $\varepsilon_t \sim t_\nu$. As before, if $|\phi| < 1$ and $\varepsilon_t \sim t_\nu$, let $\varepsilon_t = \zeta_t \kappa_t^{-1/2}$, with $\nu\kappa_t \sim \chi_\nu^2$ independent of $\zeta_t \sim N(0, 1)$. This results in a linear model for $w_t = \log y_t^2$ with $E(\xi_t) = 0$ and $\sigma_\xi^2 = \operatorname{Var}(\xi_t) = \pi^2/2 + \psi'(\nu/2)$. In addition, $w_t^*$ is obtained from $w_t$ by subtracting the unconditional mean $\gamma^*$, which in this case is given by

$$\gamma^* = -1.2704 - \psi(\nu/2) + \log(\nu/2) + \gamma/(1 - \phi),$$

and thus the state-space form of the model is the same as above.

The QML procedure is inefficient compared to ML, as it approximates the density of a $\log(\chi_1^2)$ variable by a normal density. A comparison of these densities (below) illustrates that this approximation is rather inappropriate; the adequacy of the approximation depends critically on the true parameter values. For large values of $\sigma_\eta^2$, the AR(1) process $h_t$ dominates $\xi_t$, the non-Gaussian error term in the observation equation, so the normal approximation may be adequate and the QML approach is close to optimal. However, as $\sigma_\eta^2$ decreases, the approximation worsens, and for the small values of $\sigma_\eta^2$ usually found in practice, the QML estimates can be extremely biased and have high root mean square error.

[Figure: density of a $\log(\chi_1^2)$ variable compared with its normal approximation; horizontal axis from -10 to 10, vertical axis from 0 to 0.2.]

Example: We simulate a sample of 1,000 values from an SV model with parameters $\gamma = 0$, so that $\mu_h = \gamma/(1 - \phi) = 0$, $\phi = 0.9$ and $\sigma_\eta^2 = 0.1$. The size of the sample is typical for financial data, as are the chosen parameter values. A plot of the likelihood function over a range of values of $\phi$ and $\sigma_\eta^2$ shows that it is rather flat. For this reason, and to avoid the convergence difficulties usually encountered with some numerical optimization procedures, we use a stochastic optimization algorithm (simulated annealing) to find an approximate maximum of the quasi-likelihood function.

[Figure: two panels, $\phi$ (vertical axis from 0.88 to 0.94) and $\sigma_\eta^2$ (vertical axis from 0.06 to 0.16), each plotted over a horizontal axis from 0 to 1200.]

[Figure: Volatility Estimates. Simulated and QML series plotted for $t$ from 0 to 1000; vertical axis from -2.5 to 2.5.]

Inliers: A drawback of the QML procedure worth noting is the so-called inlier problem, encountered when taking logarithms of very small numbers. In particular, when the asset returns $y_t$ are close to zero, $\log y_t^2$ is a large negative number, and in the extreme case where $y_t = 0$, $\log y_t^2$ is not defined. Instead of transforming to $w_t = \log y_t^2$, it is possible to work with the series

$$\omega_t = \log\left( y_t^2 + \delta s^2 \right) - \frac{\delta s^2}{y_t^2 + \delta s^2},$$

where $s^2$ is the sample variance of $y_t$ and $\delta$ is a small user-specified constant.
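The inlier-robust transformation is straightforward to code. In the sketch below the function name, the default `delta=0.02`, and the use of the divisor $T$ in the sample variance are assumptions made for illustration; the text only specifies that $\delta$ is small and user-chosen.

```python
import math


def inlier_adjusted_log_squares(y, delta=0.02):
    """omega_t = log(y_t^2 + delta*s^2) - delta*s^2 / (y_t^2 + delta*s^2).

    s^2 is the sample variance of y (divisor T, an illustrative choice).
    Unlike log(y_t^2), the result is finite even when y_t == 0,
    provided the sample variance is positive.
    """
    T = len(y)
    mean = sum(y) / T
    s2 = sum((v - mean) ** 2 for v in y) / T
    c = delta * s2
    if c <= 0.0:
        raise ValueError("delta * s^2 must be positive")
    return [math.log(v * v + c) - c / (v * v + c) for v in y]


if __name__ == "__main__":
    print(inlier_adjusted_log_squares([0.0, 1.0, -1.0]))
```

For observations that are large relative to $\delta s^2$ the transformed value stays close to $\log y_t^2$, while an exact zero is mapped to the finite value $\log(\delta s^2) - 1$.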

Single-Move MCMC Samplers for the SV Model Multimove MCMC Samplers Session 7: Volatility Modelling 75/ 165

Bayesian Approaches to Inference

After we have collected a set of data $Y$, assumed to have come from a density $p(\cdot \mid \theta)$, we can investigate the distribution of the parameters $\theta$ given $Y$ using Bayes' theorem. In essence, given the data, we update our degree of belief about $\theta$ and obtain a posterior distribution of the parameters, which is denoted $p(\theta \mid Y)$ and is given by

$$p(\theta \mid Y) = \frac{p(Y \mid \theta)\,\pi(\theta)}{\int_\Theta p(Y \mid \theta)\,\pi(\theta)\, d\theta} \propto p(Y \mid \theta)\,\pi(\theta).$$

In the SV case, we will explore the posterior distribution using Markov chain Monte Carlo (MCMC).

The estimation of the SV model via MCMC exploits the hierarchical structure of its conditional distributions. Let $\theta = (\gamma, \phi, \sigma_\eta^2)^\top$ denote the vector of hyperparameters, $h = (h_1, \ldots, h_T)^\top$ the vector of log-volatilities, and $y = (y_1, \ldots, y_T)^\top$ the vector of observations. The hierarchy is then specified by a sequence of three distributions:

- the distribution of the observations conditional on the log-volatilities, $p(y \mid h)$,
- the distribution of the log-volatilities conditional on the hyperparameters, $p(h \mid \theta)$,
- the prior distribution of the hyperparameters, $p(\theta)$.

The joint posterior distribution of the log-volatilities and hyperparameters is

$$p(h, \theta \mid y) \propto p(y \mid h)\, p(h \mid \theta)\, p(\theta).$$

Gibbs sampler for the SV model

1. Choose arbitrary starting values $h^{(0)}$, $\theta^{(0)}$ and let $i = 0$.
2. Sample $h^{(i+1)} \sim p\left( h \mid y, \theta^{(i)} \right)$.
3. Sample $\theta^{(i+1)} \sim p\left( \theta \mid y, h^{(i+1)} \right)$.
4. Set $i = i + 1$ and go to step 2.
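The control flow of the sampler can be sketched independently of how the two conditional draws are implemented. In the skeleton below, `sample_h` and `sample_theta` are hypothetical user-supplied callables that draw from $p(h \mid y, \theta)$ and $p(\theta \mid y, h)$ respectively (with the data $y$ captured inside them); nothing here is specific to the SV model, and the samplers themselves are not provided.

```python
def gibbs_sampler(sample_h, sample_theta, h0, theta0, n_iter):
    """Generic two-block Gibbs loop.

    sample_h(theta)  -> a draw of h given the current theta (and the data)
    sample_theta(h)  -> a draw of theta given the current h (and the data)
    Both callables are assumptions of this sketch and must be supplied.
    """
    h, theta = h0, theta0        # step 1: starting values
    draws = []
    for _ in range(n_iter):
        h = sample_h(theta)      # step 2: update the latent log-volatilities
        theta = sample_theta(h)  # step 3: update the hyperparameters
        draws.append((h, theta))  # step 4: store the draw and iterate from step 2
    return draws
```

Note that the loop returns to the sampling steps rather than to the initialization, so the starting values are used only once.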

Step 3 of the Gibbs algorithm is relatively simple to implement, but sampling from $p\left( h \mid y, \theta^{(i)} \right)$ in step 2 is not straightforward. Single-move algorithms circumvent this difficult part of the procedure by further decomposing the density $p\left( h \mid y, \theta^{(i)} \right)$ into the conditionals $p\left( h_t \mid h_{\setminus t}^{(i)}, y, \theta^{(i)} \right)$, where

$$h_{\setminus t}^{(i)} = \left( h_1^{(i+1)}, \ldots, h_{t-1}^{(i+1)}, h_{t+1}^{(i)}, \ldots, h_T^{(i)} \right).$$

Step 2 of the Gibbs sampler algorithm becomes:

2a. For $t = 1, \ldots, T$, sample $h_t^{(i+1)} \sim p\left( h_t \mid h_{\setminus t}^{(i)}, y, \theta^{(i)} \right)$.

The common feature of all single-move algorithms is that they exploit the Markovian structure of the log-volatility process:

$$p\left( h_t \mid h_{\setminus t}, y, \theta \right) = p\left( h_t \mid h_{t-1}, h_{t+1}, y_t, \theta \right)$$
$$\propto p(y_t \mid h_t)\, p(h_{t+1} \mid h_t, \theta)\, p(h_t \mid h_{t-1}, \theta),$$

where the second line is deduced from Bayes' theorem.

Rejection Metropolis-Hastings: A first approach to the estimation of the SV model via MCMC was offered in the literature using ideas from non-Gaussian and non-linear state-space modeling. Consider the following parameterization of the SV model, in which $h_t$ now denotes the variance itself rather than its logarithm:

$$y_t = \sqrt{h_t}\,\varepsilon_t, \qquad \log h_t = \gamma + \phi \log h_{t-1} + \eta_t, \qquad t = 1, \ldots, T,$$

where $\varepsilon_t$ and $\eta_t$ are contemporaneously and serially independent random variables with distributions $N(0, 1)$ and $N(0, \sigma_\eta^2)$, respectively.

In the standard model, with $|\phi| < 1$, the logarithm of the latent volatilities follows a stationary, Gaussian AR(1) process, so that

$$\log h_t \mid h_{t-1}, \theta \sim N\left( \gamma + \phi \log h_{t-1}, \sigma_\eta^2 \right),$$

which implies that $h_t \mid h_{t-1}, \theta$ has a log-normal distribution $LN\left( \gamma + \phi \log h_{t-1}, \sigma_\eta^2 \right)$ and, in particular,

$$p(h_t \mid h_{t-1}, \theta) \propto \frac{1}{h_t} \exp\left\{ -\frac{\left( \log h_t - \gamma - \phi \log h_{t-1} \right)^2}{2\sigma_\eta^2} \right\}.$$

In addition, noting that $y_t \mid h_t \sim N(0, h_t)$, it follows that

$$p\left( h_t \mid h_{\setminus t}, y, \theta \right) \propto \frac{1}{h_t^{1/2}} \exp\left\{ -\frac{y_t^2}{2h_t} \right\} \times \frac{1}{h_t} \exp\left\{ -\frac{\left( l_{t+1} - \gamma - \phi l_t \right)^2 + \left( l_t - \gamma - \phi l_{t-1} \right)^2}{2\sigma_\eta^2} \right\},$$

where $l_t = \log h_t$.
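For use inside a single-move sampler, this conditional is usually evaluated on the log scale and only up to an additive constant. The sketch below codes the logarithm of the unnormalized kernel for an interior time point $1 < t < T$; the function name and argument order are assumptions for illustration, and no proposal or accept-reject step is included.

```python
import math


def log_conditional_kernel_h(h_t, h_prev, h_next, y_t, gamma, phi, sigma2_eta):
    """Log of the unnormalised density p(h_t | h_{t-1}, h_{t+1}, y_t, theta).

    Here h_t is the variance (not its logarithm), so y_t | h_t ~ N(0, h_t).
    Valid for interior t only; all of h_t, h_prev, h_next must be positive.
    """
    l_t, l_prev, l_next = math.log(h_t), math.log(h_prev), math.log(h_next)
    # h_t^(-1/2) * exp(-y_t^2 / (2 h_t)) from the observation density,
    # and h_t^(-1) from the log-normal transition density
    log_obs_and_jacobian = -1.5 * l_t - y_t * y_t / (2.0 * h_t)
    # Two Gaussian AR(1) transitions on the log scale: t -> t+1 and t-1 -> t
    quad = (l_next - gamma - phi * l_t) ** 2 + (l_t - gamma - phi * l_prev) ** 2
    return log_obs_and_jacobian - quad / (2.0 * sigma2_eta)
```

Differences of this function at two candidate values of $h_t$ give log density ratios of the kind a Metropolis-type step would need.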