
IMES DISCUSSION PAPER SERIES

Time-Varying Parameter VAR Model with Stochastic Volatility: An Overview of Methodology and Empirical Applications

Jouchi Nakajima

Discussion Paper No. 2011-E-9

INSTITUTE FOR MONETARY AND ECONOMIC STUDIES
BANK OF JAPAN
2-1-1 NIHONBASHI-HONGOKUCHO, CHUO-KU, TOKYO 103-8660 JAPAN

You can download this and other papers at the IMES Web site: http://www.imes.boj.or.jp
Do not reprint or reproduce without permission.

NOTE: IMES Discussion Paper Series is circulated in order to stimulate discussion and comments. Views expressed in Discussion Paper Series are those of authors and do not necessarily reflect those of the Bank of Japan or the Institute for Monetary and Economic Studies.

IMES Discussion Paper Series 2011-E-9
March 2011

Time-Varying Parameter VAR Model with Stochastic Volatility: An Overview of Methodology and Empirical Applications

Jouchi Nakajima*

Abstract

This paper aims to provide a comprehensive overview of the estimation methodology for the time-varying parameter structural vector autoregression (TVP-VAR) with stochastic volatility, in both methodology and empirical applications. The TVP-VAR model, combined with stochastic volatility, enables us to capture possible changes in underlying structure of the economy in a flexible and robust manner. In that respect, as shown in simulation exercises in the paper, the incorporation of stochastic volatility to the TVP estimation significantly improves estimation performance. The Markov chain Monte Carlo (MCMC) method is employed for the estimation of the TVP-VAR models with stochastic volatility. As an example of empirical application, the TVP-VAR model with stochastic volatility is estimated using the Japanese data with significant structural changes in dynamic relationship between the macroeconomic variables.

Keywords: Bayesian inference; Markov chain Monte Carlo; Monetary policy; State space model; Structural vector autoregression; Stochastic volatility; Time-varying parameter

JEL classification: C11, C15, E52

* Institute for Monetary and Economic Studies, Bank of Japan (Currently in the Personnel and Corporate Affairs Department <studying at Duke University>, E-mail: jouchi.nakajima@stat.duke.edu)

The author would like to thank Shigeru Iwata, Han Li, Toshiaki Watanabe, Tomoyoshi Yabu, and the staff of the Institute for Monetary and Economic Studies (IMES), the Bank of Japan, for their useful comments. Views expressed in this paper are those of the author and do not necessarily reflect the official views of the Bank of Japan.

1 Introduction

A vector autoregression (VAR) is a basic tool in econometric analysis with a wide range of applications. Among them, a time-varying parameter VAR (TVP-VAR) model with stochastic volatility, proposed by Primiceri (2005), is broadly used, especially in analyzing macroeconomic issues. The TVP-VAR model enables us to capture a possible time-varying nature of the underlying structure of the economy in a flexible and robust manner. All parameters in the VAR specification are assumed to follow a first-order random walk process, thus allowing both temporary and permanent shifts in the parameters.

Stochastic volatility plays an important role in the TVP-VAR model. The idea of stochastic volatility was originally proposed by Black (1976), followed by numerous developments in financial econometrics (see e.g., Ghysels et al. (2002), Shephard (2005)). In recent years, stochastic volatility has also been incorporated more frequently into empirical analysis in macroeconomics (e.g., Uhlig (1997), Cogley and Sargent (2005), Primiceri (2005)). In many cases, the data generating process of economic variables seems to have drifting coefficients and shocks with stochastic volatility. If that is the case, a model with time-varying coefficients but constant volatility is likely to produce biased estimates of the time-varying coefficients, because it ignores possible variation of the volatility in the disturbances. To avoid this misspecification, stochastic volatility is assumed in the TVP-VAR model. Although stochastic volatility makes the estimation difficult because the likelihood function becomes intractable, the model can be estimated using Markov chain Monte Carlo (MCMC) methods in the context of Bayesian inference.

To illustrate the estimation procedure of the TVP-VAR model, this paper begins by reviewing an estimation algorithm for a time-varying parameter (TVP) regression model with stochastic volatility, which is a univariate case of the TVP-VAR model. Then the paper extends the estimation algorithm to the multivariate case. The paper also provides simulation exercises of the TVP regression model to examine its estimation performance against the possibility of structural changes using simulated data. Such simulation exercises show the important role of stochastic volatility in improving the estimation performance.

Regarding the empirical application of the TVP-VAR model, this paper provides empirical illustrations using Japanese macroeconomic data. The estimation results for standard three-variable models reveal the time-varying structure of the Japanese economy and the Bank of Japan's monetary policy from 1977 to 2007. During the three decades of the sample period, the Japanese economy shows significantly different macroeconomic performance, thus implying the possibility of important structural changes in the economy over time. The time-varying impulse responses show remarkable changes in the relations between the macroeconomic variables.

The paper is organized as follows. In Section 2, the estimation methodology of the TVP regression model is developed. Section 3 illustrates the simulation study of the TVP regression model focusing on stochastic volatility. In Section 4, the model specification, the estimation scheme and the literature survey of the TVP-VAR model are provided. Section 5 presents the empirical results of the TVP-VAR model for Japanese macroeconomic variables. Finally, Section 6 concludes the paper.

[1] In that regard, the estimation performance of the TVP-VAR model differs significantly, depending on whether stochastic volatility is incorporated or not. Thus, we use the expression "TVP-VAR model with stochastic volatility" when the inclusion of stochastic volatility needs to be emphasized. Otherwise, we use just "TVP-VAR model" for simplicity.

2 TVP regression model with stochastic volatility

This section explains the basic estimation methodology of the TVP-VAR model by reviewing an estimation algorithm for a univariate TVP regression model with stochastic volatility.

2.1 Model

Consider the TVP regression model:

(Regression)
$$y_t = x_t'\beta + z_t'\alpha_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_t^2), \quad t = 1, \ldots, n, \tag{1}$$

(Time-varying coefficients)
$$\alpha_{t+1} = \alpha_t + u_t, \quad u_t \sim N(0, \Sigma), \quad t = 0, \ldots, n-1, \tag{2}$$

(Stochastic volatility)
$$\sigma_t^2 = \gamma \exp(h_t), \quad h_{t+1} = \phi h_t + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2), \quad t = 0, \ldots, n-1, \tag{3}$$

where $y_t$ is a scalar response; $x_t$ and $z_t$ are $(k \times 1)$ and $(p \times 1)$ vectors of covariates, respectively; $\beta$ is a $(k \times 1)$ vector of constant coefficients; $\alpha_t$ is a $(p \times 1)$ vector of time-varying coefficients; and $h_t$ is stochastic volatility. We assume that $\alpha_0 = 0$, $u_0 \sim N(0, \Sigma_0)$, $\gamma > 0$, and $h_0 = 0$.

Equation (1) has two sets of covariates: one corresponds to the constant coefficients ($\beta$) and the other to the time-varying coefficients ($\alpha_t$). The effects of $x_t$ on $y_t$ are assumed to be time-invariant, while the regression relations of $z_t$ to $y_t$ are assumed to change over time.

The time-varying coefficients $\alpha_t$ are formulated to follow the first-order random walk process in equation (2). This allows both temporary and permanent shifts in the coefficients. The drifting coefficient is meant to capture a possible non-linearity, such as a gradual change or a structural break. In practice, this assumption implies that the time-varying coefficients may capture not only the true movement but also some spurious movements, because $\alpha_t$ can move freely under the random-walk assumption. In other words, there is a risk that the time-varying coefficients overfit the data if the relations between $z_t$ and $y_t$ are obscure. To avoid such a situation, it might be better to assume stationarity for the time-varying coefficients. For example, each coefficient can be modeled to follow an AR(1) process where the absolute value of the persistence parameter is less than one. However, in this formulation, a structural change or a permanent shift of the coefficient would be difficult to estimate even if it exists. Ultimately, it is important to choose a specification of the time-varying coefficients that suits the data of interest, economic theory and the purpose of the analysis (see e.g., West and Harrison (1997)).

The disturbance of the regression, denoted by $\varepsilon_t$, follows the normal distribution with the time-varying variance $\sigma_t^2$. The log-volatility, $h_t = \log(\sigma_t^2/\gamma)$, is modeled to follow the AR(1) process in equation (3). As with the time-varying coefficients discussed above, the log-volatility can be modeled as either a stationary or a non-stationary process. For the following analysis in this section, we assume that $|\phi| < 1$ and the initial condition is set based on the stationary distribution as $\eta_0 \sim N(0, \sigma_\eta^2/(1-\phi^2))$. In the case of $\phi = 1$, the log-volatility follows the random walk process. The estimation algorithm for the random-walk case requires only a slight modification of the algorithm developed below.[2]

We can consider reduced models in the class of the TVP regression model. If the regression has only constant coefficients (i.e., $z_t'\alpha_t \equiv 0$), the model reduces to a standard (constant-parameter) linear regression model. If we assume that $\sigma_t^2 = \sigma^2$ for $t = 1, \ldots, n$, the model forms the TVP regression model with constant variance.

[2] The estimation algorithm in the case of $\phi = 1$ is provided in the appendix of Nakajima and Teranishi (2009). See also Sekine (2006), Sekine and Teranishi (2008) for investigation of macroeconomic issues using the TVP regression model with random-walk stochastic volatility.
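As a rough illustration of the data generating process in equations (1) to (3), the following Python sketch simulates one path from the TVP regression model. The function name, the uniform covariates, the choice $\Sigma_0 = \Sigma$ for the initial state, and all numerical values in the usage lines are assumptions made here for illustration only; they are not taken from the paper's own design or code.

```python
import numpy as np

def simulate_tvp_regression(n, beta, Sigma, phi, sigma_eta, gamma, rng):
    """Simulate y, x, z, alpha, h from the TVP regression model with SV.

    Row t-1 of each array holds the value at time t = 1, ..., n.
    Assumption (illustrative): Sigma_0 = Sigma, so alpha_1 ~ N(0, Sigma).
    """
    k, p = len(beta), Sigma.shape[0]
    x = rng.uniform(-0.5, 0.5, size=(n, k))  # covariates with constant coefficients
    z = rng.uniform(-0.5, 0.5, size=(n, p))  # covariates with time-varying coefficients
    chol = np.linalg.cholesky(Sigma)

    alpha = np.zeros((n, p))
    h = np.zeros(n)
    alpha[0] = chol @ rng.standard_normal(p)
    # h_1 is drawn from the stationary distribution N(0, sigma_eta^2 / (1 - phi^2)).
    h[0] = rng.normal(0.0, sigma_eta / np.sqrt(1.0 - phi**2))
    for t in range(1, n):
        alpha[t] = alpha[t - 1] + chol @ rng.standard_normal(p)  # random walk, eq. (2)
        h[t] = phi * h[t - 1] + rng.normal(0.0, sigma_eta)       # AR(1), eq. (3)

    sigma2 = gamma * np.exp(h)                                   # sigma_t^2
    y = x @ beta + np.sum(z * alpha, axis=1) + np.sqrt(sigma2) * rng.standard_normal(n)
    return y, x, z, alpha, h

# Hypothetical parameter values, chosen only to exercise the function.
rng = np.random.default_rng(0)
y, x, z, alpha, h = simulate_tvp_regression(
    n=200, beta=np.array([1.0, -1.0]), Sigma=np.diag([0.02, 0.01]),
    phi=0.9, sigma_eta=0.3, gamma=0.1, rng=rng)
```

Setting `Sigma` close to zero approximates the constant-coefficient case, and setting `sigma_eta` close to zero approximates the constant-variance case mentioned above.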

2.2 Estimation methodology

2.2.1 State space model

Regarding $\alpha_t$ and $h_t$ as state variables, the TVP regression forms a state space model. The state space model has been well studied in many fields (see e.g., Harvey (1993), Durbin and Koopman (2002b) for econometric issues). Several methods have been developed to estimate it. For the TVP regression models, if the variance of the disturbance is assumed to be time-invariant (i.e., time-varying coefficients and constant volatility), the parameters are easily estimated using the standard Kalman filter for a linear Gaussian state space model (e.g., West and Harrison (1997)). With stochastic volatility, however, the model forms a non-linear state space model, and maximum likelihood estimation carries a heavy computational burden: the filtering must be repeated many times to evaluate the likelihood function for each set of parameters until the maximum is reached. Therefore, we alternatively take a Bayesian approach using the MCMC method for a precise and efficient estimation of the TVP regression model. It also has a great advantage when the model is extended to the TVP-VAR model, as shown later.

2.2.2 Bayesian inference and MCMC sampling method

The MCMC method has become popular in econometrics. In recent years, a considerable number of works in empirical macroeconomics have employed the MCMC method. The MCMC method is considered in the context of Bayesian inference, and its goal is to assess the joint posterior distribution of the parameters of interest under a certain prior probability density which the researchers set in advance. Given data, we repeatedly sample a Markov chain whose invariant (stationary) distribution is the posterior distribution. There are many ways to construct a Markov chain with this property (e.g., Chib and Greenberg (1996), Chib (2001)).[3]

In Bayesian inference, we specify the prior density, denoted by $\pi(\theta)$, for a vector of the unknown parameters $\theta$. Let $f(y \mid \theta)$ denote the likelihood function for data $y = \{y_1, \ldots, y_n\}$. Inference is then based on the posterior distribution, denoted by $\pi(\theta \mid y)$, which is obtained by the Bayes theorem,

$$\pi(\theta \mid y) = \frac{f(y \mid \theta)\,\pi(\theta)}{\int f(y \mid \theta)\,\pi(\theta)\,d\theta}.$$

[3] Koop (2003) and Lancaster (2003) would be helpful for understanding Bayesian econometrics as a primer. Geweke (2005), and Gamerman and Lopes (2006) cover more comprehensive theories and practices of the MCMC method.

In principle, the prior information concerning $\theta$ is updated by observing the data $y$. The quantity $m(y) = \int f(y \mid \theta)\,\pi(\theta)\,d\theta$ is called the normalizing constant or marginal distribution. In the case where the likelihood function or the normalizing constant is intractable, the posterior distribution does not have a closed form. To overcome this difficulty, many computational methods have been developed for sampling from the posterior distribution. Among them, the MCMC sampling methods are popular and powerful algorithms which enable us to sample from the posterior distribution without computing the normalizing constant.

The MCMC algorithm proceeds by sampling recursively from the conditional posterior distributions, where the most recent values of the conditioning parameters are used in the simulation. The Gibbs sampler is one of the well-known MCMC methods. Consider a vector of unknown parameters $\theta = (\theta_1, \ldots, \theta_p)$. The procedure is constructed as follows:

1. Choose an arbitrary starting point $\theta^{(0)} = (\theta_1^{(0)}, \ldots, \theta_p^{(0)})$, and set $i = 0$.
2. Given $\theta^{(i)} = (\theta_1^{(i)}, \ldots, \theta_p^{(i)})$,
   (a) generate $\theta_1^{(i+1)}$ from the conditional posterior distribution $\pi(\theta_1^{(i+1)} \mid \theta_2^{(i)}, \ldots, \theta_p^{(i)})$,
   (b) generate $\theta_2^{(i+1)}$ from $\pi(\theta_2^{(i+1)} \mid \theta_1^{(i+1)}, \theta_3^{(i)}, \ldots, \theta_p^{(i)})$,
   (c) generate $\theta_3^{(i+1)}$ from $\pi(\theta_3^{(i+1)} \mid \theta_1^{(i+1)}, \theta_2^{(i+1)}, \theta_4^{(i)}, \ldots, \theta_p^{(i)})$,
   (d) generate $\theta_4^{(i+1)}, \ldots, \theta_p^{(i+1)}$ in the same way.
3. Set $i = i + 1$, and go to Step 2.

These draws can be used as the basis for making inferences by appealing to suitable ergodic theorems for Markov chains.

For the estimation of the TVP regression model, there are several reasons to use Bayesian inference and the MCMC sampling method. First, the likelihood function is intractable because the model includes the non-linear state equations of stochastic volatility, which precludes the maximum likelihood estimation method. Also, we cannot assess the normalizing constant, and therefore the posterior distribution, analytically. Second, using the MCMC method, not only the parameters $\theta \equiv (\beta, \Sigma, \phi, \sigma_\eta, \gamma)$ but also the state variables $\alpha = \{\alpha_1, \ldots, \alpha_n\}$ and $h = \{h_1, \ldots, h_n\}$ are sampled simultaneously, so we can make inference for the state variables while accounting for the uncertainty of the parameters $\theta$. Third, we can estimate a function of the parameters, such as an impulse response function, with the uncertainty of the parameters $\theta$ taken into consideration, by using the sample drawn through the MCMC procedure.
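To make the Gibbs recursion concrete, the sketch below applies it to a toy target that is not part of the paper: a bivariate standard normal with correlation `rho`, for which both full conditionals are known normal distributions. The function name and all numerical settings are assumptions chosen for illustration.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_draws, burn_in, rng):
    """Gibbs sampler for (theta1, theta2) ~ N(0, [[1, rho], [rho, 1]]).

    Full conditionals: theta1 | theta2 ~ N(rho * theta2, 1 - rho^2), and symmetrically.
    """
    theta1, theta2 = 0.0, 0.0                # arbitrary starting point (Step 1)
    cond_sd = np.sqrt(1.0 - rho**2)
    draws = np.empty((n_draws, 2))
    for i in range(burn_in + n_draws):
        # Step 2: each block is drawn given the most recent value of the other.
        theta1 = rng.normal(rho * theta2, cond_sd)
        theta2 = rng.normal(rho * theta1, cond_sd)
        if i >= burn_in:                     # keep draws only after the burn-in period
            draws[i - burn_in] = (theta1, theta2)
    return draws

rng = np.random.default_rng(0)
draws = gibbs_bivariate_normal(rho=0.8, n_draws=20000, burn_in=1000, rng=rng)
print("sample means:", draws.mean(axis=0))
print("sample correlation:", np.corrcoef(draws.T)[0, 1])
```

The same pattern carries over to the TVP regression model, where the blocks are the parameters and state variables and each conditional draw uses the latest values of the others.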

2.3 MCMC algorithm for the TVP regression model

For the TVP regression model, specifying the prior density as $\pi(\theta)$, we obtain the posterior distribution $\pi(\theta, \alpha, h \mid y)$.[4] There are several ways to implement the MCMC algorithm to explore this posterior distribution; we develop the implementation using the following algorithm:

MCMC algorithm for the TVP regression model

1. Initialize $\theta$, $\alpha$ and $h$.
2. Sample $\beta \mid \gamma, \alpha, h, y$.
3. Sample $\alpha \mid \beta, \Sigma, \gamma, h, y$.
4. Sample $\Sigma \mid \alpha$.
5. Sample $h \mid \beta, \gamma, \phi, \sigma_\eta, \alpha, y$.
6. Sample $\phi \mid \sigma_\eta, h$.
7. Sample $\sigma_\eta \mid \phi, h$.
8. Sample $\gamma \mid \beta, \alpha, h, y$.
9. Go to 2.

[4] Appendix A.1 provides the functional form of the joint posterior distribution.

The details of the procedure are illustrated as follows.

Sample β

We specify the prior for $\beta$ as $\beta \sim N(\beta_0, B_0)$. We explore the conditional posterior density of $\beta$ given by

$$\pi(\beta \mid \gamma, \alpha, h, y) \propto \exp\left\{-\frac{1}{2}(\beta - \beta_0)' B_0^{-1} (\beta - \beta_0)\right\} \times \exp\left\{-\sum_{t=1}^{n} \frac{(y_t - x_t'\beta - z_t'\alpha_t)^2}{2\gamma e^{h_t}}\right\} \propto \exp\left\{-\frac{1}{2}(\beta - \hat\beta)' \hat{B}^{-1} (\beta - \hat\beta)\right\},$$

where

$$\hat{B} = \left(B_0^{-1} + \sum_{t=1}^{n} \frac{x_t x_t'}{\gamma e^{h_t}}\right)^{-1}, \quad \hat\beta = \hat{B}\left(B_0^{-1}\beta_0 + \sum_{t=1}^{n} \frac{x_t \hat{y}_t}{\gamma e^{h_t}}\right),$$

and $\hat{y}_t = y_t - z_t'\alpha_t$ for $t = 1, \ldots, n$. The conditional posterior density is proportional to the kernel of the normal distribution whose mean and variance are $\hat\beta$ and $\hat{B}$, respectively. Then, we draw a sample as $\beta \mid \gamma, \alpha, h, y \sim N(\hat\beta, \hat{B})$.

Sample α

We consider how to sample $\alpha$ from its conditional posterior distribution. Regarding $\alpha$ as the state variable, the model given by (1) and (2) forms a linear Gaussian state space model. Given the parameters $(\beta, \Sigma, \gamma, h)$, a primitive way to sample $\alpha$ is to assess the conditional posterior density of $\alpha_t$ given $(\beta, \Sigma, \gamma, h, y, \alpha_{\backslash t})$, where $\alpha_{\backslash t}$ is $\alpha$ excluding $\alpha_t$, i.e., $\alpha_{\backslash t} = (\alpha_1, \ldots, \alpha_{t-1}, \alpha_{t+1}, \ldots, \alpha_n)$. This way of sampling is often called a single-move sampler. The single-move sampler is quite simple, but inefficient in the sense that the autocorrelation of the MCMC sample often becomes extremely high. For instance, after $\alpha_t$ is sampled given $\alpha_{\backslash t}$ (including $\alpha_{t+1}$), $\alpha_{t+1}$ is sampled given $\alpha_{\backslash (t+1)}$ (including $\alpha_t$, which has just been drawn). The recursive chain depending on both sides of the sampled state variable yields an undesirably high autocorrelation. If the MCMC sample has a high autocorrelation, the convergence of the Markov chain is slow and inference requires a considerably larger number of samples.

To reduce the sample autocorrelation for $\alpha$, we introduce the simulation smoother developed by de Jong and Shephard (1995) and Durbin and Koopman (2002a). It enables us to sample $\alpha$ simultaneously from the conditional posterior distribution $\pi(\alpha \mid \beta, \Sigma, \gamma, h, y)$, which can reduce the autocorrelation of the MCMC sample.

Following de Jong and Shephard (1995), we show the algorithm of the simulation smoother on the state space model

$$y_t = X_t\beta + Z_t\alpha_t + G_t u_t, \quad t = 1, \ldots, n,$$
$$\alpha_{t+1} = T_t\alpha_t + H_t u_t, \quad t = 0, \ldots, n-1, \tag{4}$$

where $\alpha_0 = 0$, $u_t \sim N(0, I)$, and $G_t H_t' = O$. The simulation smoother draws $\eta = (\eta_0, \ldots, \eta_n) \sim \pi(\eta \mid \omega, y)$, where $\eta_t = H_t u_t$ for $t = 0, \ldots, n$, and $\omega$ denotes all the parameters in the model.

We initialize $a_1 = 0$, $P_1 = H_0 H_0'$, and recursively run the Kalman filter:

$$e_t = y_t - X_t\beta - Z_t a_t, \quad D_t = Z_t P_t Z_t' + G_t G_t', \quad K_t = T_t P_t Z_t' D_t^{-1},$$
$$L_t = T_t - K_t Z_t, \quad a_{t+1} = T_t a_t + K_t e_t, \quad P_{t+1} = T_t P_t L_t' + H_t H_t',$$

for $t = 1, \ldots, n$. Then, letting $r_n = U_n = 0$ and $\Lambda_t = H_t H_t'$, we run the simulation smoother:

$$C_t = \Lambda_t - \Lambda_t U_t \Lambda_t, \quad \eta_t = \Lambda_t r_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, C_t), \quad V_t = \Lambda_t U_t L_t,$$
$$r_{t-1} = Z_t' D_t^{-1} e_t + L_t' r_t - V_t' C_t^{-1} \varepsilon_t,$$
$$U_{t-1} = Z_t' D_t^{-1} Z_t + L_t' U_t L_t + V_t' C_t^{-1} V_t,$$

for $t = n, n-1, \ldots, 1$. For the initial state, we draw $\eta_0 = \Lambda_0 r_0 + \varepsilon_0$, $\varepsilon_0 \sim N(0, C_0)$ with $C_0 = \Lambda_0 - \Lambda_0 U_0 \Lambda_0$. Once $\eta$ is drawn, we can compute $\alpha_t$ using the state equation (4), replacing $H_t u_t$ by $\eta_t$.

In the case of the TVP regression model, the correspondence of the variables for sampling $\alpha$ is as follows:

$$X_t\beta = x_t'\beta, \quad Z_t = z_t', \quad G_t = (\sqrt{\gamma}\,e^{h_t/2},\ 0_p'), \quad T_t = I_p, \quad H_t = (0_p,\ \Sigma^{1/2}), \quad H_0 = (0_p,\ \Sigma_0^{1/2}),$$

where $0_p$ is a $p \times 1$ zero vector, and $I_p$ is a $p \times p$ identity matrix.

Sample Σ

We derive the conditional posterior density of $\Sigma$. If we specify the prior as $\Sigma \sim IW(\nu_0, \Omega_0^{-1})$, where $IW$ denotes the inverse-Wishart distribution, we obtain the conditional posterior distribution for $\Sigma$ as

$$\pi(\Sigma \mid \alpha) \propto |\Sigma|^{-\frac{\nu_0 + p + 1}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left(\Omega_0 \Sigma^{-1}\right)\right\} \times \prod_{t=1}^{n-1} |\Sigma|^{-1/2} \exp\left\{-\frac{1}{2}(\alpha_{t+1} - \alpha_t)' \Sigma^{-1} (\alpha_{t+1} - \alpha_t)\right\} \propto |\Sigma|^{-\frac{\hat\nu + p + 1}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left(\hat\Omega \Sigma^{-1}\right)\right\}, \tag{5}$$

where

$$\hat\nu = \nu_0 + n - 1, \quad \hat\Omega = \Omega_0 + \sum_{t=1}^{n-1} (\alpha_{t+1} - \alpha_t)(\alpha_{t+1} - \alpha_t)'.$$

Note that the posterior distribution for $\Sigma$ depends only on $\alpha$, and (5) forms the kernel of the inverse-Wishart distribution. Then, we draw a sample as $\Sigma \mid \alpha \sim IW(\hat\nu, \hat\Omega^{-1})$.

Sample h

Regarding stochastic volatility $h$, equations (1) and (3) form a non-linear and non-Gaussian state space model, so more technical methods are needed for sampling $h$. A simple way of sampling $h$ is to assess the conditional posterior distribution of $h_t$ given $(h_1, \ldots, h_{t-1}, h_{t+1}, \ldots, h_n)$ and the other parameters. This method is a single-move sampler, as in the case of $\alpha$, and yields an undesirably high autocorrelation in the MCMC sample.

Two main efficient methods for sampling stochastic volatility have been developed in the literature. One is the approach of Kim et al. (1998), called the mixture sampler. The mixture sampler has been widely used in the financial and macroeconomics literature (Cogley and Sargent (2005), Primiceri (2005)). The other is the multi-move sampler of Shephard and Pitt (1997), modified by Watanabe and Omori (2004).

The idea of the former method is to approximate the non-linear and non-Gaussian state space model by a normal mixture distribution, converting the original model to a linear Gaussian state space form. Although samples are drawn from the posterior distribution based on the approximated model, the approximation error is small enough for practical purposes and can be corrected by reweighting steps, as discussed by Kim et al. (1998) and Omori et al. (2007). The latter algorithm, on the other hand, draws samples from the exact posterior distribution of the original model. Both methods draw samples of stochastic volatility more efficiently than a single-move sampler; we use the latter in this paper. The details of the multi-move sampler are illustrated in Appendix A.2.
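The two conjugate steps described earlier in this section, the normal draw of β and the inverse-Wishart draw of Σ, can be sketched in a few lines of Python. This is only an illustrative sketch and not the author's Ox code: the function names are hypothetical, `nu0` is assumed to be an integer so that the Wishart draw can be built from outer products of normal vectors, and the arrays follow the convention that row t-1 holds the value at time t.

```python
import numpy as np

def sample_beta(y, x, z, alpha, h, gamma, beta0, B0, rng):
    """Draw beta | gamma, alpha, h, y ~ N(beta_hat, B_hat)."""
    sigma2 = gamma * np.exp(h)                       # sigma_t^2 = gamma * exp(h_t)
    y_hat = y - np.sum(z * alpha, axis=1)            # y_t - z_t' alpha_t
    B0_inv = np.linalg.inv(B0)
    B_hat = np.linalg.inv(B0_inv + (x / sigma2[:, None]).T @ x)
    B_hat = 0.5 * (B_hat + B_hat.T)                  # remove round-off asymmetry
    beta_hat = B_hat @ (B0_inv @ beta0 + x.T @ (y_hat / sigma2))
    return rng.multivariate_normal(beta_hat, B_hat)

def sample_Sigma(alpha, nu0, Omega0, rng):
    """Draw Sigma | alpha with kernel |Sigma|^{-(nu_hat+p+1)/2} exp{-tr(Omega_hat Sigma^{-1})/2}."""
    d = np.diff(alpha, axis=0)                       # alpha_{t+1} - alpha_t, t = 1..n-1
    nu_hat = int(nu0) + d.shape[0]                   # nu_0 + n - 1 (integer assumed)
    Omega_hat = Omega0 + d.T @ d
    # Sigma^{-1} follows a Wishart(nu_hat, Omega_hat^{-1}) distribution, built here
    # as a sum of nu_hat outer products of N(0, Omega_hat^{-1}) vectors.
    L = np.linalg.cholesky(np.linalg.inv(Omega_hat))
    X = rng.standard_normal((nu_hat, L.shape[0])) @ L.T
    return np.linalg.inv(X.T @ X)
```

Inside a full sampler these two functions would be called once per iteration with the current values of α, h and γ; the α and h draws themselves require the simulation smoother and the multi-move sampler, which are not sketched here.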
Sample φ

We write the prior of $\phi$ as $\pi(\phi)$, and assume that $(\phi + 1)/2 \sim \mathrm{Beta}(\alpha_\phi, \beta_\phi)$. This beta distribution is chosen to satisfy the restriction $|\phi| < 1$. The conditional posterior distribution of $\phi$ is given by

$$\pi(\phi \mid \sigma_\eta, h) \propto \pi(\phi)\sqrt{1 - \phi^2}\,\exp\left\{-\frac{(1 - \phi^2)h_1^2}{2\sigma_\eta^2}\right\} \exp\left\{-\sum_{t=1}^{n-1} \frac{(h_{t+1} - \phi h_t)^2}{2\sigma_\eta^2}\right\} \propto \pi(\phi)\sqrt{1 - \phi^2}\,\exp\left\{-\frac{\sum_{t=2}^{n-1} h_t^2}{2\sigma_\eta^2}\left(\phi - \frac{\sum_{t=1}^{n-1} h_t h_{t+1}}{\sum_{t=2}^{n-1} h_t^2}\right)^2\right\}.$$

The conditional posterior density does not form any basic distribution from which we can easily sample. If the term $\pi(\phi)\sqrt{1 - \phi^2}$ is omitted, the rest corresponds to a kernel of the normal distribution. In this case, we use the Metropolis-Hastings (MH) algorithm (e.g., Chib and Greenberg (1995)).

The idea of the MH algorithm is as follows. First, we draw samples (which we call candidates) from a certain distribution (the proposal distribution) that is close to the conditional posterior distribution we want to sample from. The proposal distribution should be one from which random samples can be easily generated. Next, we accept the candidate as a new sample with a certain probability. When the candidate is rejected, we use the current sample, drawn in the previous iteration, as the new sample. Under certain conditions, the iterations of these steps produce a sample from the target conditional posterior distribution (see e.g., Chib and Greenberg (1995)). There are many ways to choose the proposal density, and the choice often depends on the target conditional posterior distribution.

Specifically, let $q(\theta^* \mid \theta^{(i)})$ denote the probability density function of the proposal given the current point $\theta^{(i)}$, and $\alpha(\theta, \theta^*)$ denote the acceptance rate from the current point $\theta$ to the proposal $\theta^*$. The MH algorithm is written as follows:

1. Choose an arbitrary starting point $\theta^{(0)}$, and set $i = 0$.
2. Generate a candidate $\theta^*$ from the proposal $q(\theta^* \mid \theta^{(i)})$.
3. Accept $\theta^*$ with the probability $\alpha(\theta^{(i)}, \theta^*)$, and set $\theta^{(i+1)} = \theta^*$. Otherwise, set $\theta^{(i+1)} = \theta^{(i)}$.
4. Set $i = i + 1$, and go to Step 2.

The acceptance rate is given by

$$\alpha(\theta, \theta^*) = \min\left\{1,\ \frac{\pi(\theta^* \mid y)\,q(\theta \mid \theta^*)}{\pi(\theta \mid y)\,q(\theta^* \mid \theta)}\right\},$$

where $\pi(\theta \mid y)$ denotes the target posterior distribution.
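The four MH steps can be sketched generically in Python for a scalar parameter. The function and argument names below are hypothetical, and the usage example (a random-walk proposal targeting an unnormalized normal density centered at 3) is a toy case chosen here for illustration, not an example from the paper. Working with log densities avoids numerical underflow in the acceptance ratio.

```python
import numpy as np

def metropolis_hastings(log_target, draw_proposal, log_proposal, theta0, n_iter, rng):
    """Generic MH sampler for a scalar parameter.

    log_target(theta)          : log of the target density, up to an additive constant
    draw_proposal(theta, rng)  : draws a candidate given the current point
    log_proposal(to, frm)      : log q(to | frm), up to an additive constant
    """
    theta = theta0
    chain = np.empty(n_iter)
    n_accept = 0
    for i in range(n_iter):
        cand = draw_proposal(theta, rng)
        # log of pi(cand) q(theta | cand) / (pi(theta) q(cand | theta))
        log_ratio = (log_target(cand) + log_proposal(theta, cand)
                     - log_target(theta) - log_proposal(cand, theta))
        if np.log(rng.uniform()) < min(0.0, log_ratio):
            theta = cand                     # accept the candidate
            n_accept += 1
        chain[i] = theta                     # on rejection the current point is repeated
    return chain, n_accept / n_iter

# Toy usage: target proportional to exp(-(theta - 3)^2 / 2), random-walk proposal.
rng = np.random.default_rng(0)
chain, rate = metropolis_hastings(
    log_target=lambda th: -0.5 * (th - 3.0) ** 2,
    draw_proposal=lambda th, r: th + r.standard_normal(),
    log_proposal=lambda to, frm: -0.5 * (to - frm) ** 2,
    theta0=0.0, n_iter=20000, rng=rng)
```

Because only ratios of the target density enter the acceptance rate, the normalizing constant never has to be computed.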

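To sample $\phi$ in our model, we first draw a candidate as $\phi^* \sim TN_{[-1,1]}(\mu_\phi, \sigma_\phi^2)$, where $TN$ refers to the truncated normal distribution on the domain $-1 < \phi < 1$, and

$$\mu_\phi = \frac{\sum_{t=1}^{n-1} h_t h_{t+1}}{\sum_{t=2}^{n-1} h_t^2}, \quad \sigma_\phi^2 = \frac{\sigma_\eta^2}{\sum_{t=2}^{n-1} h_t^2}.$$

This proposal density is the conditional posterior distribution with the term $\pi(\phi)\sqrt{1 - \phi^2}$ excluded; it is considered to be close to our target conditional posterior distribution and is truncated to the same domain as the target. Next, we calculate the probability of acceptance. Let $q(\phi)$ denote the probability density function of the proposal and $\phi_0$ denote the old sample (current point) drawn in the previous iteration. The acceptance rate for the candidate $\phi^*$ from the current point $\phi_0$, denoted by $\alpha(\phi_0, \phi^*)$, is given by

$$\alpha(\phi_0, \phi^*) = \min\left\{1,\ \frac{\pi(\phi^* \mid \sigma_\eta, h)\,q(\phi_0)}{\pi(\phi_0 \mid \sigma_\eta, h)\,q(\phi^*)}\right\} = \min\left\{1,\ \frac{\pi(\phi^*)\sqrt{1 - \phi^{*2}}}{\pi(\phi_0)\sqrt{1 - \phi_0^2}}\right\}.$$

The acceptance rate is the ratio of the terms omitted from the conditional posterior distribution. The acceptance step can be implemented by drawing a uniform random number $u \sim U(0, 1)$ and accepting the candidate $\phi^*$ when $u < \alpha(\phi_0, \phi^*)$.

Sample σ_η

We assume the prior of $\sigma_\eta$ as $\sigma_\eta^2 \sim IG(v_0/2, V_0/2)$, where $IG$ refers to the inverse Gamma distribution. The conditional posterior distribution for $\sigma_\eta$ is obtained as

$$\pi(\sigma_\eta \mid \phi, h) \propto (\sigma_\eta^2)^{-\left(\frac{v_0}{2} + 1\right)} \exp\left(-\frac{V_0}{2\sigma_\eta^2}\right) \times \frac{1}{\sigma_\eta}\exp\left\{-\frac{(1 - \phi^2)h_1^2}{2\sigma_\eta^2}\right\} \times \prod_{t=1}^{n-1} \frac{1}{\sigma_\eta}\exp\left\{-\frac{(h_{t+1} - \phi h_t)^2}{2\sigma_\eta^2}\right\}$$
$$\propto (\sigma_\eta^2)^{-\left(\frac{v_0 + n}{2} + 1\right)} \exp\left\{-\frac{V_0 + (1 - \phi^2)h_1^2 + \sum_{t=1}^{n-1}(h_{t+1} - \phi h_t)^2}{2\sigma_\eta^2}\right\}.$$

The conditional posterior distribution forms the kernel of the inverse Gamma distribution. Thus, we draw samples as $\sigma_\eta^2 \mid \phi, h \sim IG(\hat{v}/2, \hat{V}/2)$, where

$$\hat{v} = v_0 + n, \quad \hat{V} = V_0 + (1 - \phi^2)h_1^2 + \sum_{t=1}^{n-1} (h_{t+1} - \phi h_t)^2.$$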
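A minimal Python sketch of these two steps follows, assuming `h` is a NumPy array holding $h_1, \ldots, h_n$. The function names and the hyperparameter names `a`, `b`, `v0`, `V0` are hypothetical labels for the prior parameters. The truncated normal draw uses simple rejection, which is an implementation choice made here (it can be slow if the proposal mean lies far outside the interval), and the inverse Gamma draw is obtained as the reciprocal of a Gamma draw.

```python
import numpy as np

def log_prior_phi(phi, a, b):
    """Log prior of phi, up to a constant, when (phi + 1)/2 ~ Beta(a, b)."""
    return (a - 1.0) * np.log((1.0 + phi) / 2.0) + (b - 1.0) * np.log((1.0 - phi) / 2.0)

def sample_phi(phi_old, h, sigma_eta, a, b, rng):
    """One MH step for phi with the truncated normal proposal TN(mu_phi, sigma_phi^2)."""
    s_hh = np.sum(h[1:-1] ** 2)                  # sum_{t=2}^{n-1} h_t^2
    mu = np.sum(h[:-1] * h[1:]) / s_hh           # sum_{t=1}^{n-1} h_t h_{t+1} / s_hh
    sd = sigma_eta / np.sqrt(s_hh)
    while True:                                  # rejection sampling on (-1, 1)
        cand = rng.normal(mu, sd)
        if -1.0 < cand < 1.0:
            break
    # Acceptance ratio: only the terms omitted from the proposal remain.
    log_ratio = (log_prior_phi(cand, a, b) + 0.5 * np.log(1.0 - cand**2)
                 - log_prior_phi(phi_old, a, b) - 0.5 * np.log(1.0 - phi_old**2))
    return cand if np.log(rng.uniform()) < log_ratio else phi_old

def sample_sigma_eta2(phi, h, v0, V0, rng):
    """Draw sigma_eta^2 | phi, h ~ IG(v_hat / 2, V_hat / 2)."""
    v_hat = v0 + len(h)
    V_hat = V0 + (1.0 - phi**2) * h[0] ** 2 + np.sum((h[1:] - phi * h[:-1]) ** 2)
    # If X ~ Gamma(shape = v_hat/2, rate = V_hat/2), then 1/X ~ IG(v_hat/2, V_hat/2).
    return 1.0 / rng.gamma(shape=v_hat / 2.0, scale=2.0 / V_hat)
```

In a full sampler the two functions would alternate within each iteration, with `sample_phi` receiving the square root of the most recent `sample_sigma_eta2` draw.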

Sample γ

Sampling γ can be implemented in the same way as sampling σ_η. We set the prior as $\gamma \sim IG(\gamma_0/2, W_0/2)$. Then, the conditional posterior distribution for γ is given by $\gamma \mid \beta, \alpha, h, y \sim IG(\hat{\gamma}/2, \hat{W}/2)$, where

$$
\hat{\gamma} = \gamma_0 + n, \qquad \hat{W} = W_0 + \sum_{t=1}^{n} (y_t - x_t'\beta - z_t'\alpha_t)^2 / e^{h_t}.
$$

3 Simulation study

This section carries out simulation exercises with the TVP regression model to examine its estimation performance in the presence of possible structural changes, using simulated data and with emphasis on the role of stochastic volatility.

3.1 Setup

The performance of the proposed estimation method for the TVP regression model is illustrated using simulated data. In this simulation study, we investigate how the parameters are estimated, and how the assumption of stochastic volatility affects the estimates of the other parameters. Based on the TVP regression model of equations (1)-(3) with $n = 100$, $k = 2$ and $p = 2$, we generate $\{x_t\}_{t=1}^{n}$ and $\{z_t\}_{t=1}^{n}$ as $x_{it} \sim U(-0.5, 0.5)$ and $z_{jt} \sim U(-0.5, 0.5)$ for $i, j = 1, 2$, where $x_t = (x_{1t}, x_{2t})'$, $z_t = (z_{1t}, z_{2t})'$, and $U(a, b)$ denotes the uniform distribution on the domain $(a, b)$. Setting the true parameters as β = (4, −3)′, α =(, ), Σ=diag(.,.3), φ = 0.95, σ η =.7, and γ =., where diag(·) refers to a diagonal matrix with the diagonal elements in the arguments, we generate α, h and y recursively from the TVP regression model. The simulated state variables α and h are plotted in Figure 1. The volatility temporarily rises around t = 20.

Figure 1: The simulated state variables α and h (n = 100). Panels: α_{1t}, α_{2t}, and σ_t = exp(h_t/2), each plotted against t.

3.2 Parameter estimates

We estimate the TVP regression model on the simulated data by drawing M = 20,000 samples after discarding the initial 2,000 samples, under the following prior distributions:[5]

β ∼ N(, I), Σ ∼ IW(4, 4 I), α ∼ N(, I), (φ + 1)/2 ∼ Beta(2,.5), σ_η² ∼ IG(2,.2), γ ∼ IG(2,.2).

Figure 2 shows the sample autocorrelation functions, the sample paths and the posterior densities for the selected parameters. After discarding the samples in the burn-in period (the initial 2,000 samples), the sample paths look stable and the sample autocorrelations decay steadily, indicating that our sampling method efficiently produces samples with low autocorrelation.

Table 1 gives the posterior means, standard deviations, 95% credible intervals,[6] the convergence diagnostics (CD) of Geweke (1992) and the inefficiency factors, all computed from the MCMC sample.[7] In the estimated result, the null hypothesis of convergence to the posterior distribution is not rejected for any parameter at the 5% significance level based on the CD statistics, and the inefficiency factors are quite low except for γ, which indicates efficient sampling of the parameters and state variables. Even for γ, the inefficiency factor is about 100, which implies that we obtain about M/100 = 200 uncorrelated samples. This is considered sufficient for posterior inference. In addition, the estimated posterior mean is close to the true value of the parameter, and the 95% credible interval includes it, for each parameter listed in Table 1(i).

Figure 2: Estimation results of the TVP regression model (with stochastic volatility) for the simulated data. Sample autocorrelations (top), sample paths (middle) and posterior densities (bottom). Panels: β_1, β_2, Σ_11, Σ_22, φ, σ_η, γ.

[5] The computational results are generated using Ox version 4.02 (Doornik (2006)). All the codes for the algorithms illustrated in this paper are available at http://sites.google.com/site/jnakajimaweb/program.

[6] In Bayesian inference, we use credible intervals to describe the uncertainty of the parameters, instead of the confidence intervals of the frequentist approach. In MCMC analysis, we usually report the 2.5% and 97.5% quantiles of the posterior draws, as is done here.

[7] To check the convergence of the Markov chain, Geweke (1992) suggests a comparison between the first $n_0$ draws and the last $n_1$ draws, dropping the middle draws. The CD statistic is computed by $CD = (\bar{x}_0 - \bar{x}_1)/\sqrt{\hat{\sigma}_0^2/n_0 + \hat{\sigma}_1^2/n_1}$, where $\bar{x}_j = (1/n_j)\sum_{i=m_j}^{m_j+n_j-1} x^{(i)}$, $x^{(i)}$ is the i-th draw, and $\sqrt{\hat{\sigma}_j^2/n_j}$ is the standard error of $\bar{x}_j$, for $j = 0, 1$. If the sequence of the MCMC sampling is stationary, the statistic converges in distribution to a standard normal. We set m =,n =,, m =5,, and n =5,. The $\hat{\sigma}_j^2$ is computed using a Parzen window with bandwidth B m = 5. The inefficiency factor is defined as $1 + 2\sum_{s=1}^{B_m} \rho_s$, where $\rho_s$ is the sample autocorrelation at lag s; it is computed to measure how well the MCMC chain mixes (see e.g., Chib (2001)). It is the ratio of the numerical variance of the posterior sample mean to the variance of the sample mean from uncorrelated draws. The inverse of the inefficiency factor is also known as the relative numerical efficiency (Geweke (1992)). When the inefficiency factor is equal to m, we need to draw m times as many MCMC samples as uncorrelated samples.

3.3 The role of stochastic volatility

To assess the function of stochastic volatility in the TVP regression model, we estimate the TVP regression model with constant volatility for the same simulated data. Because the true specification is stochastic volatility, we investigate how the estimation result changes under this misspecification. As mentioned in Section 2.1, constant volatility is specified by $\sigma_t^2 = \sigma^2$, for $t = 1, \ldots, n$. If we assume the prior $\sigma^2 \sim IG(s_0/2, S_0/2)$, then the conditional posterior distribution of $\sigma^2$ is given by $\sigma^2 \mid \beta, \alpha, y \sim IG(\hat{s}/2, \hat{S}/2)$, where $\hat{s} = s_0 + n$ and $\hat{S} = S_0 + \sum_{t=1}^{n} (y_t - x_t'\beta - z_t'\alpha_t)^2$. In the MCMC algorithm for the TVP regression model, Steps 5-8 are replaced by the single step of sampling σ for constant volatility. In the simulation study, the prior σ² ∼ IG(2,.2) is additionally assumed, and the estimation procedure is otherwise the same as for the TVP regression model with stochastic volatility discussed above.

Table 1: Estimation results of the TVP regression model for the simulated data with (i) stochastic volatility and (ii) constant volatility. The true model is stochastic volatility.

(i) TVP regression model with stochastic volatility
Parameter   True   Mean      Stdev.   95% interval          CD     Inefficiency
β_1         4.     4.55      .66      [3.7837, 4.244]       .833   2.46
β_2         -3.    -2.8668   .37      [-3.49, -2.69]        .99    4.37
Σ_11        .      .44       .33      [.96, .22]            .44    38.2
Σ_22        .3     .2        .68      [.43, .656]           .27    57.5
φ           .95    .9735     .97      [.9224, .9967]        .895   52.39
σ_η         .5     .458      .84      [.288, .757]          .56    33.55
γ           .      .445      .5       [.52, .865]           .98    6.44

(ii) TVP regression model with constant volatility
Parameter   True   Mean      Stdev.   95% interval          CD     Inefficiency
β_1         4.     4.2373    .38      [3.6256, 4.8447]      .472   .3
β_2         -3.    -2.776    .3369    [-3.488, -2.54]       .398   .52
Σ_11        .      .73       .26      [.29, .689]           .533   68.5
Σ_22        .3     .23       .33      [.25, .444]           .36    7.39
σ                  .945      .688     [.825, .922]          .456   .87

Table 1(ii) reports the estimation results of the TVP regression model with constant volatility for the simulated data. The posterior standard deviations of (β_1, β_2) are evidently larger than in the stochastic volatility model, and the posterior means are slightly further from the true values. The posterior means of (Σ_11, Σ_22) are estimated lower than in the stochastic volatility model. We next check how the time-varying coefficients are estimated.
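The convergence diagnostic (CD) and the inefficiency factor reported in Table 1 can be computed from a vector of MCMC draws roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the Parzen weights follow the standard textbook definition, and the segment positions and bandwidth used in any call are the caller's choice rather than the settings of the paper.

```python
import numpy as np

def autocov(x, s):
    """Sample autocovariance of x at lag s (divisor n)."""
    xc = x - x.mean()
    return xc[s:] @ xc[:x.size - s] / x.size

def inefficiency_factor(x, bandwidth):
    """1 + 2 * sum_{s=1}^{B} rho_s, with rho_s the sample autocorrelation."""
    x = np.asarray(x, dtype=float)
    g0 = autocov(x, 0)
    return 1.0 + 2.0 * sum(autocov(x, s) / g0 for s in range(1, bandwidth + 1))

def parzen_weight(z):
    """Parzen window on 0 <= z <= 1."""
    return 1 - 6 * z**2 + 6 * z**3 if z <= 0.5 else 2 * (1 - z)**3

def long_run_variance(x, bandwidth):
    """Parzen-window estimate of the variance sigma_hat^2 used in the CD statistic."""
    x = np.asarray(x, dtype=float)
    lrv = autocov(x, 0)
    for s in range(1, bandwidth + 1):
        lrv += 2.0 * parzen_weight(s / bandwidth) * autocov(x, s)
    return lrv

def geweke_cd(x, m0, n0, m1, n1, bandwidth):
    """CD statistic comparing draws m0..m0+n0-1 with draws m1..m1+n1-1 (1-based)."""
    x = np.asarray(x, dtype=float)
    seg0 = x[m0 - 1 : m0 - 1 + n0]
    seg1 = x[m1 - 1 : m1 - 1 + n1]
    se2 = (long_run_variance(seg0, bandwidth) / n0
           + long_run_variance(seg1, bandwidth) / n1)
    return (seg0.mean() - seg1.mean()) / np.sqrt(se2)
```

For a chain of independent draws the inefficiency factor should be close to 1, and for an AR(1) chain with coefficient ρ it should be close to (1 + ρ)/(1 − ρ), which gives a quick sanity check of an implementation.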
In addition to the above two models, the constant coefficient and constant volatility model is estimated. The posterior estimates of α are plotted in Figure 3. Figure 3(i) clearly shows that the constant coefficient model is unable to capture the time variation of the coefficients, and the posterior mean is estimated around the average level of the time-varying coefficients over time. Figure 3(ii) plots the estimates based on the same time-invariant model with structural breaks. To detect a possible break, the CUSUM of squares test proposed by Brown et al. (1975) is applied to divide the sample period into three parts (t = 1–19, 20–81, 82–100). Then, the constant coefficient and constant volatility model is estimated for each subsample period.[8] In the first and second subsample periods, the posterior 95% credible intervals are wide, primarily because of the high volatility of the disturbance. In the third subsample period, the 95% credible intervals are narrower. In each subsample period, the posterior means seem to follow the average level of the time-varying coefficients over that period. However, the true states are not traced well.

Figure 3: Estimation results of α on the TVP regression model for the simulated data. True value (solid line), posterior mean (bold) and 95% credible intervals (dashed). Panels (each showing α_{1t} and α_{2t} against t): (i) Constant coefficient & Constant volatility; (ii) Constant coefficient & Constant volatility (with break); (iii) Time-varying coefficient & Constant volatility; (iv) Time-varying coefficient & Stochastic volatility. The true model is the time-varying coefficient and stochastic volatility (iv).

[8] Modeling structural changes is one of the central issues of recent econometrics (see e.g., Perron (2006)). Like time-varying coefficients and stochastic volatility, structural breaks can capture possible changes in the underlying data generation process. Whether the true model has a structural break or time-varying parameters such as those in this paper, both kinds of model are intended to capture the change by approximating its behavior in their own way.

Figure 3(iii) exhibits the estimation results for the TVP regression with constant volatility. The posterior means seem to follow the true states of the time-varying coefficients to some extent. However, for α_{1t}, some true values fall outside the 95% credible intervals. For α_{2t}, on the other hand, the intervals are too wide to capture the movement of the true value. The constant volatility model neglects the changes in volatility and loses accuracy in the estimates of α_{it}. The estimates of the TVP regression with stochastic volatility, which is the true model, are plotted in Figure 3(iv). The posterior means trace the movement of the true values, and the 95% credible intervals tend to be narrower overall than in the constant volatility model while including almost all of the true values.

The simulation analysis here touches on the profound issue of identifying the source of a shock. Focusing on the third case, the estimated constant variance (σ) of the disturbance is smaller than the true stochastic volatility in the first-half period and larger in the second-half period, because the constant variance captures the average level of volatility. For the first-half period, the 95% credible intervals are almost as wide as in the stochastic volatility model, although the posterior mean is less accurate in terms of the distance between the estimated posterior means and the true values. This is because the shock to the disturbance is estimated to be smaller than its true size, and the remainder of the shock is absorbed by the drifting α_{it} as a result of the misspecification. For the second-half period, on the other hand, the posterior mean of the constant volatility model is relatively accurate compared to the first-half period, but the 95% credible intervals are wider than in the stochastic volatility model, because the constant volatility is over-estimated and the resulting uncertainty remains in the drifting α_{it}.
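The statement that the constant variance captures the average level of volatility can be illustrated with the inverse Gamma step of Section 3.3. The sketch below simulates disturbances with stochastic volatility and then draws σ² from IG(ŝ/2, Ŝ/2), treating the regression residuals as known. All numerical values (persistence, volatility scale, prior hyperparameters, seed) are illustrative assumptions and are not the settings of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi, sig_eta, gamma = 100, 0.95, 0.5, 0.1     # illustrative values (assumption)

# Log-volatility h_t: stationary AR(1), sigma_t^2 = gamma * exp(h_t)
h = np.empty(n)
h[0] = rng.normal(0, sig_eta / np.sqrt(1 - phi**2))
for t in range(1, n):
    h[t] = phi * h[t - 1] + rng.normal(0, sig_eta)
true_var = gamma * np.exp(h)

# Residuals y_t - x_t'beta - z_t'alpha_t, treated as known in this sketch
resid = np.sqrt(true_var) * rng.normal(size=n)

# Constant-volatility step: sigma^2 | . ~ IG(s_hat/2, S_hat/2)
s0, S0 = 2.0, 0.02                               # illustrative prior (assumption)
s_hat = s0 + n
S_hat = S0 + np.sum(resid**2)
sigma2_draws = 1.0 / rng.gamma(shape=s_hat / 2, scale=2.0 / S_hat, size=5000)

print("posterior mean of sigma^2 :", sigma2_draws.mean())
print("average squared residual  :", np.mean(resid**2))
print("range of true sigma_t^2   :", true_var.min(), true_var.max())
```

The posterior mean of σ² sits close to the average squared residual, so periods in which the true σ_t² is far above or below that average are necessarily misfitted by the constant-volatility model.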
3.4 Other models

In addition, we examine other simulations of interest in which the true model is not the TVP regression with time-varying coefficients and stochastic volatility. First, data is simulated from the TVP regression model with constant coefficients and stochastic volatility. The true values are the same as in the previous simulation study except α t = and α 2t =, for all t = 1, ..., n. The TVP regression model with time-varying coefficients and stochastic volatility is estimated to examine how the time-varying coefficients follow the time-invariant true states. The estimation results of (α_{1t}, α_{2t}) are shown in Figure 4(i). Though the estimated posterior means are not perfectly time-invariant, they move near the true states, and the 95% credible intervals include the true values throughout the sample period.

Second, data is simulated from the TVP regression model with stochastic volatility but with the time-varying coefficients (α_{1t}, α_{2t}) modeled to have Markov-switching structural changes. A large literature considers Markov-switching time-varying parameters in macroeconomic applications. We assume that α_{1t} and α_{2t} each have two regimes, (α () t,α() t ) = (, ) and (α () 2t,α() 2t )=(, ), respectively. The coefficients (α_{1t}, α_{2t}) switch independently with the transition probabilities $p(\alpha_{it} = \alpha_{i}^{(j)} \mid \alpha_{i,t-1} = \alpha_{i}^{(j)}) = 0.98$, for i = 1, 2 and j = 0, 1. The TVP regression model with time-varying coefficients (of the original form) and stochastic volatility is estimated to examine how the time-varying coefficients follow the Markov-switching structural changes. Figure 4(ii) plots the estimation results of the coefficients. The true states of α_{1t} and α_{2t} have two breaks and one break, respectively. For both coefficients, the 95% credible intervals include the true values. Around the structural breaks, the posterior means of the coefficients follow the true states to some extent, although their movements are not very responsive, especially for α_{2t}. The degree of adjustment to the structural change depends on the size of the volatility of the disturbance in the regression. The posterior estimates tend to smooth the true states of the coefficients.

Figure 4: Estimation results of α on the TVP regression model for the simulated data. True value (solid line), posterior mean (bold) and 95% credible intervals (dashed). Panels (each showing α_{1t} and α_{2t} against t): (i) Constant coefficient & Stochastic volatility; (ii) Markov-switching coefficient & Stochastic volatility. The true models are (i) constant coefficient and stochastic volatility, and (ii) Markov-switching coefficient and stochastic volatility. The TVP regression model with time-varying coefficient and stochastic volatility is fitted.

The simulations in this section cover only one realization of generated data for each setting. However, the estimation results show the flexibility and the applicability of the TVP regression models, which should help us understand the importance of time-varying parameters in regression models.
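To make the simulation designs of this section concrete, the sketch below generates data from a TVP regression with stochastic volatility under three coefficient processes: random walk, constant, and two-regime Markov switching. The function name, the regime levels, the constant levels, and the values of Σ, σ_η and γ are hypothetical choices for illustration; only the uniform regressors, β = (4, −3)′, φ = 0.95 and the 0.98 staying probability follow the text.

```python
import numpy as np

def simulate_tvp_regression(n=100, coef="random_walk", seed=0):
    """Simulate y_t = x_t'beta + z_t'alpha_t + eps_t, eps_t ~ N(0, gamma * exp(h_t))."""
    rng = np.random.default_rng(seed)
    beta = np.array([4.0, -3.0])
    phi = 0.95
    # The values below are illustrative assumptions, not the paper's settings.
    sd_u = np.sqrt(np.array([0.01, 0.03]))       # std. dev. of coefficient innovations
    sig_eta, gamma = 0.5, 0.1

    x = rng.uniform(-0.5, 0.5, size=(n, 2))
    z = rng.uniform(-0.5, 0.5, size=(n, 2))

    # Log-volatility: stationary AR(1)
    h = np.empty(n)
    h[0] = rng.normal(0, sig_eta / np.sqrt(1 - phi**2))
    for t in range(1, n):
        h[t] = phi * h[t - 1] + rng.normal(0, sig_eta)

    alpha = np.empty((n, 2))
    if coef == "random_walk":
        alpha[0] = rng.normal(0, sd_u)           # starts from alpha_0 = 0 (assumption)
        for t in range(1, n):
            alpha[t] = alpha[t - 1] + rng.normal(0, sd_u)
    elif coef == "constant":
        alpha[:] = np.array([1.0, 0.0])          # hypothetical constant levels
    elif coef == "markov":
        levels = np.array([[0.0, 1.0], [0.0, 1.0]])   # hypothetical regime levels
        state = np.zeros(2, dtype=int)
        for t in range(n):
            if t > 0:
                stay = rng.uniform(size=2) < 0.98     # each coefficient switches independently
                state = np.where(stay, state, 1 - state)
            alpha[t] = levels[np.arange(2), state]
    else:
        raise ValueError("coef must be 'random_walk', 'constant' or 'markov'")

    sigma_t = np.sqrt(gamma * np.exp(h))
    y = x @ beta + np.sum(z * alpha, axis=1) + sigma_t * rng.normal(size=n)
    return {"y": y, "x": x, "z": z, "alpha": alpha, "h": h}

data = simulate_tvp_regression(coef="markov", seed=1)
```

Re-running with different seeds gives a sense of how much a single realization, such as the one reported in each figure, can vary.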

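4 Time-varying parameter VAR with stochastic volatility

This section extends the estimation algorithm for the univariate TVP regression model to a multivariate TVP-VAR model.

4.1 Model

To introduce the TVP-VAR model, we begin with a basic structural VAR model defined as

$$
A y_t = F_1 y_{t-1} + \cdots + F_s y_{t-s} + u_t, \qquad t = s+1, \ldots, n, \qquad (6)
$$

where $y_t$ is a $k \times 1$ vector of observed variables, and $A, F_1, \ldots, F_s$ are $k \times k$ matrices of coefficients. The disturbance $u_t$ is a $k \times 1$ structural shock, and we assume that $u_t \sim N(0, \Sigma\Sigma)$, where

$$
\Sigma = \begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_k \end{pmatrix}.
$$

We specify the simultaneous relations of the structural shock by recursive identification, assuming that A is lower-triangular,

$$
A = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ a_{21} & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ a_{k1} & \cdots & a_{k,k-1} & 1 \end{pmatrix}.
$$

We rewrite model (6) as the following reduced form VAR model:

$$
y_t = B_1 y_{t-1} + \cdots + B_s y_{t-s} + A^{-1}\Sigma\varepsilon_t, \qquad \varepsilon_t \sim N(0, I_k),
$$

where $B_i = A^{-1}F_i$, for $i = 1, \ldots, s$. Stacking the elements in the rows of the $B_i$'s to form β (a $k^2 s \times 1$ vector), and defining $X_t = I_k \otimes (y_{t-1}', \ldots, y_{t-s}')$, where ⊗ denotes the Kronecker product, the model can be written as

$$
y_t = X_t \beta + A^{-1}\Sigma\,\varepsilon_t. \qquad (7)
$$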
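The stacking that leads to equation (7) can be checked numerically. The sketch below builds β by stacking the rows of [B_1, ..., B_s], forms X_t with a Kronecker product, and verifies that X_t β reproduces the sum of the lag terms; it also builds a unit lower-triangular A from a vector a and checks the implied reduced-form covariance. The dimensions and random matrices are arbitrary illustrative choices.

```python
import numpy as np

k, s = 3, 2
rng = np.random.default_rng(0)
B = [rng.normal(scale=0.2, size=(k, k)) for _ in range(s)]   # B_1, ..., B_s
y_lags = [rng.normal(size=k) for _ in range(s)]              # y_{t-1}, ..., y_{t-s}

# beta: rows of [B_1, ..., B_s] stacked into a k^2 s vector
B_wide = np.hstack(B)                          # k x (k s)
beta = B_wide.reshape(-1)                      # row-major stacking

# X_t = I_k (Kronecker) (y'_{t-1}, ..., y'_{t-s}), a k x (k^2 s) matrix
x_row = np.concatenate(y_lags)[None, :]        # 1 x (k s)
X_t = np.kron(np.eye(k), x_row)

lag_sum = sum(B[i] @ y_lags[i] for i in range(s))

# Unit lower-triangular A from a = (a_21, a_31, a_32, ...), and diagonal Sigma
a = rng.normal(size=k * (k - 1) // 2)
A = np.eye(k)
A[np.tril_indices(k, -1)] = a                  # fills (2,1), (3,1), (3,2), ... in order
Sigma = np.diag(np.exp(rng.normal(size=k) / 2))

# Reduced-form disturbance A^{-1} Sigma eps has covariance A^{-1} Sigma Sigma' A^{-1}'
A_inv = np.linalg.inv(A)
Omega = A_inv @ Sigma @ Sigma.T @ A_inv.T
```

Here `np.tril_indices(k, -1)` enumerates the strictly lower-triangular positions row by row, which matches the ordering (a_21, a_31, a_32, a_41, ...) used for the stacked vector of simultaneous relations.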

Now, all parameters in equation (7) are time-invariant. We extend it to the TVP-VAR model by allowing the parameters to change over time. Consider the TVP-VAR model with stochastic volatility[10] specified by

$$
y_t = X_t \beta_t + A_t^{-1}\Sigma_t\,\varepsilon_t, \qquad t = s+1, \ldots, n, \qquad (8)
$$

where the coefficients $\beta_t$ and the parameters $A_t$ and $\Sigma_t$ are all time varying.[9] There are many ways to model the process for these time-varying parameters. Following Primiceri (2005), let $a_t = (a_{21}, a_{31}, a_{32}, a_{41}, \ldots, a_{k,k-1})'$ be a stacked vector of the lower-triangular elements in $A_t$ and $h_t = (h_{1t}, \ldots, h_{kt})'$ with $h_{jt} = \log\sigma_{jt}^2$, for $j = 1, \ldots, k$, $t = s+1, \ldots, n$. We assume that the parameters in (8) follow a random walk process as follows:

$$
\beta_{t+1} = \beta_t + u_{\beta t}, \quad a_{t+1} = a_t + u_{at}, \quad h_{t+1} = h_t + u_{ht}, \qquad
\begin{pmatrix} \varepsilon_t \\ u_{\beta t} \\ u_{at} \\ u_{ht} \end{pmatrix} \sim N\left(0, \begin{pmatrix} I & O & O & O \\ O & \Sigma_\beta & O & O \\ O & O & \Sigma_a & O \\ O & O & O & \Sigma_h \end{pmatrix}\right),
$$

for $t = s+1, \ldots, n$, where $\beta_{s+1} \sim N(\mu_{\beta_0}, \Sigma_{\beta_0})$, $a_{s+1} \sim N(\mu_{a_0}, \Sigma_{a_0})$ and $h_{s+1} \sim N(\mu_{h_0}, \Sigma_{h_0})$.

Several remarks are required on the specification of the TVP-VAR model. First, the assumption of a lower-triangular matrix for $A_t$ is a recursive identification of the VAR system. This specification is simple and widely used, although the estimation of structural models may require a more complicated identification to extract implications for the economic structure, as pointed out by Christiano et al. (1999) and other studies. In this paper, the estimation algorithm is explained for the model with recursive identification for simplicity, although the estimation procedure is applicable to a model with non-recursive identification after a slight modification of the MCMC algorithm.

Second, the parameters are not assumed to follow a stationary process such as AR(1), but a random walk process. As mentioned before, because the TVP-VAR model has a large number of parameters to estimate, it is preferable to reduce that number by assuming a random walk process for the time-varying parameters. Most studies that use the TVP-VAR model assume a random walk process for the parameters. Note that the extension of the estimation algorithm to the case of a stationary process is straightforward.

[9] Time-varying intercepts are incorporated in some of the literature on TVP-VAR models. This case requires only the modification of defining $X_t := I_k \otimes (1, y_{t-1}', \ldots, y_{t-s}')$.

[10] Hereafter, we use the term TVP-VAR model to indicate the model with stochastic volatility for simplicity.

Third, the variance and covariance structure for the innovations of the time-varying parameters is governed by the parameters $\Sigma_\beta$, $\Sigma_a$ and $\Sigma_h$. Most articles assume that $\Sigma_a$ is a diagonal matrix. In this paper, we further assume that $\Sigma_h$ is also a diagonal matrix for simplicity. Experience with several estimations indicates that the results are not sensitive to this diagonal assumption for $\Sigma_h$, compared to the non-diagonal assumption.

Fourth, when the TVP-VAR model is implemented in Bayesian inference, the priors should be chosen carefully because the TVP-VAR model has many state variables and their process is modeled as a non-stationary random walk. The TVP-VAR model is so flexible that the state variables can capture both gradual and sudden changes in the underlying economic structure. On the other hand, allowing time variation in every parameter of the VAR model may cause an over-identification problem. As mentioned by Primiceri (2005), a tight prior for the covariance matrix of the disturbance in the random walk process helps to avoid implausible behavior of the time-varying parameters. The time-varying coefficients ($\beta = (\beta_{s+1}, \ldots, \beta_n)$) would require a tighter prior on the variance of the disturbance in their time-varying process than the simultaneous relations ($a = (a_{s+1}, \ldots, a_n)$) and the volatility ($h = (h_{s+1}, \ldots, h_n)$) of the structural shock. The structural shock we consider in the model hits the economic system unexpectedly, and its size would fluctuate more widely over time than the possible change in the autoregressive system of the economic variables specified by the VAR coefficients. In most of the related literature, a tighter prior is set for $\Sigma_\beta$ and a rather diffuse prior for $\Sigma_a$ and $\Sigma_h$. A prior sensitivity analysis is necessary to check the robustness of the empirical result with respect to the prior tightness.

Finally, the prior of the initial state of the time-varying parameters must be specified. When the time series model is a stationary process, we often assume that the initial state follows the stationary distribution of the process (for instance, $h_1 \sim N(0, \sigma_\eta^2/(1-\phi^2))$ in the TVP regression model). However, our time-varying parameters are random walks; thus, we specify a certain prior for $\beta_{s+1}$, $a_{s+1}$ and $h_{s+1}$. There are two ways to set the prior. First, following Primiceri (2005), we can set a normal prior whose mean and variance are chosen based on the estimates of a constant parameter VAR model computed using a pre-sample period. It is reasonable to use the economic structure estimated from the pre-sample period up to the initial period of the main sample data. Second, we can set a reasonably flat prior for the initial state from the standpoint that we have no a priori information about the initial state. Koop and Korobilis (2010) provide a comprehensive discussion of the methodology for the TVP-VAR model, including the issues concerning prior specification.
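The TVP-VAR of equation (8) with random-walk states can be simulated in a few lines, which is often useful for checking an estimation code. The sketch below is only an illustration: the function name, the innovation standard deviations, the initial states, and the use of diagonal Σ_β (the text allows a full matrix) are all assumptions made for brevity.

```python
import numpy as np

def simulate_tvp_var(n=200, k=2, s=1, sd_beta=0.01, sd_a=0.02, sd_h=0.05, seed=0):
    """Simulate y_t = X_t beta_t + A_t^{-1} Sigma_t eps_t with random-walk beta_t, a_t, h_t."""
    rng = np.random.default_rng(seed)
    k_beta, k_a = k * k * s, k * (k - 1) // 2

    # Illustrative initial states: first-lag matrix 0.5 * I, no simultaneous
    # relations, unit-scale volatility (h = log sigma^2 = 0).
    B0 = np.hstack([0.5 * np.eye(k)] + [np.zeros((k, k))] * (s - 1))
    beta = B0.reshape(-1)
    a = np.zeros(k_a)
    h = np.zeros(k)

    y = np.zeros((n, k))
    beta_path = np.empty((n, k_beta))
    a_path = np.empty((n, k_a))
    h_path = np.empty((n, k))
    for t in range(n):
        if t >= s:
            x_row = np.concatenate([y[t - j] for j in range(1, s + 1)])[None, :]
            X_t = np.kron(np.eye(k), x_row)
            A_t = np.eye(k)
            A_t[np.tril_indices(k, -1)] = a
            Sigma_t = np.diag(np.exp(h / 2))
            shock = np.linalg.solve(A_t, Sigma_t @ rng.normal(size=k))  # A_t^{-1} Sigma_t eps_t
            y[t] = X_t @ beta + shock
        else:
            y[t] = rng.normal(size=k)          # arbitrary pre-sample values
        beta_path[t], a_path[t], h_path[t] = beta, a, h
        # Random-walk transitions with independent innovations
        beta = beta + rng.normal(0, sd_beta, size=k_beta)
        a = a + rng.normal(0, sd_a, size=k_a)
        h = h + rng.normal(0, sd_h, size=k)
    return y, beta_path, a_path, h_path

y_sim, beta_path, a_path, h_path = simulate_tvp_var()
```

Setting `sd_beta` smaller than `sd_a` and `sd_h` mirrors the remark above that the coefficients are usually given a tighter prior than the simultaneous relations and the volatility.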

4.2 Estimation methodology

The estimation procedure for the TVP-VAR model is illustrated by extending several parts of the algorithm for the TVP regression model. Let $y = \{y_t\}_{t=1}^{n}$ and $\omega = (\Sigma_\beta, \Sigma_a, \Sigma_h)$. We set the prior probability density as $\pi(\omega)$ for ω. Given the data y, we draw samples from the posterior distribution, $\pi(\beta, a, h, \omega \mid y)$, by the following MCMC algorithm:

1. Initialize β, a, h and ω.
2. Sample $\beta \mid a, h, \Sigma_\beta, y$.
3. Sample $\Sigma_\beta \mid \beta$.
4. Sample $a \mid \beta, h, \Sigma_a, y$.
5. Sample $\Sigma_a \mid a$.
6. Sample $h \mid \beta, a, \Sigma_h, y$.
7. Sample $\Sigma_h \mid h$.
8. Go to 2.

The details of the procedure are as follows.

Sample β

To sample β from the conditional posterior distribution, the state space model with $\beta_t$ as the state variable is written as

$$
y_t = X_t\beta_t + A_t^{-1}\Sigma_t\,\varepsilon_t, \qquad t = s+1, \ldots, n,
$$
$$
\beta_{t+1} = \beta_t + u_{\beta t}, \qquad t = s, \ldots, n-1,
$$

where $\beta_s = \mu_{\beta_0}$ and $u_{\beta s} \sim N(0, \Sigma_{\beta_0})$. We run the simulation smoother with the following correspondence of the variables to equation (4):

$$
X_t\beta = 0_k, \quad Z_t = X_t, \quad G_t = (A_t^{-1}\Sigma_t, O_{k_\beta}), \quad T_t = I_{k_\beta}, \quad H_t = (O_k, \Sigma_\beta^{1/2}), \quad H_0 = (O_k, \Sigma_{\beta_0}^{1/2}),
$$

where $k_\beta$ is the number of rows of $\beta_t$.
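As a rough illustration of the step that samples β, the sketch below draws a random-walk state path given the observations. Note that it uses forward filtering and backward sampling (FFBS) in place of the simulation smoother referred to in the text; both target the same conditional posterior, but this is a swapped-in technique, and the function name and array layout are assumptions. For the TVP-VAR one would pass Z_t = X_t, R_t = A_t^{-1} Σ_t Σ_t′ A_t^{-1}′, Q = Σ_β, and the prior mean and variance of β_{s+1}.

```python
import numpy as np

def ffbs_random_walk(y, Z, R, Q, m0, P0, rng):
    """Draw b_1..b_T from p(b | y) for the model
         y_t = Z_t b_t + e_t,   e_t ~ N(0, R_t),
         b_{t+1} = b_t + u_t,   u_t ~ N(0, Q),   b_1 ~ N(m0, P0).
    Shapes: y (T, k), Z (T, k, d), R (T, k, k), Q (d, d), m0 (d,), P0 (d, d).
    """
    T, d = y.shape[0], m0.size
    m = np.empty((T, d))           # filtered means
    P = np.empty((T, d, d))        # filtered covariances
    mp, Pp = m0, P0                # predictive moments for t = 1
    for t in range(T):
        if t > 0:
            mp, Pp = m[t - 1], P[t - 1] + Q
        F = Z[t] @ Pp @ Z[t].T + R[t]                 # prediction-error covariance
        K = np.linalg.solve(F, Z[t] @ Pp).T           # Kalman gain Pp Z' F^{-1}
        m[t] = mp + K @ (y[t] - Z[t] @ mp)
        Pt = Pp - K @ Z[t] @ Pp
        P[t] = (Pt + Pt.T) / 2                        # enforce symmetry

    b = np.empty((T, d))
    b[-1] = rng.multivariate_normal(m[-1], P[-1])
    for t in range(T - 2, -1, -1):                    # backward sampling
        J = np.linalg.solve(P[t] + Q, P[t]).T         # P_t (P_t + Q)^{-1}
        mean = m[t] + J @ (b[t + 1] - m[t])
        cov = P[t] - J @ P[t]
        b[t] = rng.multivariate_normal(mean, (cov + cov.T) / 2)
    return b
```

The same routine could serve for the step that samples a, with Z_t replaced by the matrix of residuals described in the next paragraph of the text and R_t = Σ_t Σ_t′.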

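Sample a

To sample a from the conditional posterior distribution, the key to implementing the simulation smoother is the expression of the state space form with respect to $a_t$. Specifically,

$$
\hat{y}_t = \hat{X}_t a_t + \Sigma_t\varepsilon_t, \qquad t = s+1, \ldots, n,
$$
$$
a_{t+1} = a_t + u_{at}, \qquad t = s, \ldots, n-1,
$$

where $a_s = \mu_{a_0}$, $u_{as} \sim N(0, \Sigma_{a_0})$, $\hat{y}_t = y_t - X_t\beta_t$, and

$$
\hat{X}_t = \begin{pmatrix}
0 & \cdots & & & & & \cdots & 0 \\
-\hat{y}_{1t} & 0 & 0 & \cdots & & & & \vdots \\
0 & -\hat{y}_{1t} & -\hat{y}_{2t} & 0 & \cdots & & & \\
0 & 0 & 0 & -\hat{y}_{1t} & \cdots & & & \\
\vdots & & & & \ddots & 0 & \cdots & 0 \\
0 & \cdots & & & 0 & -\hat{y}_{1t} & \cdots & -\hat{y}_{k-1,t}
\end{pmatrix},
$$

for $t = s+1, \ldots, n$. This form follows from $A_t\hat{y}_t = \Sigma_t\varepsilon_t$ by moving the off-diagonal terms of $A_t$ to the right-hand side. We run the simulation smoother for sampling a with the correspondences:

$$
X_t\beta = 0_k, \quad Z_t = \hat{X}_t, \quad G_t = (\Sigma_t, O_{k_a}), \quad T_t = I_{k_a}, \quad H_t = (O_k, \Sigma_a^{1/2}), \quad H_0 = (O_k, \Sigma_{a_0}^{1/2}),
$$

where $k_a$ is the number of rows of $a_t$.

Sample h

As for the stochastic volatility h, we make the inference for $\{h_{jt}\}_{t=s+1}^{n}$ separately for each $j$ ($= 1, \ldots, k$), because we assume that $\Sigma_h$ and $\Sigma_{h_0}$ are diagonal matrices. Let $y_{it}^*$ denote the i-th element of $A_t\hat{y}_t$. Then, we can write:

$$
y_{it}^* = \exp(h_{it}/2)\,\varepsilon_{it}, \qquad t = s+1, \ldots, n,
$$
$$
h_{i,t+1} = h_{it} + \eta_{it}, \qquad t = s, \ldots, n-1,
$$
$$
\begin{pmatrix} \varepsilon_{it} \\ \eta_{it} \end{pmatrix} \sim N\left(0, \begin{pmatrix} 1 & 0 \\ 0 & v_i^2 \end{pmatrix}\right),
$$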

where $\eta_{is} \sim N(0, v_{i0}^2)$, $v_i^2$ and $v_{i0}^2$ are the i-th diagonal elements of $\Sigma_h$ and $\Sigma_{h_0}$, respectively, and $\eta_{it}$ is the i-th element of $u_{ht}$. We sample $(h_{i,s+1}, \ldots, h_{in})$ using the multi-move sampler developed in Appendix A.2.

Sample ω

Sampling $\Sigma_\beta$ from its conditional posterior distribution proceeds in the same way as sampling Σ in the TVP regression model. The diagonal elements of $\Sigma_a$ and $\Sigma_h$ are sampled in the same way as $\sigma_\eta$ in the TVP regression model. When the prior is the inverse Gamma distribution, so is the conditional posterior distribution.

4.3 Literature

Econometric analysis using the VAR model was originally developed by Sims (1980). Numerous studies have been conducted in this context, and the VAR model has become a standard econometric tool in the macroeconomics literature (see e.g., Leeper et al. (1996), Christiano et al. (1999) for broader literature surveys). Since the late 1990s, time-varying components have been incorporated into VAR analysis. A salient analysis using the VAR model with time-varying coefficients was developed by Cogley and Sargent (2001). They estimate a three-variable VAR model (inflation, unemployment and nominal short-term interest rates), focusing on the persistence of inflation and the forecasts of inflation and unemployment for post-war U.S. data. The dynamics of policy activism are also discussed based on their time-varying VAR model. Among the discussions of their results, Sims (2001) and Stock (2001) questioned the assumption of constant variance (a and h in our notation) for the VAR's structural shock, and were concerned that the results for the drifting coefficients of Cogley and Sargent (2001) might be exaggerated due to neglecting a possible variation of the variance.[12] Replying to them, Cogley and Sargent (2005) incorporated stochastic volatility into the VAR model with time-varying coefficients.[13]

Primiceri (2005) proposes the TVP-VAR model, which allows all parameters (β, a, h) to vary over time, and estimates a three-variable VAR model (with the same variables as Cogley and Sargent (2001)) for the U.S. data.[14] The empirical results reveal that the responses of the policy interest rates to inflation and unemployment exhibit a trend toward more aggressive behavior in recent decades, and that this has a negligible effect on the rest of the economy.

After Primiceri's (2005) introduction of the TVP-VAR model, several papers have analyzed the time-varying structure of the macroeconomy in specific ways. Benati and Mumtaz (2005) estimate the TVP-VAR model for the U.K. data by imposing sign restrictions on the impulse responses to assess the source of the "Great Stability" in the U.K. as well as uncertainty for inflation forecasting (see also Benati (2008)). Baumeister et al. (2008) estimate the TVP-VAR model for the Euro area data to assess the effects of excess liquidity shocks on macroeconomic variables. D'Agostino et al. (2008) examine the forecasting performance of the TVP-VAR model relative to other standard VAR models. Nakajima et al. (2009, 2010) estimate the TVP-VAR model for the Japanese macroeconomic data. An increasing number of studies have examined TVP-VAR models to provide empirical evidence on the dynamic structure of the economy (see e.g., Benati and Surico (2008), Mumtaz and Surico (2009), Baumeister and Benati (2010), Clark and Terry (2010)). Given this previous literature, we will show an empirical application of the TVP-VAR model to Japanese data, with emphasis on the role of stochastic volatility in the estimation.

[12] Cogley and Sargent (2005) state "if the world were characterized by constant θ [coefficients of the VAR] and drifting R [variance of the VAR], and we fit an approximating model with constant R and drifting θ, then it seems likely that our estimates of θ would drift to compensate for misspecification of R, thus exaggerating the time variation in θ."

[13] Uhlig (1997) originally developed the VAR model with stochastic volatility.

5 Empirical results for the Japanese economy

As mentioned above, this section applies the TVP-VAR model developed so far to Japanese macroeconomic variables, with emphasis on the role of stochastic volatility in the estimation.[15]

5.1 Data and settings

A three-variable TVP-VAR model is estimated for quarterly data from 1977/1Q to 2007/4Q, thereby examining the time-varying nature of macroeconomic dynamics over the three decades of the sample period.
To that end, two sets of variables are examined: (i) (p, x, b) and (ii) (p, x, i), where p is the inflation rate; x is the output; b is the medium-term interest rate; and i is the short-term interest rate.[16]

[14] In Cogley and Sargent (2005), the simultaneous relations, a, of the structural shock are still assumed to be time-invariant.

[15] Similar analyses of Japanese macroeconomic data are carried out by Nakajima et al. (2009, 2010). See the previous section for the literature on empirical studies of TVP-VAR models using other countries' data.

[16] The inflation rate is taken from the CPI (consumer price index, general excluding fresh food, log-difference, the effects of the increase in the consumption tax removed, and seasonally adjusted). The output gap is a series of deviations of GDP from its potential level, calculated by the Bank of Japan. The medium-term bond