Chapter 5: Univariate time-series analysis
Time-Series

A time-series is a sequence {x_1, x_2, ..., x_T}, or {x_t}, t = 1, ..., T, where t is an index denoting the period in time in which x occurs. We shall treat x_t as a random variable; hence, a time-series is a sequence of random variables ordered in time. Such a sequence is known as a stochastic process. The probability structure of a sequence of random variables is determined by the joint distribution of a stochastic process. The simplest possible probability model for such a joint distribution is:

x_t = α + ε_t,  ε_t ~ n.i.d.(0, σ²_ε),

i.e., x_t is normally and independently distributed over time with constant variance and mean equal to α. In other words, x_t is the sum of a constant and a white-noise process. If a white-noise process were a proper model for financial time-series, forecasting would not be very interesting, as the best forecast for the moments of the relevant time-series would be their unconditional moments.
Better models

The model

x_t = α + ε_t,  ε_t ~ n.i.d.(0, σ²_ε),  α̂ = (1/T) Σ_{t=1}^T x_t,  σ̂²_ε = (1/T) Σ_{t=1}^T (x_t − α̂)²,

reflects the traditional approach to portfolio allocation, but it does not reflect the data: at high frequency the variance is not constant but predictable, while at low frequency returns are persistent and predictable.
Better models

[Figure: US stock returns (1-month) and simulated white noise, 1980-2010.]

While the CER gives a plausible representation for the 1-month returns, the behaviour over time of the YTM of the 10-year bond does not resemble at all that of the simulated data.
ARMA modelling

A more general and more flexible class of models emerges when combinations of ε_t are used to model x_t. We concentrate on a class of models created by taking linear combinations of the white noise, the autoregressive moving-average (ARMA) models:

AR(1): x_t = ρ x_{t−1} + ε_t,
MA(1): x_t = ε_t + θ ε_{t−1},
AR(p): x_t = ρ_1 x_{t−1} + ρ_2 x_{t−2} + ... + ρ_p x_{t−p} + ε_t,
MA(q): x_t = ε_t + θ_1 ε_{t−1} + ... + θ_q ε_{t−q},
ARMA(p, q): x_t = ρ_1 x_{t−1} + ... + ρ_p x_{t−p} + ε_t + θ_1 ε_{t−1} + ... + θ_q ε_{t−q}.
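Since the slides contain no code, here is a minimal NumPy sketch of how draws from these processes can be generated; the function name `simulate_arma` and its interface are my own, not from the chapter:

```python
import numpy as np

def simulate_arma(rho, theta, T, sigma=1.0, seed=0):
    """Simulate x_t = sum_i rho_i x_{t-i} + eps_t + sum_j theta_j eps_{t-j}."""
    rng = np.random.default_rng(seed)
    p, q = len(rho), len(theta)
    m = max(p, q)
    eps = rng.normal(0.0, sigma, T + m)
    x = np.zeros(T + m)
    for t in range(m, T + m):
        ar = sum(rho[i] * x[t - 1 - i] for i in range(p))
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q))
        x[t] = ar + eps[t] + ma
    return x[m:]  # drop the initialization period

# AR(1), MA(1) and ARMA(1,1) draws of length 200
ar1 = simulate_arma([0.9], [], 200)
ma1 = simulate_arma([], [0.5], 200)
arma11 = simulate_arma([0.9], [0.5], 200)
```

With the same shocks, the persistent AR(1) path wanders far more than the MA(1) path, previewing the variance results derived later in the chapter.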
Analysing time-series models

To illustrate empirically all fundamentals we consider a specific member of the ARMA family, the AR(1) model with drift:

x_t = ρ_0 + ρ_1 x_{t−1} + ε_t,  ε_t ~ n.i.d.(0, σ²_ε).  (1)

Given that each realization of our stochastic process is a random variable, the first relevant fundamental is the density of each observation. In particular, we distinguish between conditional and unconditional densities.
Conditional and Unconditional Densities

The unconditional density is obtained under the hypothesis that no observation on the time-series is available, while conditional densities are based on the observation of some realization of random variables. In the case of time-series, we derive the unconditional density by putting ourselves at the moment preceding the observation of any realization of the time-series. At that moment the information set contains only the knowledge of the process generating the observations. As observations become available, we can compute conditional densities.
Conditional Densities

Consider the AR(1) model. The moments of the density of x_t conditional upon x_{t−1} are immediately obtained from the relevant process:

E(x_t | x_{t−1}) = ρ_0 + ρ_1 x_{t−1},
Var(x_t | x_{t−1}) = σ²_ε,
Cov[(x_t | x_{t−1}), (x_{t−j} | x_{t−j−1})] = 0 for each j.

To derive the moments of the density of x_t conditional upon x_{t−2}, we need to substitute x_{t−1} out using (1):

E(x_t | x_{t−2}) = ρ_0 + ρ_0 ρ_1 + ρ_1² x_{t−2},
Var(x_t | x_{t−2}) = σ²_ε (1 + ρ_1²),
Cov[(x_t | x_{t−2}), (x_{t−j} | x_{t−j−2})] = ρ_1 σ²_ε, for j = 1,
Cov[(x_t | x_{t−2}), (x_{t−j} | x_{t−j−2})] = 0, for j > 1.
Unconditional Densities

Unconditional moments are derived by substituting recursively from (1) to express x_t as a function of information available at time t_0, the moment before we start observing realizations of our process:

E(x_t) = ρ_0 (1 + ρ_1 + ρ_1² + ... + ρ_1^{t−1}) + ρ_1^t x_0,
Var(x_t) = σ²_ε (1 + ρ_1² + ρ_1⁴ + ... + ρ_1^{2(t−1)}),
γ(j) = Cov(x_t, x_{t−j}) = ρ_1^j Var(x_t),
ρ(j) = Cov(x_t, x_{t−j}) / √(Var(x_t) Var(x_{t−j})) = ρ_1^j Var(x_t) / √(Var(x_t) Var(x_{t−j})).

Note that γ(j) and ρ(j) are functions of j, known respectively as the autocovariance function and the autocorrelation function.
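The finite-t sums above can be checked numerically; this small sketch (my own illustration, not part of the slides) iterates the geometric sums for the mean and variance and compares them with their closed-form values:

```python
import numpy as np

rho0, rho1, sigma2, x0, t = 0.5, 0.8, 1.0, 2.0, 25

# Iterate the recursions: E(x_t) and Var(x_t) as finite geometric sums
mean = rho1**t * x0 + rho0 * sum(rho1**i for i in range(t))
var = sigma2 * sum(rho1**(2 * i) for i in range(t))

# Closed-form finite geometric sums
mean_cf = rho1**t * x0 + rho0 * (1 - rho1**t) / (1 - rho1)
var_cf = sigma2 * (1 - rho1**(2 * t)) / (1 - rho1**2)
```

As t grows with |ρ_1| < 1, both expressions converge to the stationary values ρ_0/(1 − ρ_1) and σ²_ε/(1 − ρ_1²) derived on the next slides.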
Stationarity

A stochastic process is strictly stationary if its joint density function does not depend on time. More formally, a stochastic process is stationary if, for each j_1, j_2, ..., j_n, the joint distribution f(x_t, x_{t+j_1}, x_{t+j_2}, ..., x_{t+j_n}) does not depend on t.

A stochastic process is covariance stationary if its first two unconditional moments do not depend on time, i.e. if the following relations are satisfied for each h, i, j:

E(x_t) = E(x_{t+h}) = μ,
E(x_t²) = E(x_{t+h}²) = μ₂,
E(x_{t+i} x_{t+j}) = μ_{ij}.
Stationarity

In the case of our AR(1) process, the condition for stationarity is |ρ_1| < 1. When such a condition is satisfied, we have:

E(x_t) = E(x_{t+h}) = ρ_0 / (1 − ρ_1),
Var(x_t) = Var(x_{t+h}) = σ²_ε / (1 − ρ_1²),
Cov(x_t, x_{t−j}) = ρ_1^j Var(x_t).

On the other hand, when |ρ_1| = 1, the process is obviously non-stationary:

E(x_t) = ρ_0 t + x_0,
Var(x_t) = σ²_ε t,
Cov(x_t, x_{t−j}) = σ²_ε (t − j).
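A quick simulation (an illustration of mine, not from the slides) makes the contrast tangible: with |ρ_1| < 1 the cross-sectional variance of many simulated paths settles near σ²_ε/(1 − ρ_1²), while with ρ_1 = 1 it grows linearly with t:

```python
import numpy as np

rng = np.random.default_rng(42)
N, T = 5000, 200  # number of paths, periods per path
eps = rng.normal(size=(N, T))

def paths(rho):
    """Simulate N paths of x_t = rho * x_{t-1} + eps_t starting from zero."""
    x = np.zeros((N, T))
    for t in range(1, T):
        x[:, t] = rho * x[:, t - 1] + eps[:, t]
    return x

var_stat = paths(0.9)[:, -1].var()  # near 1 / (1 - 0.81), about 5.3
var_rw = paths(1.0)[:, -1].var()    # near T, keeps growing with the horizon
```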
General ARMA processes

The Wold decomposition theorem warrants that any stationary stochastic process can be expressed as the sum of a deterministic and a stochastic moving-average component:

x_t = ε_t + b_1 ε_{t−1} + b_2 ε_{t−2} + ... + b_n ε_{t−n} = (1 + b_1 L + b_2 L² + ... + b_n L^n) ε_t = b(L) ε_t.

Representing the polynomial b(L) as the ratio of two polynomials of lower order gives:

x_t = b(L) ε_t = [a(L)/c(L)] ε_t,  c(L) x_t = a(L) ε_t.  (2)

This is an ARMA process. Stationarity requires that the roots of c(L) lie outside the unit circle; invertibility of the MA component requires that the roots of a(L) lie outside the unit circle.
General ARMA processes

Consider the simplest case, the ARMA(1,1) process:

x_t = c_1 x_{t−1} + ε_t + a_1 ε_{t−1},  (1 − c_1 L) x_t = (1 + a_1 L) ε_t.

The above equation is equivalent to:

x_t = [(1 + a_1 L)/(1 − c_1 L)] ε_t
    = (1 + a_1 L)(1 + c_1 L + (c_1 L)² + ...) ε_t
    = [1 + (a_1 + c_1) L + c_1 (a_1 + c_1) L² + c_1² (a_1 + c_1) L³ + ...] ε_t,

which shows that the ratio of two finite lag polynomials allows us to model an infinite lag polynomial.
General ARMA processes

We then have:

Var(x_t) = [1 + (a_1 + c_1)² + c_1² (a_1 + c_1)² + ...] σ²_ε = [1 + (a_1 + c_1)² / (1 − c_1²)] σ²_ε,

Cov(x_t, x_{t−1}) = [(a_1 + c_1) + c_1 (a_1 + c_1)² + c_1³ (a_1 + c_1)² + ...] σ²_ε = [(a_1 + c_1) + c_1 (a_1 + c_1)² / (1 − c_1²)] σ²_ε.

Hence,

ρ(1) = Cov(x_t, x_{t−1}) / Var(x_t) = (1 + a_1 c_1)(a_1 + c_1) / (1 + a_1² + 2 a_1 c_1).
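The closed-form ρ(1) can be checked against a long simulated ARMA(1,1) path; the following sketch (illustrative only, not from the chapter) does this in plain NumPy:

```python
import numpy as np

a1, c1 = 0.4, 0.7
rho1_theory = (1 + a1 * c1) * (a1 + c1) / (1 + a1**2 + 2 * a1 * c1)

rng = np.random.default_rng(0)
T = 500_000  # long sample so the sample autocorrelation is precise
eps = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = c1 * x[t - 1] + eps[t] + a1 * eps[t - 1]

# First-order sample autocorrelation
rho1_sim = np.corrcoef(x[1:], x[:-1])[0, 1]
```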
General ARMA processes

For example, suppose c(L) x_t = a(L) ε_t and you want to find x_t = d(L) ε_t. The parameters in d(L) are most easily found by writing c(L) d(L) = a(L) and matching terms in L^j. For an illustration suppose

a(L) = 1 + a_1 L,  c(L) = 1 + c_1 L.

Multiplying out d(L) we have

(1 + c_1 L)(1 + d_1 L + d_2 L² + ... + d_n L^n) = 1 + a_1 L.

Matching powers of L:

d_1 + c_1 = a_1  ⇒  d_1 = a_1 − c_1,
c_1 d_1 + d_2 = 0,
c_1 d_2 + d_3 = 0,
...
c_1 d_{n−1} + d_n = 0,

so that

x_t = ε_t + (a_1 − c_1) ε_{t−1} − c_1 (a_1 − c_1) ε_{t−2} + ... + (−c_1)^{n−1} (a_1 − c_1) ε_{t−n}.
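The matching-powers recursion is easy to mechanize; in this sketch (the function name is my own) the coefficients d_j are generated recursively and checked by re-convolving with c(L):

```python
import numpy as np

def expand_d(a1, c1, n):
    """Coefficients d_0..d_n of d(L) = a(L)/c(L), with a(L) = 1 + a1*L and
    c(L) = 1 + c1*L, obtained by matching powers of L in c(L) d(L) = a(L)."""
    d = [1.0, a1 - c1]         # d_0 = 1, d_1 = a_1 - c_1
    for _ in range(2, n + 1):
        d.append(-c1 * d[-1])  # c_1 d_{j-1} + d_j = 0  =>  d_j = -c_1 d_{j-1}
    return np.array(d)

d = expand_d(0.5, 0.3, 5)
# Re-convolving with c(L) should recover a(L) = 1 + 0.5 L (zeros beyond L^1)
recon = np.convolve([1.0, 0.3], d)[:6]
```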
Persistence and the linear model

Persistence of time-series destroys one of the crucial properties for implementing valid estimation and inference in the linear model

y = Xβ + ε.

The following property is required to implement valid estimation and inference:

E(ε | X) = 0.  (3)

Hypothesis (3) implies that E(ε_i | x_1, ..., x_i, ..., x_n) = 0, (i = 1, ..., n). Think of the simplest time-series model for a generic variable y:

y_t = a_0 + a_1 y_{t−1} + ε_t.

Clearly, if a_1 ≠ 0, then, although it is still true that E(ε_t | y_{t−1}) = 0, we have E(ε_{t−1} | y_{t−1}) ≠ 0 and (3) breaks down.
How serious is the problem?

To assess intuitively the consequences of persistence, we construct a small Monte-Carlo simulation of the short-sample properties of the OLS estimator of the parameters in an AR(1) process. A Monte-Carlo simulation is based on the generation of a sample from a known data generating process (DGP). First we generate a set of random numbers from a given distribution (here a normally independent white-noise disturbance) for a sample size of interest (say 200 observations) and then construct the process of interest (in our case, an AR(1) process). Once a sample of observations on the process of interest is available, we can estimate the relevant parameters and compare the fitted values with the known true values: the Monte-Carlo simulation is a sort of controlled experiment. To overcome the dependence of the results on the particular sequence of simulated white-noise residuals, the DGP is replicated many times.
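A stripped-down version of such an experiment can be sketched as follows (my own illustration; for simplicity it drops the drift term and estimates a_1 by OLS without an intercept):

```python
import numpy as np

def mc_ar1_bias(a1=0.9, T=20, reps=2000, seed=1):
    """Average OLS estimate of a1 across Monte-Carlo replications of an AR(1) DGP."""
    rng = np.random.default_rng(seed)
    est = np.empty(reps)
    for r in range(reps):
        eps = rng.normal(size=T + 50)
        y = np.zeros(T + 50)
        for t in range(1, T + 50):
            y[t] = a1 * y[t - 1] + eps[t]
        y = y[50:]                    # drop burn-in observations
        x, z = y[:-1], y[1:]
        est[r] = (x @ z) / (x @ x)    # OLS slope without intercept
    return est.mean()

small = mc_ar1_bias(T=20)              # noticeably below the true 0.9
large = mc_ar1_bias(T=400, reps=500)   # close to the true 0.9
```

The averages reproduce the pattern in the figure on the next slide: a marked downward bias in small samples that shrinks as T grows.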
How serious is the problem?

We report the averages across replications in the following figure.

[Figure: Small sample bias — mean OLS estimate of a_1 (A1MEAN) against the true value (TRUEA1), for sample sizes from 20 to 200.]

From the figure we note that the estimate of a_1 is heavily biased in small samples, but the bias decreases as the sample gets larger and eventually disappears. One can also show analytically that the OLS estimate of a_1 is biased downward in small samples.
The Maximum Likelihood Method

The likelihood function is the joint probability distribution of the data, treated as a function of the unknown coefficients. The maximum likelihood estimator (MLE) consists of the values of the coefficients that maximize the likelihood function. The MLE selects the values of the parameters that maximize the probability of drawing the data that have effectively been observed.
MLE of an MA process

Consider an MA(1) process for a return r_{t+1}:

r_{t+1} = θ_0 + ε_{t+1} + θ_1 ε_t.

The time series of the residuals can be computed recursively as

ε_{t+1} = r_{t+1} − θ_0 − θ_1 ε_t,  ε_0 = 0.

If ε_{t+1} is normally distributed, then we have

f(ε_{t+1}) = (2π σ²_ε)^{−1/2} exp(−ε²_{t+1} / (2σ²_ε)).
MLE of an MA process

If the ε_{t+1} are independent over time, the likelihood function can be written as follows:

f(ε_1, ε_2, ..., ε_T) = Π_{i=1}^T f(ε_i) = Π_{i=1}^T (2π σ²_ε)^{−1/2} exp(−ε_i² / (2σ²_ε)).

The MLE chooses θ_0, θ_1, σ²_ε to maximize the probability that the estimated model has generated the observed data-set. The optimum cannot always be found analytically; iterative search is the standard method.
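A hedged sketch of this conditional MLE for the MA(1) (illustrative only; θ_0 and σ²_ε are held at their true values so that only θ_1 is searched over a grid, in the spirit of the iterative search mentioned above):

```python
import numpy as np

def neg_loglik(theta0, theta1, sigma2, r):
    """Negative Gaussian log-likelihood of an MA(1); residuals built
    recursively with the initialization eps_0 = 0."""
    eps = np.zeros(len(r) + 1)
    for t in range(len(r)):
        eps[t + 1] = r[t] - theta0 - theta1 * eps[t]
    e = eps[1:]
    return 0.5 * len(r) * np.log(2 * np.pi * sigma2) + (e @ e) / (2 * sigma2)

# Simulate data from a known MA(1) and recover theta_1 by grid search
rng = np.random.default_rng(7)
T, theta1_true = 2000, 0.5
u = rng.normal(size=T + 1)
r = u[1:] + theta1_true * u[:-1]

grid = np.linspace(-0.9, 0.9, 181)
theta1_hat = grid[np.argmin([neg_loglik(0.0, th, 1.0, r) for th in grid])]
```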
MLE of an AR process

Consider a vector x_t containing observations on time-series variables at time t. A sample of T time-series observations on all the variables is represented as:

X_T^1 = (x_1, ..., x_T)′.

In general, estimation is performed by considering the joint sample density function, known also as the likelihood function, which can be expressed as D(X_T^1 | X_0, θ). The likelihood function is defined on the parameter space Θ, given the observed sample X_T^1 and a set of initial conditions X_0. One can interpret such initial conditions as the pre-sample observations on the relevant variables (which are usually unavailable).
MLE of an AR process

In the case of independent observations the likelihood function can be written as the product of the density functions for each observation. However, this is not the relevant case for time-series, as time-series observations are in general sequentially correlated. In the case of time-series, the sample density is constructed using the concept of sequential conditioning. The likelihood function, conditioned with respect to initial conditions, can always be written as the product of a marginal density and a conditional density:

D(X_T^1 | X_0, θ) = D(x_1 | X_0, θ) D(X_T^2 | X_1, θ).

Obviously,

D(X_T^2 | X_1, θ) = D(x_2 | X_1, θ) D(X_T^3 | X_2, θ),

and, by recursive substitution:

D(X_T^1 | X_0, θ) = Π_{t=1}^T D(x_t | X_{t−1}, θ).
MLE of an AR process

Having obtained D(X_T^1 | X_0, θ), we could in theory derive D(X_T^1, θ) by integrating the density conditional on pre-sample observations with respect to X_0. In practice this could be analytically intractable, as D(X_0) is not known. The hypothesis of stationarity becomes crucial at this stage, as stationarity restricts the memory of time-series and limits the effects of pre-sample observations to the first observations in the sample. This is why, in the case of stationary processes, one can simply ignore initial conditions. Clearly, the larger the sample, the better, as the weight of the lost information becomes smaller. Moreover, note that even by omitting initial conditions, we have:

D(X_T^1 | X_0, θ) = D(x_1 | X_0, θ) Π_{t=2}^T D(x_t | X_{t−1}, θ).

Therefore, the likelihood function is separated into the product of T − 1 conditional distributions and one unconditional distribution. In the case of non-stationarity, the unconditional distribution is undefined, while in the case of stationarity the DGP is completely characterized by the likelihood function.
MLE of an AR process

To give more empirical content to our case, let us consider again the case of the univariate first-order autoregressive process:

x_t | X_{t−1} ~ N(λ x_{t−1}, σ²),  (4)

D(X_T^1 | λ, σ²) = D(x_1 | λ, σ²) Π_{t=2}^T D(x_t | x_{t−1}, λ, σ²).  (5)

From (5), the likelihood function clearly involves T − 1 conditional densities and one unconditional density. The conditional densities are given by (4); the unconditional density can be derived only in the case of stationarity:

x_t = λ x_{t−1} + u_t,  u_t ~ N.I.D.(0, σ²).
MLE of an AR process

We can obtain by recursive substitution:

x_t = u_t + λ u_{t−1} + ... + λ^{t−1} u_1 + λ^t x_0.

Only if |λ| < 1 does the effect of the initial condition disappear, and we can write the unconditional density of x_t as:

D(x_t | λ, σ²) = N(0, σ² / (1 − λ²)).

Under stationarity we can derive the exact likelihood function:

D(X_T^1 | λ, σ²) = (2π)^{−T/2} σ^{−T} (1 − λ²)^{1/2} exp[−(1/(2σ²)) ((1 − λ²) x_1² + Σ_{t=2}^T (x_t − λ x_{t−1})²)],

and estimates of the parameters of interest are derived by maximizing this function. Note that λ̂ cannot be derived analytically using the exact likelihood function; one either conditions the likelihood on the first observation or operates a grid search.
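The exact likelihood and the grid search over λ can be sketched as follows (my own illustration; σ² is held fixed at its true value to keep the search one-dimensional):

```python
import numpy as np

def exact_loglik(lam, sigma2, x):
    """Exact Gaussian log-likelihood of a stationary AR(1) without drift,
    including the unconditional density of the first observation."""
    T = len(x)
    ll = -0.5 * T * np.log(2 * np.pi * sigma2) + 0.5 * np.log(1 - lam**2)
    ll -= ((1 - lam**2) * x[0]**2 + np.sum((x[1:] - lam * x[:-1])**2)) / (2 * sigma2)
    return ll

# Simulate a stationary AR(1), drawing x_0 from its unconditional density
rng = np.random.default_rng(3)
T, lam_true = 1000, 0.6
x = np.zeros(T)
x[0] = rng.normal(0, np.sqrt(1 / (1 - lam_true**2)))
for t in range(1, T):
    x[t] = lam_true * x[t - 1] + rng.normal()

grid = np.linspace(-0.95, 0.95, 191)
lam_hat = grid[np.argmax([exact_loglik(l, 1.0, x) for l in grid])]
```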
Putting ARMA models at work

There are four main steps in the Box-Jenkins approach:

PRE-WHITENING: make sure that the time series is stationary.

MODEL SELECTION: information criteria are a useful tool to this end. The Akaike information criterion (AIC) and the Schwarz Bayesian criterion (SBC) are the most commonly used:

AIC = −2 log(L) + 2(p + q),
SBC = −2 log(L) + log(n)(p + q).

MODEL CHECKING: residual tests. Make sure that residuals are not autocorrelated and check whether their distribution is normal; ex-post evaluation techniques based on RMSE and MAE are also implemented (Diebold-Mariano, Giacomini-White).

FORECASTING: after estimation of the parameters, the selected model is typically simulated forward to produce forecasts for the variables of interest at the relevant horizon.
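As an illustration of the model-selection step (a sketch of mine, not from the slides: AR(p) candidates fitted by conditional least squares, with both criteria computed from the Gaussian log-likelihood):

```python
import numpy as np

def fit_ar_ic(x, p):
    """Conditional least-squares fit of an AR(p) without intercept;
    returns (AIC, SBC) computed from the Gaussian log-likelihood."""
    T = len(x) - p
    X = np.column_stack([x[p - i - 1:len(x) - i - 1] for i in range(p)])
    y = x[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / T
    logl = -0.5 * T * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * p - 2 * logl, np.log(T) * p - 2 * logl  # AIC, SBC

# Data generated from an AR(1): the criteria should favour a small p
rng = np.random.default_rng(11)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + rng.normal()

ics = {p: fit_ar_ic(x, p) for p in (1, 2, 3)}
best_sbc = min(ics, key=lambda p: ics[p][1])
```

Because log(T) > 2 for any realistic sample, the SBC penalizes extra parameters more heavily than the AIC and tends to select more parsimonious models.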
An Illustration

To illustrate how ARMA models can be put to work, consider the case of forecasting the returns on portfolio 15 of the 25 Fama-French portfolios.