Financial Econometrics (FinMetrics04)
Time-series Statistics Concepts
Exploratory Data Analysis
Testing for Normality
Empirical VaR

Nelson Mark
University of Notre Dame
Fall 2017
September 11, 2017

Introduction

The parametric VaR analysis assumed that returns are normally distributed, and we estimated the mean and standard deviation from a time series (the returns). In this segment, we first probe a bit deeper into the statistical meaning of these estimated parameters. Next, we cover some basic procedures in exploratory data analysis. Third, we evaluate whether the normality assumption is a good one.

Concepts covered in this section

1. What is the meaning of a time series? What are time-series moments? (Note: the $k$-th moment of a random variable $X$, or of the distribution of $X$, is $E(X^k)$.)
2. Relation between sample moments and theoretical moments
3. Data generating mechanisms

4. Stationary and nonstationary distributions (stationary and nonstationary time series)
5. Ergodicity
6. Testing methodology
7. Exploratory data analysis
8. Tests for normality

Properties of Normal Distribution

We want to determine whether normality is a good model. To do so, we should be familiar with its properties. Let's quickly review some properties of the normal.

Standard normal:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \tag{1}$$

Normal:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \tag{2}$$

1. The distribution is symmetric around $\mu$ (mean, location).
2. Dispersion is regulated by $\sigma$ (scale). The standard deviation is used as a measure of volatility. How is it a measure of scale? If $X$ is household income in dollars, then $100X$ is household income in cents, and

$$\mathrm{Var}(100X) = 100^2\,\mathrm{Var}(X) = 10^4\,\mathrm{Var}(X) \tag{3}$$

   so the standard deviation is multiplied by 100, exactly like the units.
3. Tail probabilities converge to 0 at a well-defined rate. Loosely speaking, normal tail probabilities converge to 0 quickly (even though it is possible to have realizations that are arbitrarily large or small).

Conclusion: Assessments of normality involve checking for distributional symmetry and appropriate tail thickness. How do we do that? Through examination of sample moments.
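As a quick numerical check of the scale property in equation (3), the following Python sketch multiplies a set of simulated incomes by 100 and compares variances and standard deviations. The income figures are made-up illustrative values (an assumption, not course data), and only the Python standard library is used.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical household incomes in dollars (simulated, for illustration only).
income_dollars = [random.gauss(60_000, 15_000) for _ in range(1_000)]
# The same incomes expressed in cents.
income_cents = [100 * x for x in income_dollars]

# The variance scales with the square of the change in units ...
var_ratio = statistics.variance(income_cents) / statistics.variance(income_dollars)
# ... while the standard deviation scales one-for-one with the units.
sd_ratio = statistics.stdev(income_cents) / statistics.stdev(income_dollars)

print(f"Var(100X) / Var(X) = {var_ratio:.1f}")
print(f"SD(100X)  / SD(X)  = {sd_ratio:.1f}")
```

The variance ratio is 100 squared, while the standard deviation ratio is 100, which is why the standard deviation (and not the variance) is read as a measure of scale in the units of the data.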

Sample and Theoretical Moments

The $k$-th theoretical moment of a distribution (the $k$-th theoretical moment of a random variable $X$) is

$$E\left(X^k\right) \tag{4}$$

The $k$-th central moment is

$$E\left[(X - \mu)^k\right] \tag{5}$$

where $\mu$ is the first moment, $\mu = E(X)$.

Sample moments are the sample counterparts. Let $\{x_t\}_{t=1}^{T}$ be a sequence of time-series observations (e.g., returns). Some of these moments have names.

Mean and variance

Mean:

$$\mu = E(X_t) \tag{6}$$

Sample mean:

$$\bar{X}_T = \frac{1}{T}\sum_{t=1}^{T} x_t \tag{7}$$

Variance:

$$\sigma^2 = E\left[(X_t - \mu)^2\right] \tag{8}$$

Sample variance:

$$\hat{\sigma}_T^2 = \frac{1}{T-1}\sum_{t=1}^{T}\left(x_t - \bar{X}_T\right)^2 \tag{9}$$

The third moment gets at symmetry versus asymmetry. The theoretical measure is skewness, followed here by its sample counterpart.

Theoretical skewness:

$$\frac{E\left[(X_t - \mu)^3\right]}{\sigma^3} \tag{10}$$

Sample skewness:

$$sk_T = \frac{\frac{1}{T-1}\sum_{t=1}^{T}\left(x_t - \bar{X}_T\right)^3}{\hat{\sigma}_T^3} \tag{11}$$

The skewness measure is zero for the normal distribution. It is zero for all symmetric distributions.
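The sample formulas in equations (7), (9) and (11) translate directly into code. The sketch below is a minimal Python illustration, not the course's Matlab code; the function name `sample_moments` and the two small data sets are hypothetical, chosen only to show a symmetric and a right-skewed case. It follows the $T-1$ divisor used in these notes, whereas statistical packages may use slightly different divisors.

```python
def sample_moments(x):
    """Return (mean, variance, skewness) following equations (7), (9), (11)."""
    T = len(x)
    if T < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / T                                   # equation (7)
    var = sum((v - mean) ** 2 for v in x) / (T - 1)     # equation (9)
    third = sum((v - mean) ** 3 for v in x) / (T - 1)   # numerator of (11)
    skew = third / var ** 1.5                           # sigma_hat^3 = (sigma_hat^2)^(3/2)
    return mean, var, skew

# Illustrative data (made up): one symmetric sample, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 1, 10]

mean_s, var_s, skew_s = sample_moments(symmetric)
mean_r, var_r, skew_r = sample_moments(right_skewed)

print(f"symmetric:    mean={mean_s:.2f} var={var_s:.2f} skew={skew_s:.2f}")
print(f"right-skewed: mean={mean_r:.2f} var={var_r:.2f} skew={skew_r:.2f}")
```

The symmetric sample has zero skewness because positive and negative cubed deviations cancel; the single large observation in the second sample makes its skewness positive.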

Figure 1: Distributions with differing kurtosis

The fourth moment gets at tail thickness. The theoretical measure is kurtosis.

Theoretical kurtosis:

$$\frac{E\left[(X_t - \mu)^4\right]}{\sigma^4} \tag{12}$$

Sample kurtosis:

$$kurt_T = \frac{\frac{1}{T-1}\sum_{t=1}^{T}\left(x_t - \bar{X}_T\right)^4}{\hat{\sigma}_T^4} \tag{13}$$

Kurtosis for the normal is 3. A distribution has excess kurtosis if the measure exceeds 3. These are fat-tailed distributions: there is a higher probability of extreme events than predicted by the normal. In applications, pay attention to whether the software computes kurtosis or excess kurtosis. Excess kurtosis subtracts 3 from the kurtosis measure. (Matlab computes kurtosis.)

Convergence concepts

We want to compare the sample moments to the theoretical moments. For this to make sense, the sample moments need to converge to the theoretical moments.
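To see what convergence of a sample moment looks like in practice, the following Python sketch computes the sample kurtosis of equation (13) on simulated i.i.d. standard normal draws for increasing sample sizes $T$. The simulated data, the seed, and the function name `sample_kurtosis` are assumptions made for illustration; the theoretical value being approached is the normal kurtosis of 3.

```python
import random

def sample_kurtosis(x):
    """Sample kurtosis as in equation (13): fourth central moment over sigma_hat^4."""
    T = len(x)
    mean = sum(x) / T
    var = sum((v - mean) ** 2 for v in x) / (T - 1)
    fourth = sum((v - mean) ** 4 for v in x) / (T - 1)
    return fourth / var ** 2

random.seed(1)  # fixed seed so the illustration is reproducible
draws = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Recompute the statistic on longer and longer stretches of the same realization.
kurt_by_T = {}
for T in (100, 1_000, 10_000, 100_000):
    kurt_by_T[T] = sample_kurtosis(draws[:T])
    print(f"T = {T:>7,d}   sample kurtosis = {kurt_by_T[T]:.3f}")
```

With small $T$ the estimate can be noticeably away from 3 even though the data are exactly normal, so a comparison of sample and theoretical moments is only informative when the sample is reasonably long.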

The concept of theoretical moments for a time series refers to a hypothetical cross-sectional distribution at a particular time $t$. Imagine running the process (i.e., running the world) over and over many times.

Figure 2: Time-series concepts

But the sample moments are computed in the time dimension, for a single realization of the process. There is a disconnect. The two are brought together by two concepts.

1. Stationarity. Strict stationarity says the distribution of $X_t$ is the same for all $t$, so the distribution of $X_t$ is the same as that of $X_{t+1}$, etc. A less restrictive form is covariance stationarity. This says the covariance between $X_t$ and $X_{t-s}$ is the same for all $t$ (it depends only on the lag $s$).
2. Ergodicity. A time series is ergodic if the sample moment converges to the theoretical cross-sectional moment as $T \to \infty$.

If a time series is stationary and ergodic, then the sample moments (computed from a single realization of the time series) converge to the theoretical moments as the sample size $T$ goes to infinity.

Wait! If a time series is strictly stationary, won't it be ergodic? No! Here is an example of a stationary sequence that is not ergodic. We have two coins. One is fair. The other has heads on both sides. We flip the coins, with Heads = 1 and Tails = 0.

Begin by flipping the fair coin once. If you get heads, you then generate the subsequent sequence of flips with the fair coin; if you get tails, you generate the subsequent sequence with the two-headed coin. The expected value of any observation is

$$\frac{1}{2}\left(\frac{1}{2}\right) + \frac{1}{2}(1) = \frac{3}{4}$$

The sample mean will settle at either $\frac{1}{2}$ (fair coin) or 1 (two-headed coin), neither of which is $\frac{3}{4}$.

Statistical Testing Methodology

The classical hypothesis testing methodology is due to R.A. Fisher.

1. Assume the null hypothesis is true (e.g., $\beta = 0$).
2. Determine the sampling distribution of your test statistic under the null hypothesis (e.g., the t-statistic follows a Student-t distribution in small samples and $N(0,1)$ in larger samples).
3. Ask whether the observed test statistic, computed from the data, could reasonably have been drawn from the null distribution. If the answer is yes, the data are consistent with the null and you cannot reject the null hypothesis. If the answer is no, you can reject the null.
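The three steps of the testing methodology can be sketched in a few lines of Python. The example below tests the null hypothesis that a mean is zero using the large-sample $N(0,1)$ approximation for the t-statistic; the function name `t_test_mean_zero`, the 5% significance level, and the two small deterministic samples are assumptions chosen for illustration, not part of the course material.

```python
import math
from statistics import NormalDist, mean, stdev

def t_test_mean_zero(x, alpha=0.05):
    """Two-sided test of H0: E[X] = 0 using the large-sample N(0, 1) approximation."""
    T = len(x)
    # Step 1 is implicit: the statistic is built as if H0 (mean zero) were true.
    t_stat = mean(x) / (stdev(x) / math.sqrt(T))
    # Step 2: under H0 the statistic is approximately N(0, 1) in large samples.
    null_dist = NormalDist()
    critical = null_dist.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - null_dist.cdf(abs(t_stat)))
    # Step 3: could the observed statistic reasonably come from the null distribution?
    return t_stat, p_value, abs(t_stat) > critical

# Illustrative samples (made up): one centered on 0, one shifted up by 1.
centered = [-2, -1, 0, 1, 2] * 20
shifted = [v + 1 for v in centered]

t0, p0, reject0 = t_test_mean_zero(centered)
t1, p1, reject1 = t_test_mean_zero(shifted)

print(f"centered: t = {t0:.2f}, p = {p0:.3f}, reject H0: {reject0}")
print(f"shifted:  t = {t1:.2f}, p = {p1:.3g}, reject H0: {reject1}")
```

The centered sample gives a t-statistic of zero, so the data are consistent with the null; the shifted sample gives a statistic far in the tail of the null distribution, so the null is rejected.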

Figure 3: Testing a hypothesis

Popular tests for normality

1. Jarque-Bera test. The Jarque-Bera statistic measures the difference between the skewness and kurtosis in the data and those of the normal distribution.

$$\text{Jarque-Bera} = \frac{T}{6}\left(sk_T^2 + \frac{\left(kurt_T - 3\right)^2}{4}\right) \tag{14}$$

   where $sk_T$ is sample skewness and $kurt_T$ is sample kurtosis. If the data are drawn from the normal, the statistic has (asymptotically) a $\chi^2_2$ distribution.
2. Kolmogorov-Smirnov test. This measures the deviation between the empirical CDF and the normal CDF.

Figure 4: KS test

Let $F(r)$ be the normal CDF of returns $r$, and let $F_T(r)$ be the empirical CDF. Then the Kolmogorov-Smirnov statistic is

$$KS = \sup_r \left|F_T(r) - F(r)\right| \tag{15}$$

The KS statistic has an asymptotic distribution, so we can use it to test the null hypothesis that the data are normal. (Details omitted.)

Quantile-based moment estimators

If the distribution is ill-behaved (i.e., skewed, fat-tailed), sample moments may not provide the right information. Quantile-based moment estimators provide a robust alternative.

1. Location: Median. Rank the observations from low to high. The median is a number such that half the observations lie above it and half lie below. The median is not necessarily unique.
2. Dispersion: Interquartile range. Matlab command: iqr(x), where x is a vector of observations. Example:

$$\overbrace{2,\ 4,\ 5}^{Q_1},\ \overbrace{7,\ 8,\ \underbrace{9}_{\text{median}}}^{Q_2},\ \overbrace{10,\ 12,\ 15}^{Q_3},\ \overbrace{16,\ 18}^{Q_4} \tag{16}$$

$$m = 9 \tag{17}$$
$$Q_1 = 5 \tag{18}$$
$$Q_2 = 9 \tag{19}$$
$$Q_3 = 15 \tag{20}$$
$$Q_4 = 18 \tag{21}$$
$$IQR = Q_3 - Q_1 = 10 \tag{22}$$

3. FYI, although we are not going to use them, there are also quantile-based measures of skewness (e.g., the Bowley measure) and kurtosis.

Exploratory Data Analysis and Testing for Normality

1. Plot the price data, and stare at it.
   - Are there outliers? If so, double-check for data entry errors.
   - Is there evidence of a structural break or regime change? Split adjustment problems, etc.?
   - Volatility clustering?
2. Repeat for returns.
3. Generate a histogram of returns. This is a blunt estimator of the empirical distribution. Determining the optimal bin size is not trivial. Overlay with the normal distribution.
4. Kernel density estimate. Fits a curve to the histogram.
5. Empirical CDF (cumulative distribution function) plots. The CDF is

$$F(x) = \int_{-\infty}^{x} f(z)\,dz = \Pr(z \le x) \tag{23}$$

   As a visual aid, it is easier to see deviations from the normal with CDF plots than with the kernel density.

Figure 5: Density and Cumulative Density

6. QQ plots (quantile-quantile plots). A QQ plot displays the sample quantiles of returns against the theoretical quantiles of a normal distribution. If the distribution of the observations is normal, the plot will be close to linear.
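Alongside the graphical checks listed above, the two formal tests described earlier, the Jarque-Bera statistic of equation (14) and the Kolmogorov-Smirnov distance of equation (15), can be computed with a few lines of code. What follows is a minimal Python sketch under stated assumptions, not the course's Matlab script: the function names are hypothetical, the data are simulated (a normal sample and a fat-tailed equal-weight mixture of two normals with assumed standard deviations 0.5 and 2), and the KS comparison uses a normal CDF fitted with the sample mean and standard deviation.

```python
import math
import random
from statistics import NormalDist

def jarque_bera(x):
    """Jarque-Bera statistic of equation (14) and its chi-square(2) p-value."""
    T = len(x)
    mean = sum(x) / T
    var = sum((v - mean) ** 2 for v in x) / (T - 1)
    sk = sum((v - mean) ** 3 for v in x) / (T - 1) / var ** 1.5
    kurt = sum((v - mean) ** 4 for v in x) / (T - 1) / var ** 2
    jb = T / 6 * (sk ** 2 + (kurt - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)  # survival function of a chi-square with 2 d.o.f.
    return jb, p_value

def ks_distance(x):
    """Largest gap between the empirical CDF and a normal CDF fitted to x."""
    xs = sorted(x)
    T = len(xs)
    mean = sum(xs) / T
    sd = math.sqrt(sum((v - mean) ** 2 for v in xs) / (T - 1))
    fitted = NormalDist(mean, sd)
    d = 0.0
    for i, v in enumerate(xs):
        F = fitted.cdf(v)
        # The empirical CDF jumps from i/T to (i+1)/T at v, so check both sides.
        d = max(d, (i + 1) / T - F, F - i / T)
    return d

random.seed(2)  # fixed seed so the illustration is reproducible
T = 5_000
normal_sample = [random.gauss(0.0, 1.0) for _ in range(T)]
# Fat-tailed stand-in: each draw uses sigma = 0.5 or sigma = 2 with equal probability.
fat_sample = [random.gauss(0.0, random.choice((0.5, 2.0))) for _ in range(T)]

jb_norm, p_norm = jarque_bera(normal_sample)
jb_fat, p_fat = jarque_bera(fat_sample)
ks_norm = ks_distance(normal_sample)
ks_fat = ks_distance(fat_sample)

print(f"normal sample:     JB = {jb_norm:.2f}, p = {p_norm:.3f}, KS = {ks_norm:.4f}")
print(f"fat-tailed sample: JB = {jb_fat:.2f}, p = {p_fat:.3g}, KS = {ks_fat:.4f}")
```

One caveat on the design: because the mean and standard deviation are estimated from the same data, the usual KS critical values are only approximate here, so the sketch reports the distance itself and leaves the formal KS decision rule aside.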

Figure 6: QQ plot

Apply these concepts. Matlab code TestForNormality.m

Dealing with non-normality

1. We concluded that returns are not normal. They are not skewed, but they are fat-tailed (leptokurtic). This is a problem, because the normal will understate the probability of extreme events (like a crash).
2. What to do? Assume a fat-tailed distribution:
   - Student-t.
   - The stable distributions (the Lévy alpha-stable distributions).
   - Create a mixture of normals. Set $\sigma_2 > \sigma > \sigma_1$, where $\sigma$ is the estimated standard deviation, and simulate, assuming returns follow

$$\hat{r}_t = \begin{cases} \mu_r + \sigma_1 z_t & \text{w.p. } 0.5 \\ \mu_r + \sigma_2 z_t & \text{w.p. } 0.5 \end{cases}$$

     where $z_t$ is an i.i.d. standard normal.

We will do none of these. Instead, we will do nonparametric Value at Risk. Instead of assuming a parametric distribution, we will use the empirical distribution.

Empirical VaR

Using the CMG returns data, let us simply count events. Matlab code Empirical VaR01.m

1. Count the frequency of daily returns that are less than -0.10 to find the empirical $\Pr(r_t < -0.10)$.
2. Find the return at the 5% quantile.
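The two counting steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's Matlab script: the CMG returns are not reproduced here, so the returns are simulated from a two-component normal mixture with assumed parameters (`mu_r`, `sigma_1`, `sigma_2`), and the function names are hypothetical. The quantile uses one simple convention (the smallest observation with at least a fraction $q$ of the sample at or below it); other software may interpolate between order statistics.

```python
import math
import random

def empirical_tail_frequency(returns, threshold=-0.10):
    """Share of observations strictly below the threshold: empirical Pr(r_t < threshold)."""
    return sum(1 for r in returns if r < threshold) / len(returns)

def empirical_quantile(returns, q=0.05):
    """Smallest observed return with at least a fraction q of the sample at or below it."""
    ordered = sorted(returns)
    # The small tolerance guards against floating-point round-off in q * T.
    k = max(1, math.ceil(q * len(ordered) - 1e-9))
    return ordered[k - 1]

# Stand-in data (assumed parameters): daily returns from an equal-weight
# mixture of a calm regime (sigma_1) and a turbulent regime (sigma_2).
random.seed(3)
mu_r, sigma_1, sigma_2 = 0.001, 0.015, 0.045
returns = [mu_r + random.choice((sigma_1, sigma_2)) * random.gauss(0.0, 1.0)
           for _ in range(2_500)]

freq_below = empirical_tail_frequency(returns)   # step 1: count events
var_5pct = empirical_quantile(returns)           # step 2: 5% quantile

print(f"Empirical Pr(r_t < -0.10) = {freq_below:.4f}")
print(f"Return at the 5% quantile = {var_5pct:.4f}")
```

No distributional assumption enters either function: both work directly on the ordered sample, which is what makes the approach nonparametric.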