Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR

Nelson Mark, University of Notre Dame
Fall 2017, September 11, 2017

Introduction

The parametric VaR analysis assumed that returns are normally distributed, and we estimated the mean and standard deviation from a time series (the returns). In this segment, we first probe a bit deeper into the statistical meaning of these estimated parameters. Next, we cover some basic procedures in exploratory data analysis. Third, we evaluate whether the normality assumption is a good one.

Concepts covered in this section

1. What is the meaning of a time series? What are time-series moments? (Note: the k-th moment of a random variable X, or equivalently of the distribution of X, is $E(X^k)$.)
2. Relation between sample moments and theoretical moments
3. Data generating mechanisms

4. Stationary and nonstationary distributions (stationary and nonstationary time series)
5. Ergodicity
6. Testing methodology
7. Exploratory data analysis
8. Tests for normality

Properties of Normal Distribution

We want to determine whether normality is a good model. To do so, we should be familiar with its properties. Let's quickly review some properties of the normal.

Standard normal:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \qquad (1)$$

Normal:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \qquad (2)$$

1. The distribution is symmetric around µ (mean, location).
2. Dispersion is regulated by σ (scale). The standard deviation is used as a measure of volatility. How is it a measure of scale? If X is household income in dollars, then 100X is household income in cents, and

$$\mathrm{Var}(100X) = 100^2\, \mathrm{Var}(X) \qquad (3)$$

   so the standard deviation of 100X is 100 times the standard deviation of X.
3. Tail probabilities converge to 0 at a well-defined rate. Loosely speaking, normal tail probabilities converge to 0 quickly (even though it is possible to have realizations that are arbitrarily large or small).

Conclusion: Assessments of normality involve checking for distributional symmetry and appropriate tail thickness. How do we do that? Through examination of sample moments.
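The scale property in (3) can be checked numerically. The sketch below uses Python purely for illustration (the course scripts are in Matlab), and the income figures are made-up numbers, not data from the course.

```python
import statistics

# Hypothetical household incomes in dollars (made-up values for illustration).
income_dollars = [30_000.0, 45_000.0, 52_000.0, 75_000.0, 120_000.0]

# The same incomes expressed in cents: every observation is multiplied by 100.
income_cents = [100.0 * x for x in income_dollars]

# The variance scales with the square of the factor, the standard deviation with the factor.
var_ratio = statistics.variance(income_cents) / statistics.variance(income_dollars)
sd_ratio = statistics.stdev(income_cents) / statistics.stdev(income_dollars)

print(var_ratio)  # approximately 100**2
print(sd_ratio)   # approximately 100
```

This is why σ, and not σ², is read as the scale (volatility) of the distribution: it is measured in the same units as X.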

Sample and Theoretical Moments

The k-th theoretical moment of a distribution (the k-th theoretical moment of a random variable X) is

$$E\left(X^k\right) \qquad (4)$$

The k-th central moment is

$$E\left[(X-\mu)^k\right] \qquad (5)$$

where µ is the first moment, µ = E(X).

Sample moments are the sample counterparts. Let $\{x_t\}_{t=1}^{T}$ be a sequence of time-series observations (e.g., returns). Some of these moments have names.

Mean and variance

Mean: $$\mu = E(X_t) \qquad (6)$$

Sample mean: $$\bar{X}_T = \frac{1}{T}\sum_{t=1}^{T} x_t \qquad (7)$$

Variance: $$\sigma^2 = E\left[(X_t-\mu)^2\right] \qquad (8)$$

Sample variance: $$\hat{\sigma}_T^2 = \frac{1}{T-1}\sum_{t=1}^{T}\left(x_t-\bar{X}_T\right)^2 \qquad (9)$$

The third moment gets at symmetry versus asymmetry. The theoretical measure is skewness.

Theoretical skewness: $$\frac{E\left[(X_t-\mu)^3\right]}{\sigma^3} \qquad (10)$$

The sample counterpart is

Sample skewness: $$sk_T = \frac{\frac{1}{T-1}\sum_{t=1}^{T}\left(x_t-\bar{X}_T\right)^3}{\hat{\sigma}_T^3} \qquad (11)$$

The skewness measure is zero for the normal distribution. It is zero for all symmetric distributions.
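The sample moments in (7), (9) and (11) can be sketched in a few lines of Python. The function names and the two small data sets are illustrative assumptions. The divisor T − 1 follows the formulas in these notes, and other software may divide by T instead.

```python
import math

def sample_mean(x):
    """Sample mean, equation (7)."""
    return sum(x) / len(x)

def sample_variance(x):
    """Sample variance with divisor T - 1, equation (9)."""
    m = sample_mean(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

def sample_skewness(x):
    """Sample skewness, equation (11): third central sample moment over sigma-hat cubed."""
    m = sample_mean(x)
    sigma_hat = math.sqrt(sample_variance(x))
    third = sum((xi - m) ** 3 for xi in x) / (len(x) - 1)
    return third / sigma_hat ** 3

# Made-up data: one symmetric sample and one with a long right tail.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [0.0, 0.0, 0.0, 1.0, 9.0]

print(sample_mean(symmetric), sample_variance(symmetric), sample_skewness(symmetric))
print(sample_skewness(right_skewed))  # positive: the long tail is on the right
```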

Figure 1: Distributions with differing kurtosis

The fourth moment gets at tail thickness. The theoretical measure is kurtosis.

Theoretical kurtosis: $$\frac{E\left[(X_t-\mu)^4\right]}{\sigma^4} \qquad (12)$$

Sample kurtosis: $$kurt_T = \frac{\frac{1}{T-1}\sum_{t=1}^{T}\left(x_t-\bar{X}_T\right)^4}{\hat{\sigma}_T^4} \qquad (13)$$

Kurtosis for the normal is 3. A distribution has excess kurtosis if the measure exceeds 3. These are fat-tailed distributions: there is a higher probability of extreme events than predicted by the normal. In applications, pay attention to whether the software computes kurtosis or excess kurtosis. Excess kurtosis subtracts 3 from the kurtosis measure. (Matlab computes kurtosis.)

Convergence concepts

We want to compare the sample moments to the theoretical moments. In order for this to make sense, the sample moments need to converge to the theoretical moments.
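As a rough illustration of both points, the sketch below computes the sample kurtosis in (13) for simulated draws and compares it with the known theoretical values. The Laplace distribution is used here only as a convenient fat-tailed example (its kurtosis is 6) and is not a distribution discussed in the notes. The sample size and seed are arbitrary assumptions.

```python
import random

def sample_kurtosis(x):
    """Sample kurtosis, equation (13), with divisor T - 1 as in the notes."""
    T = len(x)
    m = sum(x) / T
    var = sum((xi - m) ** 2 for xi in x) / (T - 1)
    fourth = sum((xi - m) ** 4 for xi in x) / (T - 1)
    return fourth / var ** 2

rng = random.Random(0)   # fixed seed so the run is reproducible
T = 100_000              # large T, so sample moments should sit near theoretical ones

# Standard normal draws: theoretical kurtosis 3.
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(T)]

# Laplace draws, built as the difference of two independent exponentials:
# a fat-tailed distribution with theoretical kurtosis 6.
laplace_draws = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(T)]

kurt_normal = sample_kurtosis(normal_draws)
kurt_laplace = sample_kurtosis(laplace_draws)

print("kurtosis (normal): ", kurt_normal, " excess:", kurt_normal - 3.0)
print("kurtosis (Laplace):", kurt_laplace, " excess:", kurt_laplace - 3.0)
```

With i.i.d. draws like these, the sample values settle near the theoretical ones as T grows. For a time series, that convergence is exactly what has to be argued for.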

The concept of theoretical moments for a time series refers to a hypothetical cross-sectional distribution at a particular time t. Imagine running the process (i.e., running the world) over and over many times.

Figure 2: Time-series concepts

But the sample moments are computed in the time dimension, for a single realization of the process. There is a disconnect. The two are brought together by two concepts.

1. Stationarity. Strict stationarity says the distribution of $X_t$ is the same for all t. So the distribution of $X_t$ is the same as that of $X_{t+1}$, etc. A less restrictive form is covariance stationarity. This says the mean of $X_t$ and the covariance between $X_t$ and $X_{t-s}$ are the same for all t (the covariance depends only on the lag s).
2. Ergodicity. A time series is ergodic if the sample moment converges to the theoretical cross-sectional moment as $T \to \infty$.

If a time series is stationary and ergodic, then the sample moments (computed from a single realization of the time series) converge to the theoretical moments as the sample size T goes to infinity.

Wait! If a time series is strictly stationary, won't it be ergodic? No! Here is an example of a stationary sequence that is not ergodic. We have two coins. One is fair. The other has heads on both sides. We flip the coins. Heads = 1, Tails = 0.

Begin by flipping the fair coin once. If you get heads, you then generate the subsequent sequence of flips with the fair coin; if you get tails, you generate the subsequent sequence with the two-headed coin. The expected value of any observation is

$$\frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot 1 = \frac{3}{4}$$

The sample mean will tend to either 1/2 (fair coin) or 1 (two-headed coin), so it will never converge to 3/4.

Statistical Testing Methodology

The classical hypothesis testing methodology is due to R.A. Fisher.

1. Assume the null hypothesis is true (e.g., β = 0).
2. Determine the sampling distribution of your test statistic under the null hypothesis (e.g., the t-statistic follows a Student-t distribution in small samples and N(0, 1) in larger samples).
3. Ask whether the observed test statistic, computed from the data, could reasonably have been drawn from the null distribution.
   - If the answer is yes, the data are consistent with the null. You cannot reject the null hypothesis.
   - If the answer is no, you can reject the null.
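The three testing steps can be sketched for a simple case: the null hypothesis that the mean return is zero, using the large-sample N(0, 1) approximation for the t-statistic. The simulated returns, their parameters, and the function names are assumptions made for illustration only.

```python
import math
import random

def t_statistic(x, mu0=0.0):
    """t-statistic for the null hypothesis E(X) = mu0."""
    T = len(x)
    mean = sum(x) / T
    var = sum((xi - mean) ** 2 for xi in x) / (T - 1)
    return (mean - mu0) / math.sqrt(var / T)

def two_sided_p_value(t_stat):
    """Two-sided p-value under the N(0, 1) approximation: Pr(|Z| > |t|)."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

rng = random.Random(1)
T = 1000

# Hypothetical daily returns: one series with true mean 0, one with true mean 0.005.
returns_null_true = [rng.gauss(0.0, 0.02) for _ in range(T)]
returns_null_false = [rng.gauss(0.005, 0.02) for _ in range(T)]

for label, data in [("mean 0", returns_null_true), ("mean 0.005", returns_null_false)]:
    t_stat = t_statistic(data)             # step 3: statistic computed from the data
    p_val = two_sided_p_value(t_stat)      # step 2: compare with the null distribution
    decision = "reject" if p_val < 0.05 else "cannot reject"
    print(f"true {label}: t = {t_stat:.2f}, p = {p_val:.4f}, {decision} the null")

t_shifted = t_statistic(returns_null_false)
```

The 5% significance level is a conventional choice, not something the methodology dictates.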

Figure 3: Testing a hypothesis

Popular tests for normality

1. Jarque-Bera test. The Jarque-Bera statistic measures the difference between the skewness and kurtosis of the data and those of the normal distribution.

$$\text{Jarque-Bera} = \frac{T}{6}\left(sk_T^2 + \frac{(kurt_T - 3)^2}{4}\right) \qquad (14)$$

   where $sk_T$ is sample skewness and $kurt_T$ is sample kurtosis. If the data are drawn from the normal, the statistic has (asymptotically) a $\chi^2_2$ distribution, i.e., chi-square with 2 degrees of freedom.
2. Kolmogorov-Smirnov test. This measures the deviation between the empirical CDF and the normal CDF.
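A minimal Python sketch of the Jarque-Bera statistic in (14) follows. It uses the T − 1 divisor from the sample-moment formulas in these notes (library implementations may use T), and it relies on the fact that the survival function of a chi-square with 2 degrees of freedom is exp(−x/2). The simulated data and the Laplace comparison are illustrative assumptions.

```python
import math
import random

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) following equation (14)."""
    T = len(x)
    m = sum(x) / T
    var = sum((xi - m) ** 2 for xi in x) / (T - 1)
    sk = (sum((xi - m) ** 3 for xi in x) / (T - 1)) / var ** 1.5
    kurt = (sum((xi - m) ** 4 for xi in x) / (T - 1)) / var ** 2
    jb = T / 6.0 * (sk ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # For a chi-square with 2 degrees of freedom, Pr(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

rng = random.Random(2)
T = 5000
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(T)]
# Fat-tailed sample (Laplace, as a difference of two exponentials), for contrast.
fat_tailed_sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(T)]

jb_normal, p_normal = jarque_bera(normal_sample)
jb_fat, p_fat = jarque_bera(fat_tailed_sample)
print("normal sample:     JB =", jb_normal, "p =", p_normal)
print("fat-tailed sample: JB =", jb_fat, "p =", p_fat)
```

A large statistic (small p-value) is evidence against normality; here it is driven by the kurtosis term, since both samples are symmetric.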

Figure 4: KS test

Let F(r) be the normal CDF of returns r, and let $F_T(r)$ be the empirical CDF. Then the Kolmogorov-Smirnov statistic is

$$KS = \sup_r \left|F_T(r) - F(r)\right| \qquad (15)$$

The KS statistic has a known asymptotic distribution, so we can use it to test the null hypothesis that the data are normal. (Details omitted.)

Quantile-based moment estimators

If the distribution is ill-behaved (i.e., skewed or fat-tailed), sample moments may not provide the right information. Quantile-based moment estimators provide a robust alternative.

1. Location: Median. Rank the observations from low to high. The median is a number such that half the observations lie above it and half lie below. The median is not necessarily unique.
2. Dispersion: Interquartile range. Matlab command: iqr(X), where X is a vector of observations. Example, with the observations ranked from low to high (the median is 9):

   2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 18 (16)

   m = 9 (17)
   Q1 = 5 (18)
   Q2 = 9 (19)
   Q3 = 15 (20)
   Q4 = 18 (21)
   IQR = Q3 − Q1 = 10 (22)

3. FYI (we are not going to use them): there are also quantile-based measures of skewness (e.g., the Bowley measure) and kurtosis.

Exploratory Data Analysis and Testing for Normality

1. Plot the price data, and stare at it.
   - Are there outliers? If so, double-check for data entry errors.
   - Is there evidence of a structural break or regime change? Split-adjustment problems, etc.
   - Volatility clustering?
2. Repeat for returns.
3. Generate a histogram of returns. This is a blunt estimator of the empirical distribution. Determining the optimal bin size is not trivial. Overlay with the normal distribution.
4. Kernel density estimate. Fits a curve to the histogram.
5. Empirical CDF (cumulative distribution function) plots. The CDF is

$$F(x) = \int_{-\infty}^{x} f(z)\,dz = \Pr(Z \le x) \qquad (23)$$

   As a visual aid, it is easier to see deviations from the normal with CDF plots than with the kernel density.

Figure 5: Density and Cumulative Density

6. QQ plots (quantile-quantile plots). These plot the sample quantiles of returns against the theoretical quantiles of a normal distribution. If the distribution of the observations is normal, the plot will be close to linear.
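The empirical CDF comparison and the QQ plot can be sketched numerically without any plotting. The code below computes the largest gap between the empirical CDF and a fitted normal CDF (the KS statistic in (15)) and the pairs of points a QQ plot would display. The plotting positions (i + 0.5)/T, the simulated samples, and the function names are assumptions for illustration. Note also that fitting µ and σ from the same data changes the null distribution of the KS statistic, so the gap is used here only descriptively.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(x, dist):
    """sup |F_T(r) - F(r)| over r, where F_T is the empirical CDF of x."""
    xs = sorted(x)
    T = len(xs)
    d = 0.0
    for i, v in enumerate(xs):
        F = dist.cdf(v)
        # The empirical CDF jumps from i/T to (i+1)/T at v: check both sides of the jump.
        d = max(d, (i + 1) / T - F, F - i / T)
    return d

def qq_pairs(x, dist):
    """(theoretical quantile, sample quantile) pairs for a QQ plot."""
    xs = sorted(x)
    T = len(xs)
    return [(dist.inv_cdf((i + 0.5) / T), v) for i, v in enumerate(xs)]

rng = random.Random(3)
T = 5000
normal_returns = [rng.gauss(0.0, 0.02) for _ in range(T)]
# Fat-tailed stand-in for returns (scaled Laplace draws), for contrast.
fat_returns = [0.01 * (rng.expovariate(1.0) - rng.expovariate(1.0)) for _ in range(T)]

ks_normal = ks_statistic(normal_returns, NormalDist(mean(normal_returns), stdev(normal_returns)))
ks_fat = ks_statistic(fat_returns, NormalDist(mean(fat_returns), stdev(fat_returns)))
print("max CDF gap, normal sample:    ", ks_normal)
print("max CDF gap, fat-tailed sample:", ks_fat)

# In a QQ plot of the fat-tailed sample, the extreme points bend away from the straight line.
pairs = qq_pairs(fat_returns, NormalDist(mean(fat_returns), stdev(fat_returns)))
print("most extreme left point (theoretical, sample):", pairs[0])
```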

Figure 6: QQ plot

Apply these concepts. Matlab code: TestForNormality.m

Dealing with non-normality

1. We concluded that returns are not normal. They are not skewed, but they are fat-tailed (leptokurtic). This is a problem, because the normal will understate the probability of extreme events (like a crash).
2. What to do? Assume a fat-tailed distribution.
   - Student-t
   - The stable distributions (the Lévy alpha-stable distributions)
   - Create a mixture of normals. Set $\sigma_2 > \sigma > \sigma_1$, where σ is the estimated standard deviation, and simulate, assuming returns follow

$$\hat{r}_t = \begin{cases} \mu_r + \sigma_1 z_t & \text{w.p. } 0.5 \\ \mu_r + \sigma_2 z_t & \text{w.p. } 0.5 \end{cases}$$

     where $z_t$ is an i.i.d. standard normal.

We will do none of these. Instead, we will do nonparametric Value at Risk. Instead of assuming a parametric distribution, we will use the empirical distribution.

Empirical VaR

Using the CMG returns data, let us simply count events. Matlab code: Empirical VaR01.m

1. Count the frequency of daily returns that are less than −0.10 to find the empirical $\Pr(r_t < -0.10)$.
2. Find the return at the 5% quantile.
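The two counting steps can be sketched as follows. The CMG returns are not reproduced in these notes, so the sketch substitutes returns simulated from the mixture-of-normals form given earlier, with hypothetical values for µ_r, σ1 and σ2. The function names and the quantile convention (the smallest observation with at least a fraction p of the sample at or below it) are also assumptions, and Matlab's quantile routines may interpolate differently.

```python
import math
import random

def simulate_mixture(T, mu, sigma1, sigma2, seed=0):
    """Returns drawn from N(mu, sigma1^2) or N(mu, sigma2^2), each with probability 0.5."""
    rng = random.Random(seed)
    out = []
    for _ in range(T):
        sigma = sigma1 if rng.random() < 0.5 else sigma2
        out.append(mu + sigma * rng.gauss(0.0, 1.0))
    return out

def empirical_tail_prob(returns, threshold):
    """Step 1: fraction of observations strictly below the threshold."""
    return sum(1 for r in returns if r < threshold) / len(returns)

def empirical_quantile(returns, p):
    """Step 2: smallest observation with at least a fraction p of the data at or below it."""
    xs = sorted(returns)
    k = max(0, math.ceil(p * len(xs)) - 1)
    return xs[k]

# Stand-in data: hypothetical parameters, NOT estimates from the CMG series.
returns = simulate_mixture(T=10_000, mu=0.0005, sigma1=0.01, sigma2=0.04, seed=4)

prob_big_loss = empirical_tail_prob(returns, -0.10)
var_5pct = empirical_quantile(returns, 0.05)

print("empirical Pr(r_t < -0.10):", prob_big_loss)
print("return at the 5% quantile:", var_5pct)
```

With real data, the list `returns` would simply be replaced by the observed daily return series; nothing else in the two steps depends on a distributional assumption.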