An Empirical Study of the Behaviour of the Sample Kurtosis in Samples from Symmetric Stable Distributions


J. Martin van Zyl
Department of Actuarial Science and Mathematical Statistics, University of the Free State, Bloemfontein, South Africa
e-mail: wwjvz@ufs.ac.za

Kurtosis is seen as a measure of the discrepancy between the observed data and a Gaussian distribution and is defined when the 4th moment is finite. In this work an empirical study is conducted to investigate the behaviour of the sample estimate of kurtosis with respect to sample size and the tail index when applied to heavy-tailed data where the 4th moment does not exist. The study focuses on samples from the symmetric stable distributions. It was found that the expected value of the excess kurtosis divided by the sample size is finite for any value of the tail index, and that the sample estimate of kurtosis increases as a linear function of the sample size $n$ and is approximately equal to $n(1 - \alpha/2)$.

Keywords: kurtosis, stable distribution, tail index

Mathematics Subject Classification: 62F12; 62P05

1. Introduction

For heavy-tailed distributions, the theoretical kurtosis is defined and finite when the 4th moment is finite or, in terms of the tail index $\alpha$, when $\alpha > 4$. In practice, data are observed from an unknown distribution and kurtosis is used to measure how leptokurtic the sample is. In financial data it is often observed that $\alpha < 2$, and the estimated kurtosis is used to get an indication of how leptokurtic the data are. Estimates of kurtosis in asset returns range from 4 to 50 (Engle and Patton, 2001). Heavy-tailed distributions with $\alpha < 4$ are fitted to log-returns; see for example Xu, Wu and Xiao (2011).

In this work an empirical study is conducted to check the behaviour and usefulness of the sample kurtosis for symmetric stable distributions with $\alpha \le 2$, and specifically where $\alpha \ge 1$, which is mostly found when applied to real data. The main result is that the expected value of the sample kurtosis increases as a linear function of the sample size: for symmetric samples from stable distributions, the approximate sample estimate of kurtosis for sample size $n$ and tail index $\alpha$ is $n(1 - \alpha/2)$.

More than one method has been suggested to estimate kurtosis, but in this work the Pearson kurtosis as discussed by Fiori and Zenga (2009), which is used in finance and risk analysis, is used. Kurtosis is defined as

$$\beta_2(x) = \frac{E(X-\mu)^4}{\left(E(X-\mu)^2\right)^2} = \frac{\mu_4}{\sigma^4}, \qquad (1)$$

with $\mu_4$ the fourth central moment and $\sigma^2$ the variance of $X$. $\beta_2(x)$ is location-scale invariant, and all data simulated will be for a location parameter $\mu = 0$ and scale parameter $\sigma = 1$. For a regular distribution, $\mu_2^2 < \mu_4$, except when the distribution is concentrated at only two points (Kendall, Stuart and Ord, 1987, p. 107). The excess kurtosis is

$$\gamma_2 = \beta_2 - 3, \qquad (2)$$

which is also equal to $\gamma_2 = \kappa_4/\kappa_2^2$ when expressed in terms of cumulants. For the normal distribution, the excess kurtosis is zero. The sample kurtosis is denoted by $b_2$ and the sample excess kurtosis by $g_2 = b_2 - 3$.

Algebraic inequalities which do not depend on distributional properties were derived for the sample kurtosis, and it was shown that for a sample of size $n$, $x_1, \dots, x_n$, the sample estimate of kurtosis cannot exceed the sample size (Johnson and Lowe, 1979; Cox, 2010), thus

$$b_2 = \frac{\frac{1}{n}\sum_{j=1}^{n}(x_j-\bar{x})^4}{\left(\frac{1}{n}\sum_{j=1}^{n}(x_j-\bar{x})^2\right)^2} = n\,\frac{\sum_{j=1}^{n}(x_j-\bar{x})^4}{\left(\sum_{j=1}^{n}(x_j-\bar{x})^2\right)^2} = n\,c(x_1,\dots,x_n) \le n. \qquad (3)$$

This inequality shows that $c(x_1,\dots,x_n) \le 1$, so the expected value $E(c(x_1,\dots,x_n)) = E(b_2/n) \le 1$ is finite for all distributions, which means that divergence of the sample kurtosis is caused by the increase in the sample size. The sample kurtosis behaves like a ratio, and its behaviour cannot be found by considering the numerator and denominator separately, as is done in the theoretical definition. Using simulation studies it was checked whether $c(x_1,\dots,x_n)$ can be approximated as a function of $\alpha$. It can also be seen, and was confirmed using simulation, that the variance of the sample kurtosis is of the form $n^2\,\mathrm{var}(c(x_1,\dots,x_n))$.

This work focuses on symmetric stable distributed data. Properties and applications can for example be found in the work of Cizek, Härdle and Weron, eds. (2011). The characteristic function of the family of stable distributions is denoted by $\phi(t)$, where

$$\log\phi(t) = -\sigma^{\alpha}|t|^{\alpha}\{1 - i\beta\,\mathrm{sign}(t)\tan(\pi\alpha/2)\} + i\mu t, \quad \alpha \ne 1,$$

and

$$\log\phi(t) = -\sigma|t|\{1 + i\beta\,\mathrm{sign}(t)(2/\pi)\log|t|\} + i\mu t, \quad \alpha = 1.$$

The parameters are the tail index $\alpha \in (0, 2]$, a scale parameter $\sigma > 0$, a coefficient of skewness $\beta \in [-1, 1]$ and a location parameter $\mu$. The symmetric case with $\beta = 0$ is considered in this work.

In the following figure $m = 500$ random samples were simulated, with the $\alpha$'s randomly chosen on the interval $[1, 2]$ and random sample sizes between $n = 200$ and $n = 1500$, and the estimated excess kurtosis plotted. The focus of this study is applications in finance, and these sample sizes cover 1 to 5 years when working with daily data. To get an idea of the relationship involved, multiple regression was performed and it was found that the relationship is approximately $g_2 \approx n - n\alpha/2$. There is little variation in the regression coefficients when repeating the simulation, and this relationship will be investigated further using simulation studies. Assuming that $g_2(n, \alpha) = n(1 - \alpha/2)$, and by noting that

$$\frac{\partial g_2(n,\alpha)}{\partial n} = 1 - \alpha/2 \quad \text{and} \quad \frac{\partial g_2(n,\alpha)}{\partial \alpha} = -n/2,$$

it can be seen that the sample kurtosis is very sensitive with respect to changes in $\alpha$ and a slowly increasing function of the sample size.

Figure 1. A scatterplot of 500 sample estimates of the excess kurtosis for random $\alpha \in [1, 2]$ and sample sizes between 200 and 1500. Samples from a symmetric stable distribution. (Horizontal axis: $\alpha$; vertical axis: excess kurtosis.)

The behaviour of the sample skewness was also checked using the simulated samples. It was found that the expected value of the sample skewness is zero for symmetric data, but the variance is an increasing linear function of the sample size and increases for smaller $\alpha$. This is not the focus of the work, but a skewness estimate in a large sample might not be a significant indication of skewness if the large variance is taken into account.

A measure to order different symmetric distributions according to heavy-tailedness was derived by van Zwet (1964) and Groeneveld and Meeden (1984). It was proven that if a distribution is more heavy-tailed than another according to this measure, the kurtosis will also be larger for the heavier-tailed distribution.

2. Simulation study

Say a sample of size $n$ is available and $n = k_1 + \dots + k_r$. The sample kurtosis will be calculated at increasing sample sizes, say $k_1, k_1 + k_2, k_1 + k_2 + k_3, \dots, n$, and for different values of $\alpha$. The following plot shows the rate of increase in the expected value of the estimated excess kurtosis against the number of observations used to calculate it. The slope for $\alpha = 1$ is $b = 0.4917$, for $\alpha = 1.5$ it is $b = 0.2447$, and it is approximately zero if $\alpha = 2$. The average at each sample size was calculated using $m = 5000$ samples. This relationship can be considered as an approximation. A similar plot where the data is from a Student t-distribution with degrees of freedom $\nu = 3, 4$ shows that the relationship for the t-distribution is not linear.
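The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the stable samples were generated, so the Chambers-Mallows-Stuck formula is used here; the function names, the seed and the smaller replication count (2000 instead of 5000) are likewise choices made for this sketch.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Draw symmetric stable variates (beta = 0, mu = 0, sigma = 1), 0 < alpha <= 2.

    Assumption: the Chambers-Mallows-Stuck formula is used for illustration;
    the paper does not state its generator.
    """
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(u)  # Cauchy case
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def excess_kurtosis(x, axis=-1):
    """Sample excess kurtosis g2 = b2 - 3 along the given axis."""
    d = x - x.mean(axis=axis, keepdims=True)
    # Rescale before taking powers; kurtosis is scale invariant.
    d = d / np.abs(d).max(axis=axis, keepdims=True)
    n = x.shape[axis]
    return n * (d**4).sum(axis=axis) / (d**2).sum(axis=axis) ** 2 - 3.0

def mean_kurtosis_path(alpha, sizes, m, rng):
    """Average g2 over m samples, using the first k observations for each k in sizes."""
    x = symmetric_stable(alpha, (m, max(sizes)), rng)
    return np.array([excess_kurtosis(x[:, :k]).mean() for k in sizes])

rng = np.random.default_rng(0)
sizes = np.arange(50, 501, 50)
slopes = {}
for alpha in (1.0, 1.5, 2.0):
    path = mean_kurtosis_path(alpha, sizes, 2000, rng)
    slopes[alpha] = np.polyfit(sizes, path, 1)[0]  # least-squares slope against n
    print(f"alpha = {alpha}: estimated slope {slopes[alpha]:.4f}")
```

The fitted slopes can then be compared with the values reported in the text. Individual runs vary because the samples are heavy-tailed.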

Figure 2. Plot of the average excess sample kurtosis using simulated samples from a symmetric stable distribution. The averages were calculated using 5000 samples, at sample sizes $n = 50, 100, \dots, 500$. The solid line is for $\alpha = 1$, the dashed line for $\alpha = 1.5$ and the dash-dot line for $\alpha = 2$. (Horizontal axis: sample size; vertical axis: mean estimated excess kurtosis.)

In the following figure the cumulative calculated excess kurtosis is shown for simulated data from a t-distribution with $\nu = 3, 4, 5$ degrees of freedom. It can be seen that for small degrees of freedom the relationship between kurtosis and sample size is not linear. The linear increase in kurtosis with respect to sample size for samples from a stable distribution can thus be a useful property. It may not be unique, but if it is observed in a practical problem, it means that a stable distribution might be a possible candidate to fit.

Figure 3. Plot of the average excess sample kurtosis using simulated samples from a t-distribution. The averages were calculated using 5000 samples, at sample sizes $n = 50, 100, \dots, 500$. The solid line is for $\nu = 3$, the dashed line for $\nu = 4$ and the dash-dot line for $\nu = 5$ degrees of freedom. (Horizontal axis: sample size; vertical axis: mean estimated excess kurtosis.)

For a given value of $\alpha$, denote by $b$ the slope of increase with respect to sample size. If it is assumed that the excess kurtosis is zero for $\alpha = 2$, regression through the origin can be performed to find the relationship between the slope $b$ for a given $\alpha$ and changes in the tail index. As the sample size increases, the fitted relationship comes closer to $b = (2 - \alpha)/2$, leading to the approximate relationship $b \approx 1 - \alpha/2$. If this is applied, and the linear relationship between sample size and sample kurtosis is taken into account, one finds that $g_2 \approx n(1 - \alpha/2)$.
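The step from the regression to the approximation can be written out as follows. This is a sketch under the assumption that the regressor is $x_i = 2 - \alpha_i$, so that the fitted line passes through zero at $\alpha = 2$; the symbol $\hat{\theta}$ for the fitted coefficient is introduced here and is not used in the paper.

```latex
% Least-squares regression through the origin of the slopes b_i on x_i = 2 - \alpha_i
\hat{\theta} = \frac{\sum_i x_i b_i}{\sum_i x_i^{2}}, \qquad x_i = 2 - \alpha_i .
% If the fitted coefficient is close to 1/2, then
b \approx \tfrac{1}{2}\,(2 - \alpha) = 1 - \alpha/2 ,
% and combining this with the linear growth E(g_2) \approx b\,n gives
E(g_2) \approx n\,(1 - \alpha/2) .
```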

Figure 4 below is based on the average of 5000 slopes calculated at each $\alpha = 1, 1.1, \dots, 2$ and fixed sample size $n = 50$.

Figure 4. The relationship between the estimated slope of increase of kurtosis with respect to sample size and $\alpha$. Each point is calculated as the average of 5000 estimated values, with $n = 50$. (Horizontal axis: $2 - \alpha$; vertical axis: estimated slope.)

The expected value of the sample excess kurtosis increases as a linear function of the number of observations used to calculate the sample kurtosis, and the approximate expected excess kurtosis in large samples is thus

$$E(g_2) \approx n(1 - \alpha/2). \qquad (5)$$

In the above simulations a fixed sample size was used. To confirm the results, they are checked using random sample sizes. The consistency of the ordering of kurtosis with tail index is also considered.

As an example, 5000 samples with random sample sizes between $n = 200$ and $n = 1500$ were generated from a stable distribution with index $\alpha = 1.25$ and the estimated excess kurtosis was divided by the sample size. The sample mean was 0.3668, compared to $1 - \alpha/2 = 0.3750$. Similarly, 5000 samples with random sample sizes and $\alpha = 1.75$ were generated. The sample mean for $\alpha = 1.75$ was 0.136, compared to $1 - \alpha/2 = 0.1250$. This confirms the approximate relationship $E(g_2) \approx n(1 - \alpha/2)$.

Comparisons were made in pairs between the sample kurtosis of the two samples. In approximately 82% of the pairs the excess kurtosis was, correctly, larger for the more heavy-tailed data. If the sample kurtosis was divided by the sample size, the percentage increased by about 2%. A few such examples were simulated, and when comparing normal samples ($\alpha = 2$) with samples with $\alpha < 2$, the percentage of correct orderings of the tail index with respect to kurtosis is very high, often as high as 100%. The conclusion can be made that for symmetric stable distributions, kurtosis is an effective measure to compare the tail-heaviness of samples from two distributions with different parameters.

The variation of the sample estimate is a function of $\alpha$, as seen in figure 5, and the variance for $n$ points used is proportional to $n^2$ for all values of $\alpha$, except when $\alpha = 2$.
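A check of this kind can be sketched as follows. The generator (Chambers-Mallows-Stuck), the function names, the seed and the reduced number of samples (2000 per tail index instead of 5000) are assumptions of this sketch and are not taken from the paper. The pairwise comparison here is made on the kurtosis divided by the sample size.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Symmetric stable variates via the Chambers-Mallows-Stuck formula (an assumption)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(u)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def scaled_excess_kurtosis(x):
    """Return g2 / n for one sample."""
    d = x - x.mean()
    d = d / np.abs(d).max()  # rescale; kurtosis is scale invariant
    n = x.size
    return (n * np.sum(d**4) / np.sum(d**2) ** 2 - 3.0) / n

def simulate_ratios(alpha, m, rng, n_low=200, n_high=1500):
    """g2 / n for m samples whose sizes are drawn uniformly from n_low..n_high."""
    sizes = rng.integers(n_low, n_high + 1, size=m)
    return np.array([scaled_excess_kurtosis(symmetric_stable(alpha, n, rng))
                     for n in sizes])

rng = np.random.default_rng(1)
heavy = simulate_ratios(1.25, 2000, rng)   # heavier tails
light = simulate_ratios(1.75, 2000, rng)   # lighter tails
mean_heavy, mean_light = heavy.mean(), light.mean()
share_correct = np.mean(heavy > light)     # pairs ordered as the tail index suggests

print(f"alpha = 1.25: mean g2/n = {mean_heavy:.4f}, 1 - alpha/2 = 0.3750")
print(f"alpha = 1.75: mean g2/n = {mean_light:.4f}, 1 - alpha/2 = 0.1250")
print(f"share of correctly ordered pairs: {share_correct:.3f}")
```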

Figure 5. Estimated variance of the sample kurtosis divided by $n^2$, for samples of size $n = 500$, as a function of $\alpha$, based on 5000 simulated samples for each $\alpha$. (Horizontal axis: $\alpha$; vertical axis: variance$/n^2$.)

3. An application to log-returns

The change in kurtosis with respect to the number of observations used to calculate the sample kurtosis was investigated when applied to log-returns of the New York Stock Exchange (NYSE). The daily closing values of 5 years, from May 2013 to May 2018, were used. Log-returns are approximately symmetrically distributed with sample mean zero, and the stable distribution is considered as a possible distribution for log-returns. Log-returns are also approximately independently distributed. The index is shown in figure 6. There is an initial period, a major correction and the period after the correction.

Figure 6. Index of the NY stock exchange, 5 years of daily data. (Horizontal axis: observation number; vertical axis: NY Stock Exchange Index.)

The sample excess kurtosis of the log-returns using increasing sample sizes is plotted in figure 7. It can be seen that the distribution of the log-returns seems to change and then stay the same for a period, if one considers a change in slope as an indication of a change in the distribution.
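The calculation behind such a plot can be sketched as follows. The NYSE data are not reproduced here, so a hypothetical simulated price series stands in for the index; the function name, the starting point of 30 observations, the seed and the t-distributed returns are all assumptions of this sketch.

```python
import numpy as np

def running_excess_kurtosis(returns, start=30):
    """Excess kurtosis g2 of the first k returns, for k = start, ..., len(returns)."""
    r = np.asarray(returns, dtype=float)
    ks = np.arange(start, r.size + 1)
    g2 = np.empty(ks.size)
    for i, k in enumerate(ks):
        d = r[:k] - r[:k].mean()
        g2[i] = k * np.sum(d**4) / np.sum(d**2) ** 2 - 3.0
    return ks, g2

# Hypothetical daily closing values (NOT the NYSE data used in the paper).
rng = np.random.default_rng(2)
simulated_returns = 0.005 * rng.standard_t(df=3, size=1260)
prices = 10000.0 * np.exp(np.cumsum(simulated_returns))

log_returns = np.diff(np.log(prices))      # r_t = log(P_t) - log(P_{t-1})
ks, g2 = running_excess_kurtosis(log_returns)

# Plotting g2 against ks gives a curve of the kind shown in figure 7;
# a change in slope may point to a change in the distribution.
print(ks[-1], g2[-1])
```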

Figure 7. Excess kurtosis of the log-returns calculated as a function of the number of points used to calculate the sample kurtosis. (Horizontal axis: number of observations; vertical axis: estimated excess kurtosis of the log-returns.)

Kurtosis is very sensitive with respect to changes in the index, even though the kurtosis is calculated using log-returns, and the difference in the behaviour of the sample kurtosis over time, before and after the correction, is very clear. The program Stableregkw, developed for Matlab by Borak, Misiorek and Weron for the book of Cizek, Härdle and Weron, eds. (2011) and based on the Kogon-Williams estimation method (Kogon and Williams, 1998), was applied to two series of observations, 100-400 and 800-1100, to see if there was a change in the tail index as indicated by the change in sample kurtosis. The estimated tail index is $\hat{\alpha} = 1.8554$ for the first period and $\hat{\alpha} = 1.7165$ for the second period. The estimated parameters when all 1257 log-returns are used are $\hat{\alpha} = 1.753$, $\hat{\beta} = 0.1184$ and $\hat{\sigma} = 0.0044$.

This is consistent with the change in kurtosis, showing that the second period can be more volatile and heavy-tailed. This is a period during and after an election in the USA.

4. Conclusions

There is a relationship between kurtosis and the tail index for samples from the stable distributions. For a sample of size $n$, the sample kurtosis can be considered as $n$ times the ratio of two polynomials which are both of degree 4, and the expected value of the ratio is finite even if the expected value of the numerator or denominator does not exist. This property makes kurtosis useful in heavy-tailed data if the proportionality to $n$ is taken into account. Thus for $\alpha > 0$,

$$\lim_{n\to\infty} \frac{\sum_{j=1}^{n}(x_j-\bar{x})^4}{\left(\sum_{j=1}^{n}(x_j-\bar{x})^2\right)^2} \approx 1 - \alpha/2.$$

This property can be used to compare the tail-heaviness of two samples from a stable distribution by using kurtosis. The linear increase in kurtosis as more points are used to calculate it can be used as a property to exclude or include a stable distribution as a possible distribution to fit to, for example, log-returns. For GARCH models the 4th moment should be finite when fitted to log-returns. When the estimated kurtosis is plotted as a function of an increasing number of observations, an increase in sample kurtosis might be an indication that the 4th moment is not finite.

Using bootstrap methods to estimate a variance, the relationship $g_2/n \approx 1 - \alpha/2$ can be used in large samples to test hypotheses concerning $\alpha$, and especially to test if $\alpha < 2$.

References

Cox, N.J. (2010). Speaking Stata: The limits of sample skewness and kurtosis. The Stata Journal, 3, 482-495.

Čižek, P., Härdle, W.G., Weron, R. (2011). Statistical Tools for Finance and Insurance. Springer, Heidelberg.

Engle, R.F., Patton, A.J. (2001). What good is a volatility model? Quantitative Finance, 1, 237-245.

Fiori, A.M., Zenga, M. (2009). Karl Pearson and the Origin of Kurtosis. International Statistical Review, 77, 40-50.

Groeneveld, R.A., Meeden, G. (1984). Measuring Skewness and Kurtosis. J. of the Royal Society, Series D, 33 (4), 391-399.

Johnson, M.E., Lowe, V.W. (1979). Bounds on the Sample Skewness and Kurtosis. Technometrics, 21, 377-378.

Kendall, M., Stuart, A., Ord, J.K. (1987). Kendall's Advanced Theory of Statistics, Volume I. Charles Griffin and Company, London.

Kogon, S.M., Williams, D.B. (1998). Characteristic Function Based Estimation of Stable Distribution Parameters. In: A Practical Guide to Heavy Tails: Statistical Techniques and Applications, R.J. Adler, R.E. Feldman, M. Taqqu, eds., Birkhauser, Boston, 311-335.

Van Zwet, W.R. (1964). Convex Transformations of Random Variables. Math. Centrum, Amsterdam.

Xu, W., Wu, C., Dong, W. (2011). Modeling Chinese stock returns with stable distributions. Mathematical and Computer Modelling, 54, 610-617.