An Improved Version of Kurtosis Measure and Their Application in ICA

International Journal of Wireless Communication and Information Systems (IJWCIS) Vol 1 No 1, April 2011

An Improved Version of Kurtosis Measure and Their Application in ICA

Md. Shamim Reza 1, Mohammed Nasser 2 and Md. Shahjaman 3
1 Department of Mathematics, Pabna Science & Technology University, Rajapur, Pabna-6600, Bangladesh
2 Department of Statistics, Rajshahi University
3 Department of Statistics, Begum Rokeya University
mshamim_stat@yahoo.com, mnasser.ru@gmail.com, shahjaman_brur@yahoo.com

Abstract: Kurtosis plays an important role in defining the shape characteristics of a probability distribution, and also in extracting as well as sorting independent components. Recent research on various versions of the classical kurtosis measure shows that all of them substantially underestimate the kurtosis parameter and exhibit high variability when the underlying population distribution is highly skewed or heavy tailed. This is unwanted for independent component analysis (ICA). In this paper we propose a bootstrap bias-corrected kurtosis estimator and compare its performance with that of the two empirical bias-corrected kurtosis measures found best in recent work. Using both simulated and real data, we investigate the bias, standard error and MSE of each estimator under a variety of situations, and we also use various plots to judge their performance. We observe that the proposed bootstrap bias-corrected kurtosis estimator performs better than the class of classical estimators in non-normal univariate situations. We then apply our measure to sorting the independent components of both data sets and examine the capacity of PCA, ICA and ICA on PCA for finding groups. In both data sets ICA on PCA, a new visualization technique, shows the maximum discriminating power, whereas PCA shows the least. We recommend using the proposed measure in both extracting and sorting independent components.
Keywords: Kurtosis, Monte Carlo Simulation, Bootstrapping, PCA, ICA.

1. Introduction

It is typically noted in introductory statistics courses that distributions can be characterized in terms of central tendency, variability, and shape. With respect to shape, virtually every textbook defines and illustrates skewness. The other aspect of shape, kurtosis, is either not discussed or, worse yet, is often described or illustrated incorrectly (DeCarlo, 1997; Joanes and Gill, 1998). Kurtosis is also useful for ordering independent components (ICs) (Scholz et al., 2004; Scholz and Selbig, 2007). In principal component analysis the PCs are ordered by their corresponding eigenvalues, but in independent component analysis the components have no natural order, so for practical reasons we need a criterion for sorting these components to our interest. One measurement that matches this interest very well is kurtosis. In a recent work An and Ahmed (2008) proposed two unbiased sample measures of kurtosis and compared them with three sample measures of kurtosis adopted by various software packages (MINITAB, SAS, etc.) for data from normal and non-normal populations. Their second proposed estimator is the best performer in normal situations, but in non-normal situations all estimators show unwantedly large fluctuations. For this reason they put forward two new empirical bias-corrected kurtosis estimators. To correct the bias, however, their empirical formulas are provided only for the student-t and chi-squared distributions, and empirical estimates are subject to extra variation, which results in inflated MSE. In this article we propose a bootstrap bias-corrected kurtosis estimator. It is worth mentioning that ICA is meaningful only in non-normal situations: from purely Gaussian (normal) distributed data no unique independent components can be extracted (Hyvarinen and Oja, 2000). In Section 2 we define the classical kurtosis estimators that we consider in our study.
In Section 3 we propose a bootstrap bias-corrected kurtosis estimator. In Section 4 we compare the two empirical bias-corrected kurtosis estimators with our proposed bootstrap bias-corrected estimator and find the overall best performer. We then apply our estimator to sorting independent components for finding data clusters. The final section concludes.

2. Kurtosis

Pearson (1905) introduced kurtosis as a measure of how flat the top of a symmetric distribution is when compared to a normal distribution of the same variance. Kurtosis can be formally defined as the standardized fourth population moment about the mean,

  β₂ = E(X − μ)⁴ / [E(X − μ)²]² = μ₄ / σ⁴,

where E is the expectation operator, μ is the mean, μ₄ is the fourth moment about the mean, and σ is the standard deviation. The normal distribution has a kurtosis of 3, and the excess kurtosis β₂ − 3 is often used so that the reference normal distribution has a kurtosis of zero. A sample counterpart to β₂ can be obtained by replacing the population moments with the sample moments, which gives

  b₂ = (1/n) Σᵢ (Xᵢ − X̄)⁴ / [(1/n) Σᵢ (Xᵢ − X̄)²]²,  (1)

where b₂ is the sample kurtosis, X̄ is the sample mean, and n is the number of observations.

2.1 Some Classical Measures of Kurtosis

Let X₁, X₂, …, Xₙ be a random sample of size n. A commonly used consistent estimator of the excess kurtosis is

  ĝ₂ = (1/n) Σᵢ (Xᵢ − X̄)⁴ / [(1/n) Σᵢ (Xᵢ − X̄)²]² − 3.  (2)

This estimator is not unbiased. Cramér (1946) gave the amount of its bias for normal distributions:

  Bias(ĝ₂) = −6/(n + 1).  (3)

Another frequently used estimator of the excess kurtosis, adopted by SAS, is

  U = [(n + 1) ĝ₂ + 6] (n − 1) / [(n − 2)(n − 3)].  (4)

It has been proved that U is unbiased for normal distributions; we refer to Fisher (1929), Joanes and Gill (1998) and others.

The kurtosis measure adopted by MINITAB is

  M = (ĝ₂ + 3) ((n − 1)/n)² − 3.  (5)

Joanes and Gill (1998) showed that for normal distributions

  Bias(M) = 3(n − 1)³ / [n²(n + 1)] − 3.  (6)

An and Ahmed (2008) recently proposed two new kurtosis estimators; correcting the biases given in (3) and (6) yields

  N₁ = ĝ₂ + 6/(n + 1)  (7)

and

  N₂ = M + 3 − 3(n − 1)³ / [n²(n + 1)].  (8)

Consequently, for normal data, N₁ and N₂ are both unbiased estimators of the excess kurtosis.

All five estimators are biased for non-normal populations, and the bias is inflated in a range of the parameter space; for a detailed description we refer to the An and Ahmed (2008) article. It therefore seems appealing to construct a bias-corrected estimator. For the non-normal student-t and chi-squared situations, An and Ahmed (2008) suggest employing a bias-reduction technique based on N₂, the best-performing estimator: the bias is approximated by a quadratic function of N₂ fitted by simulation, and the fitted bias is subtracted from N₂. A simulation experiment conducted to inspect the bias and MSE of this estimator shows that it effectively reduces the bias to a negligible level; however, the quadratic form introduces an extremely large variance, resulting in inflated MSE.
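As a minimal sketch, the five classical estimators can be computed in Python. The SAS and MINITAB forms below follow the standard Joanes and Gill (1998) expressions, and the correction terms for N₁ and N₂ are derived from the bias formulas rather than copied from the paper, so treat the exact constants as an assumption:

```python
import numpy as np

def kurtosis_estimators(x):
    """Five classical (excess) kurtosis estimators discussed in the text."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)                  # second central moment (biased)
    m4 = np.mean(d**4)                  # fourth central moment
    g2 = m4 / m2**2 - 3.0               # moment estimator, eq. (2)
    U = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)   # SAS-type, eq. (4)
    M = (g2 + 3.0) * ((n - 1) / n)**2 - 3.0                    # MINITAB-type, eq. (5)
    N1 = g2 + 6.0 / (n + 1)                                    # corrects Bias(g2), eq. (7)
    N2 = M + 3.0 - 3.0 * (n - 1)**3 / (n**2 * (n + 1))         # corrects Bias(M), eq. (8)
    return {"g2": g2, "U": U, "M": M, "N1": N1, "N2": N2}
```

For a large uniform sample all five values cluster near the true excess kurtosis of −1.2, while for a normal sample they cluster near zero.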
They also proposed, for small degrees of freedom only, a correction based on a simple linear regression model without an independent variable (an intercept-only fit). The variance of this fitted estimator is greater than that of the original biased estimator, but it is not inflated too much.

2.2 Limitations of the Empirical Bias-Corrected Estimators

a) The main problem of the empirical bias correction is that its formulas are provided only for the student-t and chi-squared distributions; no other distributions are considered.
b) The performance of the bias-corrected estimator depends on a lookup table as well as on a specified sample size.
c) The empirical bias-corrected estimators effectively reduce the bias, but they introduce an extremely large variance.

3. Proposed Bootstrap Bias-Corrected Estimator

All the estimators substantially underestimate the kurtosis parameter when the underlying population distribution is highly skewed or heavy tailed. To correct the bias, empirical formulas are provided only for the student-t and chi-squared distributions, and the empirical estimates are subject to extra variation, which results in inflated MSE. Resampling methods such as the bootstrap and the jackknife can reduce the bias while keeping the variance relatively low. We therefore use a popular resampling method, bootstrapping, to overcome the problems of the empirical bias correction. We correct the bias of An and Ahmed's second estimator N₂, because N₂ performs well for normal as well as non-normal populations in many situations. Our proposed bootstrap bias-corrected estimator is

  Ñ = N₂ − Bias_boot,  where Bias_boot = E_{F̂ₙ}[N₂(x*)] − N₂(x),

with x* denoting a bootstrap sample drawn with replacement from the data and F̂ₙ the empirical distribution function.
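A minimal sketch of this bootstrap bias correction in Python, with the plain moment estimator g₂ standing in for N₂ (an assumption for brevity; any kurtosis estimator can be plugged in):

```python
import numpy as np

def g2(x):
    """Moment estimator of excess kurtosis (stand-in for N2)."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(d**4) / np.mean(d**2)**2 - 3.0

def bootstrap_bias_corrected(x, estimator=g2, B=5000, seed=None):
    """Standard bootstrap bias correction:
    Bias_boot = (1/B) * sum_b estimator(x*_b) - estimator(x),
    corrected = estimator(x) - Bias_boot
              = 2 * estimator(x) - mean of the bootstrap replicates.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    theta = estimator(x)
    boot = np.array([estimator(rng.choice(x, size=x.size, replace=True))
                     for _ in range(B)])
    return 2.0 * theta - boot.mean()
```

For a sample from a chi-squared distribution with k degrees of freedom, the corrected value can be compared directly against the known target excess kurtosis of 12/k.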

International Journal of Wireless Communication and Information Systems (IJWCIS) Vol 1 No 1 April, 011 8 t B * b. 1 t B x * b.1 Our Bootstrapping Method and Used Estimator In our research we use 5000 bootstrap samples for calculating bias and MSE of sizes n = 0, n = 0, n = 50, each number of replicated 1000 times. Each respective sample take from student-t and chi-squared distribution with d.f and 5. Then we have got the bootstrapped aggregated results and comparing bootstrap bias corrected MSE and empirical bias corrected MSE. We know that kurtosis of student-t and chi-squared distributions are, n n Kur t where n> 1 Kur n The bootstrap aggregated bias calculate as follows Bias 5000 N i1 5000 MSE Var boot Kurtosis N Bias, then performs for any distributions, any sample size but empirical bias-corrected estimator performs only student-t and chisquared distributions for specified sample size.. Results To simulate skewed and heavy tailed data, 5000 samples of sizes 0, 0 and 50 are randomly taken from and studentt distribution with degrees of freedom and 5. Now we compare among our proposed bootstrap bias corrected estimator and two empirical bias-corrected estimators. We find bias, mean square error of bias correction estimators at different non-normal populations. The results are represented in different tables and plots. Table 1.MSE comparison of chi-square distribution Sample Size d.f Bootstrap log(mse) Emperical-1 Emperical- 0.61.7.7 0 5.19.6. 0.5.15.5 0 5.1.09.1 50..91.5 50 5.0.8.10 Figure 1, Bootstrap algorithm for calculating bias corrected estimator. To simulate skewed and heavy tailed data, 5000 samples of sizes 0,0 and 50 are randomly taken from χ and student t distribution with degrees of freedom and 5. Now we compare among our proposed bootstrap bias-corrected estimator and two empirical bias-corrected estimators. Our proposed bootstrap bias-corrected estimator is more advantages than empirical estimators because our estimators Figure. 
MSE comparison for Chi-square distribution The Table-1 and fig.1 shows that the proposed bootstrap bias corrected measure gives the minimum MSE values for (skewed) distributions of sizes 0, 0 and 50 with df, 5. We found that our proposed estimators give greater discrepancy than first empirical correction but relatively lower difference than second empirical bias-corrected estimator based on MSE criterion. The table shows that the proposed bootstrap bias corrected measure gives the minimum MSE values than first empirical bias-corrected estimator for student-t (heavy tailed) distributions of sizes 0, 0 and 50 with df,5. We also found that second empirical estimator performs well than our

International Journal of Wireless Communication and Information Systems (IJWCIS) Vol 1 No 1 April, 011 9 estimators for d.f, but results in favor of our estimators when df increases to 5. Sample Size Table.MSE comparison of t-distribution d.f Bootstrap log(mse) Emperical-1 Emperical- 0.0 9.8.9 0 5.18 8.6.9 0. 10.1.9 0 5.1 8.16. 50.1 10.88.80 50 5.11 8.10.18 Oja. 000), therefore, ICA should only be applied to data sets where we can find components that have a non-gaussian distribution. Examples of super-gaussian distributions (highly positive kurtosis) are speech signals, because these are predominantly close to zero. However, for molecular data sub-gaussian distributions (negative kurtosis) are more interesting. Negative kurtosis can indicate a cluster structure or at least a uniformly distributed factor. Thus the components with the most negative kurtosis can give us the most relevant information. Experiment-1 (Simulation Study) In our research, first we generate four known distribution Normal, Chi-square, t and Uniform of size 100 with taken their different mean, mixing this four distribution and finding out which visualization techniques gives better identification of distribution pattern from mixture. 5. Application in ICA Independent component analysis (ICA) is a statistical method used to discover hidden factors(sources or features) from a set of measurements or observed data such that the sources are maximally independent. The ICA algorithms are able to separate the sources according to the distribution of the data. Independent component analysis (ICA) (Hyvarinen et al., 001), and projection pursuit (PP)(Jones and Sibson, 1987), are closely related techniques, which try to look for interesting directions (projections) in the data. ICA assumes a model, x = AS where x is a vector of observed random variables, A is a d d mixing matrix, and S is a vector of independent latent variables. The task then is to find A to recover S. 
A key assumption is usually that the S have different kurtosises K j, in order to separate the different independent components. In practice ICA usually measures interestingness of a linear combination a T x in terms of the size of its absolute kurtosis or some related measures. Since for a Gaussian random variables the kurtosis is zero, this criterion measures to some extent, non-gaussianity. j Figure. Original pattern of simulated data (a) Normal (b) Chi square (c) t (d) Uniform distribution. 5.1 Role of Kurtosis in ICA In principal component analysis, pc s are ordered by eigen value where first eigen value is first pc, second eigen value second pc and so on. But in independent component analysis, These components have no order. For practical reasons to define a criterion for sorting these components to our interest. One measurement which can match our interest very well, is kurtosis. Kurtosis is a classical measure of non- Gaussianity, and is computationally and theoretically relatively simple. It indicates whether the data are peaked or flat, relative to a Gaussian (normal) distribution. A Gaussian distribution has a kurtosis of zero. Positive kurtosis indicates a peaked distribution (super-gaussian) and negative kurtosis indicates a flat distribution (sub-gaussian). Now mixing this four distribution (Original sources), and if we apply PCA, ICA and ICA on PCA on the mixture data experiment, to investigate what techniques gives better identification. From purely Gaussian distributed data, no unique independent components can be extracted (Hyvarinen and
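The sorting rule described above, most negative kurtosis first, can be sketched as follows. This is a minimal illustration assuming the component matrix S holds one component per column; the plain moment estimator stands in for the paper's bias-corrected measure:

```python
import numpy as np

def excess_kurtosis(s):
    """Moment estimator of excess kurtosis for one component."""
    d = np.asarray(s, dtype=float) - np.mean(s)
    return np.mean(d**4) / np.mean(d**2)**2 - 3.0

def sort_components(S):
    """Order independent components by ascending excess kurtosis, so the
    most sub-Gaussian (negative kurtosis, cluster-indicating) come first."""
    k = np.array([excess_kurtosis(S[:, j]) for j in range(S.shape[1])])
    order = np.argsort(k)
    return S[:, order], k[order]
```

With one Laplace column (excess kurtosis about +3) and one uniform column (about −1.2), the uniform, sub-Gaussian component is ranked first.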

Figure 4. Mixed sources of the four distributions.

Figure 5. Performance of the different visualization techniques for the chi-square distribution.

Figure 5 exhibits that PCA and ICA could not detect the required distribution properly, but with ICA on PCA, a new development of visualization technique for our experiment, we obtain the maximum identification of the chi-square distribution, which is the required result of our experiment.

Figure 6. Performance of the different visualization techniques for the t and uniform distributions.

Figure 6 shows the identification performance for the t and uniform distributions. In both cases PCA fails at proper identification; ICA performs better than PCA, but ICA on PCA discriminates the two distributions best.

Experiment 2 (Breast Cancer Data)

The breast cancer data contain 10 variables and 107 observations. Applying PCA, we see that the first five PCs explain 8 percent of the variability of the data set. We then apply PCA and ICA to the original data, apply ICA on the five PCs, and use our estimator to sort the ICs.

Table 3. IC ordering using kurtosis.

From the table, the largest negative kurtosis value, -1.1, is taken as the first IC, the second largest as the second IC, and so on. Since negative kurtosis can indicate a cluster structure or at least a uniformly distributed factor, the components with the most negative kurtosis give us the most relevant information.

Figure 7. On the left, applying PCA to the total data gives a worse result than ICA. By using PCA for preprocessing before applying ICA, however, a more strongly discriminating component can be extracted, as shown on the right.

6. Conclusion

In this paper we describe five sample measures of kurtosis and compare the performance of three bias-corrected kurtosis estimators (Empirical-1, Empirical-2 and the proposed bootstrap estimator). Their performance is investigated through simulation and bootstrapping; we consider the χ² and student-t distributions with three different sample sizes (20, 30 and 50). The estimators are compared with regard to bias and MSE. The bootstrap bias-corrected estimator performs better than the class of two empirical bias-corrected estimators, especially for non-normal populations with small degrees of freedom. We recommend it as a measure of kurtosis for non-normal populations, whether the degrees of freedom are small or large. We then apply our measure to sorting independent components in the simulated and breast cancer data and examine the capacity of PCA, ICA and ICA on PCA for finding groups. In both data sets ICA on PCA, a new visualization technique, shows the maximum discriminating power, whereas PCA shows the least.

References

[1] Cramér, H. Mathematical Methods of Statistics. Princeton University Press, Princeton, 1946.
[2] DeCarlo, L.T. On the meaning and use of kurtosis. Psychological Methods 2(3), 292-307, 1997.
[3] Fisher, R.A. Moments and product moments of sampling distributions. Proc. London Math. Soc., Ser. 2, 30, 199-238, 1929.
[4] Hyvärinen, A. and Oja, E. Independent component analysis: algorithms and applications. Neural Networks 13(4-5), 411-430, 2000.
[5] Hyvärinen, A., Karhunen, J. and Oja, E. Independent Component Analysis. John Wiley and Sons, New York, 2001.
[6] Jones, M.C. and Sibson, R. What is projection pursuit? J. of the Royal Statistical Society, Ser. A, 150, 1-36, 1987.
[7] Joanes, D.N. and Gill, C.A. Comparing measures of sample skewness and kurtosis. The Statistician 47, 183-189, 1998.
[8] An, L. and Ahmed, S.E. Improving the performance of kurtosis estimator. Computational Statistics and Data Analysis 52, 2669-2681, 2008.
[9] Scholz, M., Gibon, Y., Stitt, M. and Selbig, J. Independent component analysis of starch-deficient pgm mutants. Proceedings of the German Conference on Bioinformatics, Gesellschaft für Informatik, Bonn, pp. 95-104, 2004.
[10] Scholz, M., Gatzek, S., Sterling, A., Fiehn, O. and Selbig, J. Metabolite fingerprinting: detecting biological features by independent component analysis. Bioinformatics 20, 2447-2454, 2004.
[11] Shamim, M. and Nasser, M. An improved version of kurtosis estimator and their application in ICA. International Conference on Computer and Information Technology, program book, page 7, 2010.
[12] Scholz, M. and Selbig, J. Visualization and analysis of molecular data. Methods Mol. Biol. 358, 87-104, 2007.