Symmetricity of the Sampling Distribution of CVr for Exponential Samples


World Applied Sciences Journal 17 (Special Issue of Applied Math): 60-65, 2012
ISSN 1818-4952
IDOSI Publications, 2012

Fauziah Maarof, Aimi Athirah Ahmad and Shamsiah Mohammed
Department of Mathematics, Faculty of Science, UPM, 43400 Serdang, Selangor

Abstract: A robust form of the coefficient of variation, CVr, constructed using the sample median absolute deviation (MAD) and the sample median, is considered. Histograms, boxplots and descriptive statistics of the sampling distributions of CVr are generated for samples from the exponential distribution with various parameter values (λ = 0.05, 0.1, 0.2, 1, 2, 10, 20) and sample sizes (n = 10, 20, 30, 50, 70, 100, 300, 500). Based on the histograms, boxplots and measures of skewness, it is observed that the sampling distribution of CVr is left-skewed for small sample sizes and becomes more symmetric as the sample size increases. For purposes of comparison, a similar procedure is applied to the sampling distribution of the conventional coefficient of variation, CVc, for the same samples generated in each case. It is observed that the sampling distribution of CVr attains a greater degree of symmetricity, and approaches it at a faster rate, than that of CVc. The property of large-sample symmetricity will be an advantage for the future development of test statistics using CVr. It is also observed that the sampling distribution of CVr is less peaked than that of CVc for all n and λ.

Key words: Symmetricity, coefficient of variation, robust, exponential

INTRODUCTION

Conventionally, the sample coefficient of variation (CVc) is the ratio of the sample standard deviation to the sample mean, often presented as a percentage [1, 2]. In the univariate setting, this unitless measure describes the dispersion of the values of a variable relative to its mean. Hence, the larger the value of CVc, the greater the dispersion in the variable. It is commonly used to compare two or more data sets that are measured in different units.

Research results on some of the properties of CVc have been documented in many papers, among which are those by Breunig [3], Curto and Pinto [4], Hendricks and Robey [5], Hurlimann [6], Mahmoudvand et al. [7], Murari [8] and Pang et al. [9]. Applications of CVc in the assessment and comparison of data sets in various fields, such as economics and finance, sociology, engineering and health science, can be found in papers by Bedeian and Mossholder [10], Cox and Sadiraj [11], He and Oyadiji [12], Martin and Gray [13] and Reed et al. [14].

An alternative robust form of the cv, CVr, is given in [15]. It is built from MAD(n), the sample median absolute deviation (commonly referred to as the sample MAD), and med(n), the sample median. The main objective of this paper is to give readers, especially those who commonly apply the CVc measure in their areas of research, an insight into how CVc and CVr compare with respect to their sampling distributions. In this short paper, the sampling distributions are compared by observing the changes in the histograms, boxplots and basic descriptive statistics, with emphasis on the mean, standard deviation, skewness and kurtosis, as the sample size is increased from 10 to 500 at each level of λ.

METHODOLOGY

CVc and CVr are calculated for samples generated from the exponential distribution with parameter λ, whose probability density and cumulative distribution functions are given as $f(x) = \lambda e^{-\lambda x}$ and $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$, respectively [16]. The two forms of coefficient of variation (cv) under consideration are the conventional cv, $CV_c = s / \bar{x}$, where $s$ is the sample standard deviation and $\bar{x}$ is the sample mean, and the robust form of the cv, CVr, computed from the sample MAD, MAD(n), and the sample median, med(n).

Corresponding Author: Fauziah Maarof, Department of Mathematics, Faculty of Science, UPM, 43400 Serdang, Selangor
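As a minimal sketch of the two statistics, the Python below computes both forms of cv for a single sample using only the standard library. The function names `cv_conventional` and `cv_robust` are hypothetical. The robust form is assumed here to be the unscaled sample MAD, median(|x_i - med(n)|), divided by the sample median; the paper's exact scaling or orientation of the ratio may differ, so the values need not match Table 1.

```python
import statistics


def cv_conventional(sample):
    """Conventional cv: sample standard deviation (n - 1 divisor) over sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


def cv_robust(sample):
    """Assumed robust cv: unscaled sample MAD over sample median (an assumption)."""
    med = statistics.median(sample)
    mad = statistics.median([abs(x - med) for x in sample])
    return mad / med


# Toy data (not from the paper): the same sample with and without a gross outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
spoiled = [1.0, 2.0, 3.0, 4.0, 100.0]

print("clean  :", cv_conventional(clean), cv_robust(clean))
print("spoiled:", cv_conventional(spoiled), cv_robust(spoiled))
```

In this toy example the single outlier leaves the assumed robust cv unchanged, while the conventional cv changes substantially, which is the sense in which the median-based form is called robust.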

A total of 1000 cvs of each form are generated for each of the sample sizes n = 10, 20, 30, 50, 70, 100, 300, 500 from the Exp(λ) distribution, for λ = 0.05, 0.1, 0.2, 1, 2, 10, 20. Then, for each case, histograms, boxplots, line graphs and descriptive statistics of the samples of each form of cv are obtained. The statistical software R [17] is used for all simulations and for plotting all the figures and graphs.

RESULTS AND DISCUSSION

A total of 112 figures consisting of histograms and boxplots were produced in this project. For illustrative purposes, we present only the histograms and boxplots for samples generated from Exponential(1) for sample sizes n = 20, 30, 50, 100 and 500. Readers interested in the rest of the figures may contact any of the authors. Based on the figures produced for all 56 cases (7 values of λ by 8 sample sizes), it is observed that, for each λ, the sampling distribution of CVr is left-skewed for small n while that of CVc is right-skewed. Interestingly, the figures and statistics show that the sampling distribution of CVr approaches symmetricity (skewness coefficient decreasing to 0) faster and more closely than that of CVc as n grows large.

Fig. 1: Histogram and boxplot for CVc and CVr for Exp(λ=1) with n=20
Fig. 2: Histogram and boxplot for CVc and CVr for Exp(λ=1) with n=30
Fig. 3: Histogram and boxplot for CVc and CVr for Exp(λ=1) with n=50
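The simulation design described under METHODOLOGY (1000 cvs of each form for every combination of λ and n) can be sketched roughly as follows. The paper used R; Python's standard library is swapped in here, and `simulate_cvs`, `LAMBDAS` and `SAMPLE_SIZES` are hypothetical names. As an assumption, CVr is taken to be the unscaled sample MAD divided by the sample median, so the simulated CVr values need not agree with those reported in the paper.

```python
import random
import statistics


def cv_conventional(sample):
    """Sample standard deviation divided by sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


def cv_robust(sample):
    """Assumption: unscaled sample MAD divided by sample median."""
    med = statistics.median(sample)
    return statistics.median([abs(x - med) for x in sample]) / med


def simulate_cvs(lam, n, reps=1000, seed=0):
    """Draw `reps` samples of size n from Exp(rate=lam); return (CVc list, CVr list)."""
    rng = random.Random(seed)
    cvc, cvr = [], []
    for _ in range(reps):
        sample = [rng.expovariate(lam) for _ in range(n)]
        cvc.append(cv_conventional(sample))
        cvr.append(cv_robust(sample))
    return cvc, cvr


# Parameter grid taken from the paper: 7 x 8 = 56 cases.
LAMBDAS = (0.05, 0.1, 0.2, 1, 2, 10, 20)
SAMPLE_SIZES = (10, 20, 30, 50, 70, 100, 300, 500)

# One case as an illustration; looping over the grid gives all 56.
cvc, cvr = simulate_cvs(lam=1, n=100)
print(len(cvc), statistics.mean(cvc), statistics.mean(cvr))
```

Fixing the seed makes a run repeatable; histograms and boxplots of `cvc` and `cvr` would then correspond to one panel pair of the figures.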

Table 1: Mean (m), standard deviation (sd) and coefficient of variation (cv) for the sampling distributions of the conventional cv, CVc, and the robust cv, CVr

λ     stat  n=10            n=20            n=30            n=50            n=70            n=100           n=300           n=500
            CVc     CVr     CVc     CVr     CVc     CVr     CVc     CVr     CVc     CVr     CVc     CVr     CVc     CVr     CVc     CVr
0.05  m     0.9293  1.2900  0.9558  1.3707  0.9692  1.3891  0.9863  1.4220  0.9835  1.4267  0.9850  1.4228  0.9944  1.4360  0.9978  1.4388
      sd    0.2345  0.3677  0.1816  0.2753  0.1656  0.2342  0.1290  0.1890  0.1122  0.1588  0.0952  0.1370  0.0549  0.0774  0.0438  0.0602
      cv    0.2523  0.2850  0.1900  0.2009  0.1709  0.1686  0.1308  0.1329  0.1141  0.1113  0.0966  0.0963  0.0552  0.0539  0.0439  0.0418
0.1   m     0.9171  1.2617  0.9669  1.3734  0.9781  1.3789  0.9698  1.4087  0.9885  1.4232  0.9903  1.4240  0.9990  1.4387  0.9987  1.4404
      sd    0.2364  0.3683  0.1878  0.2819  0.1588  0.2352  0.1201  0.1909  0.1099  0.1535  0.0947  0.1332  0.0569  0.0794  0.0450  0.0618
      cv    0.2578  0.2919  0.1942  0.2053  0.1624  0.1706  0.1238  0.1355  0.1112  0.1079  0.0956  0.0935  0.0570  0.0552  0.0451  0.0429
0.2   m     0.9218  1.2642  0.9582  1.3561  0.9669  1.3875  0.9879  1.4135  0.9814  1.4240  0.9945  1.4263  0.9963  1.4377  0.9980  1.4395
      sd    0.2435  0.3566  0.1854  0.2913  0.1543  0.2373  0.1336  0.1838  0.1068  0.1588  0.0979  0.1362  0.0573  0.0772  0.0442  0.0614
      cv    0.2642  0.2821  0.1935  0.2148  0.1596  0.1710  0.1352  0.1300  0.1088  0.1115  0.0984  0.0955  0.0575  0.0537  0.0443  0.0427
1     m     0.9313  1.2773  0.9569  1.3370  0.9749  1.3978  0.9866  1.4156  0.9869  1.4214  0.9943  1.4302  0.9951  1.4393  0.9983  1.4403
      sd    0.2402  0.3595  0.1870  0.2770  0.1601  0.2452  0.1334  0.1866  0.1106  0.1645  0.0935  0.1334  0.0546  0.0785  0.0445  0.0609
      cv    0.2579  0.2815  0.1954  0.2072  0.1642  0.1754  0.1352  0.1318  0.1121  0.1157  0.0940  0.0933  0.0549  0.0545  0.0446  0.0423
2     m     0.9263  1.2870  0.9628  1.3701  0.9705  1.3848  0.9802  1.4096  0.9919  1.4248  0.9866  1.4296  0.9982  1.4410  0.9990  1.4405
      sd    0.2411  0.3708  0.1801  0.2758  0.1565  0.2446  0.1254  0.1795  0.1105  0.1580  0.0927  0.1325  0.0556  0.0770  0.0434  0.0604
      cv    0.2603  0.2881  0.1871  0.2013  0.1613  0.1766  0.1279  0.1273  0.1114  0.1109  0.0940  0.0927  0.0557  0.0534  0.0434  0.0419
10    m     0.9305  1.2870  0.9538  1.3514  0.9749  1.3871  0.9764  1.3982  0.9850  1.4223  0.9848  1.4283  0.9961  1.4372  0.9970  1.4399
      sd    0.2314  0.3596  0.1872  0.2810  0.1643  0.2419  0.1299  0.1911  0.1128  0.1578  0.0945  0.1335  0.0571  0.0804  0.0449  0.0603
      cv    0.2487  0.2794  0.1963  0.2079  0.1685  0.1744  0.1330  0.1367  0.1145  0.1109  0.0960  0.0935  0.0573  0.0559  0.0450  0.0419
20    m     0.9146  1.2649  0.9577  1.3447  0.9760  1.3933  0.9780  1.4161  0.9872  1.4199  0.9899  1.4278  0.9963  1.4402  0.9956  1.4387
      sd    0.2308  0.3708  0.1943  0.2795  0.1558  0.2323  0.1206  0.1832  0.1081  0.1628  0.0959  0.1350  0.0566  0.0772  0.0434  0.0632
      cv    0.2524  0.2932  0.2029  0.2079  0.1596  0.1667  0.1233  0.1294  0.1095  0.1147  0.0969  0.0946  0.0568  0.0536  0.0436  0.0439

Fig. 4: Histogram and boxplot for CVc and CVr for Exp(λ=1) with n=100
Fig. 5: Histogram and boxplot for CVc and CVr for Exp(λ=1) with n=500
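In Table 1 the cv row appears to be the sd row divided by the m row, that is, the conventional cv of the 1000 simulated cvs. The short check below, with hypothetical names `TABLE_LAMBDA_1`, `max_cv_discrepancy` and `more_dispersed_form`, copies a few λ = 1 cells from the table, confirms that relationship up to rounding, and reports which form is the more dispersed at a given n.

```python
# (n, form) -> (m, sd, cv), copied from the λ = 1 rows of Table 1.
TABLE_LAMBDA_1 = {
    (10, "CVc"): (0.9313, 0.2402, 0.2579),
    (10, "CVr"): (1.2773, 0.3595, 0.2815),
    (50, "CVc"): (0.9866, 0.1334, 0.1352),
    (50, "CVr"): (1.4156, 0.1866, 0.1318),
    (500, "CVc"): (0.9983, 0.0445, 0.0446),
    (500, "CVr"): (1.4403, 0.0609, 0.0423),
}


def max_cv_discrepancy(table):
    """Largest |sd / m - cv| over the cells; small values support cv = sd / m."""
    return max(abs(sd / m - cv) for m, sd, cv in table.values())


def more_dispersed_form(table, n):
    """Name of the form whose sampling distribution has the larger cv at size n."""
    return max(("CVc", "CVr"), key=lambda form: table[(n, form)][2])


print("max |sd/m - cv| =", max_cv_discrepancy(TABLE_LAMBDA_1))
for n in (10, 50, 500):
    print(n, more_dispersed_form(TABLE_LAMBDA_1, n))
```

For these cells CVr has the larger cv at n = 10 and the smaller cv at n = 50 and n = 500, which is the dispersion pattern the text describes.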

This movement towards symmetricity in the sampling distributions of CVc and CVr is best observed in the line graphs shown in Fig. 6 and 7. Table 1 in the Appendix shows that, for each value of λ, although the sampling distribution of CVr is more dispersed than that of CVc for smaller n, it improves (judged by the conventional cv of each set of generated cvs) as n increases, reaching similar or smaller dispersion than the sampling distribution of CVc. Similar or smaller dispersion of the sampling distribution of CVr compared to CVc is achieved faster for smaller values of λ. The kurtosis coefficients of the sampling distributions are also plotted for each form of cv and are shown in Fig. 8 and 9. At all λ values and for all n, the sampling distribution of CVr is less peaked than that of CVc, with the sampling distribution of CVr attaining kurtosis coefficients closer to 3.0 (that of the normal distribution).

Fig. 6: Skewness coefficient of CVc
Fig. 7: Skewness coefficient of CVr (Sample size: 1: n=10, 2: n=20, 3: n=30, 4: n=50, 5: n=70, 6: n=100, 7: n=300, 8: n=500)
Fig. 8: Kurtosis coefficients of the sampling distribution of CVc

SUMMARY OF FINDINGS

The simulation results, in the form of histograms, boxplots, a table of descriptive statistics and graphs of skewness and kurtosis coefficients for the conventional cv, CVc, and the robust form, CVr, cover each combination of λ = 0.05, 0.1, 0.2, 1, 2, 10, 20 and n = 10, 20, 30, 50, 70, 100, 300, 500. They show that the sampling distribution of CVc is right-skewed for small samples and tends towards symmetricity as the sample size grows large, while the sampling distribution of CVr is left-skewed for small samples and comes very close to symmetricity, in fact closer than CVc, for larger sample sizes.

Also, as n increases from 10 to 500, the sampling distribution of CVr maintains a smaller kurtosis than that of CVc, approaching 3.0 faster and more closely. This is so for all values of λ. With respect to dispersion, for all values of λ the sampling distribution of CVr is more dispersed than that of CVc for smaller values of n but has about the same or smaller dispersion as n increases. The dispersions of the sampling distributions of CVc and CVr become comparable faster for smaller λ values.

Large-sample symmetricity and normality, shown to prevail in the sampling distribution of CVr for samples from the exponential distribution, will be valuable: they allow the standardized form of CVr to be treated as approximately standard normal for large n, a property which will enable tests of hypotheses and calculation of probabilities of events involving CVr using the existing standard normal table.
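The skewness and kurtosis coefficients discussed above, and the standardized form of a set of simulated cvs, can be sketched as below. This assumes the plain moment-based coefficients (divisor n, kurtosis not reduced by 3, so that a normal distribution gives 3.0); the estimator variant the authors used in R is not stated, and all function names are hypothetical.

```python
import statistics


def central_moment(values, k):
    """k-th central moment with divisor n."""
    mean = statistics.mean(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    """Moment coefficient of skewness: 0 for a symmetric set of values."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment coefficient of kurtosis (non-excess): 3.0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def standardize(values):
    """Centre by the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy inputs (not from the paper): one symmetric, one right-skewed.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 10.0]
print(skewness(symmetric), kurtosis(symmetric), skewness(right_skewed))
```

Applied to the 1000 cvs of one (λ, n) case, `skewness` and `kurtosis` would give one point on each of the line graphs, and `standardize` gives the values one would compare against the standard normal distribution.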

Fig. 9: Kurtosis coefficients of the sampling distribution of CVr

FURTHER RESEARCH ON CVr

This research work on CVr may be extended in many directions, among which we are considering the following activities:

1. Compare the behavior of an unbiased form of the conventional cv, UCVc [1], to that of CVr.
2. Use both graphical and formal quantitative tests for goodness-of-fit, and apply distribution-fitting software such as SUREFIT and EXPERTFIT, on the sampling distribution of the standardized random variable for purposes of testing hypotheses and calculating approximate probabilities on CVr.
3. Carry out a simulation study on the robustness of CVr for exponential samples by investigating the effects of outliers.
4. Carry out a similar study on CVr for samples from other statistical distributions: continuous distributions such as the Weibull, lognormal, logistic and inverse Gaussian, as well as discrete distributions such as the Poisson and binomial.

REFERENCES

1. Hurlimann, W., 1994. A uniform approximation of the sampling distribution of the coefficient of variation. Stat. & Probab. Letters, 24: 263-268.
2. Cox, J.C. and V. Sadiraj, 2011. On the coefficient of variation as a measure of risk sensitivity. http://aysps.gsu.edu/working-papers.html
3. Johnson, N.L. and S. Kotz, 1970. Continuous Univariate Distributions-1. John Wiley and Sons, Inc., New York, USA.
4. Reed, G.F., F. Lynn and B.D. Meade, 2003. Use of coefficient of variation in assessing variability in quantitative assays. Clin. Diagn. Immunol., 6(3): 1162-1170.
5. Crawley, M.J., 2007. The R Book. John Wiley & Sons Ltd., West Sussex, England.
6. Reed, G.F., F. Lynn and B.D. Meade, 2003. Use of coefficient of variation in assessing variability in quantitative assays. Clin. Diagn. Immunol., 6(3): 1162-1170.
7. Mahmoudvand, R., H. Hassani and R. Wilson, 2007. Is the sample coefficient of variation a good estimator for the population coefficient of variation. http://mpra.ub.uni-muenchen.de/6106/.
8. Curto, J.D. and J.C. Pinto, 2009. The coefficient of variation asymptotic distribution in the case of i.i.d random variables. Journal of Applied Stat., 36: 21-32.
9. Breunig, R., 1996. An almost unbiased estimator of the coefficient of variation. Elsevier Economics Letter, pp: 15-19.
10. Dutter, R., P. Filzmoser and R.G. Garrett, 2008. Statistical Data Analysis Explained. John Wiley & Sons Ltd., West Sussex, England.
11. Pang, W.-K., P.-K. Leung, W.-K. Huang and W. Liu, 2005. On interval estimation of the coefficient of variation for three-parameter Weibull, lognormal and gamma distributions: A simulation-based approach. European Journal of Operations Research, pp: 367-377.
12. He, X. and S.O. Oyadiji, 2002. Application of coefficient of variation in reliability-based mechanical design and manufacture. Journal of Materials Processing Technology, 119 (1-3): 374-378.
13. Martin, J.D. and L.N. Gray, 1971. Measurement of relative variation: Sociological examples. American Sociological Review, 36 (3): 496-502.
14. Hendricks, W.A. and K.W. Robey, 1936. The sampling distribution of the coefficient of variation. The Annals of Math. and Stat., 7 (3): 129-132.
15. Olive, D.J., 2007. Applied Robust Statistics. Southern Illinois University, Carbondale, USA.
16. Herve, A., 2010. Coefficient of Variation. In Neil Salkind (Ed.), Encyclopedia of Research Design, Thousand Oaks, California, USA, pp: 1-5.
17. Murari, S., 1993. Behavior of sample coefficient of variation drawn from several populations. Indian Journal of Stat., 55 (B): 65-76.