SKEWNESS AND KURTOSIS

Marilynn Katrina Walton
6 years ago

UNIT 3 SKEWNESS AND KURTOSIS

Structure
3.1 Introduction
    Objectives
3.2 Moments and Quantiles
    Moments of a Frequency Distribution
    Quantiles of a Frequency Distribution
3.3 Skewness
3.4 Kurtosis
3.5 Summary
3.6 Solutions and Answers

3.1 INTRODUCTION

In Unit 2, we talked about the central tendency and dispersion of frequency distributions. We have also seen how to compute some measures of central tendency and dispersion. Now, in this unit, we shall discuss two additional features of frequency distributions: skewness and kurtosis. A measure of skewness tells us how far the frequency curve of the given frequency distribution deviates from a symmetric one. A measure of kurtosis, on the other hand, gives us some information about the degree of flatness (or peakedness) of the frequency curve. So, these two features, along with the two discussed in the previous unit, should give us a good idea about the given frequency distribution.

The measures of skewness and kurtosis that we are going to discuss here make use of moments and quantiles. So, we shall first introduce these in Section 3.2. While studying this unit, you will need to look back at the tables of data given in Unit 1. You will also need your calculator. Check all the calculations in the solved examples, so that you don't have any difficulty in solving the exercises later.

With this unit, we end our discussion of the descriptive measures of univariate data. In the next unit, Unit 4, we'll talk about bivariate data, i.e., data concerning two variables.

Objectives

After reading this unit, you should be able to: calculate the moments and the quantiles of a given frequency distribution; compute some measures of skewness and kurtosis; discuss the relative advantages and disadvantages of these measures.

3.2 MOMENTS AND QUANTILES

As we have mentioned in the Introduction, using moments and quantiles, we can define some measures of skewness and kurtosis.
So we can say that moments and quantiles give us some information about the nature of a given frequency distribution. You will see that the mean of a frequency distribution can also be considered as its moment, whereas the median can be considered as its quantile. Now let's study these one by one.

Moments of a Frequency Distribution

If you have studied a little physics, you would have come across the word "moment". Moment, in physics, measures the tendency of a force to produce rotation. If there are n forces, $f_1, f_2, \ldots, f_n$, acting at distances $x_1, x_2, \ldots, x_n$ from the origin, then the moment of the total force is

$$\frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}. \qquad (1)$$

Now isn't this a familiar expression? If in (1), we take $f_i$ to be the frequency of $x_i$, $i = 1, 2, \ldots, n$, then (1) also gives us the mean of the distribution of x-values. It is because of this similarity that the term "moment" has found its way into Statistics.

Now let's try to understand this term in the context of Statistics. You know that the mean and the variance of data on a variable x are given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - 0) \quad\text{and}\quad s^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2,$$

respectively, where $n = \sum_{i=1}^{k} f_i$. Now, taking a cue from this, if A is any number, we define the rth moment of x about A to be the mean of the rth power of the deviations of x from A. We denote this by $m_r'(A)$, or simply $m_r'$, if there is no confusion about the origin chosen. Thus,

$$m_r'(A) = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r.$$

For raw data, we can write

$$m_r'(A) = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r.$$

If we take $A = \bar{x}$, then we get what are called the central moments. The rth central moment is denoted by $m_r$. Thus

$$m_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^r.$$

For raw data, $m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$.

We are sure you would agree with the following:

$$m_0' = m_0 = 1 \qquad (4)$$
$$m_1'(0) = \bar{x} \qquad (5)$$
$$m_1 = 0 \qquad (6)$$
$$m_2 = s^2 \qquad (7)$$

(5) and (7) say that the mean of a variable is its first moment about zero, while the variance is its second central moment. Recall that we have already established (6) earlier.

Now let us try to establish a connection between central moments and moments about any value A. For simplicity we consider raw data only. For every i, we have

$$x_i - \bar{x} = (x_i - A) - (\bar{x} - A) = (x_i - A) - m_1'.$$

Therefore, using the binomial theorem, we get

$$(x_i - \bar{x})^r = (x_i - A)^r - \binom{r}{1}(x_i - A)^{r-1} m_1' + \binom{r}{2}(x_i - A)^{r-2}(m_1')^2 - \cdots + (-1)^r (m_1')^r.$$

If we sum both sides over all i, i = 1, 2, ..., n, and divide the result by n, we get

$$m_r = m_r' - \binom{r}{1} m_{r-1}' m_1' + \binom{r}{2} m_{r-2}' (m_1')^2 - \cdots + (-1)^r (m_1')^r. \qquad (8)$$

Now let us take r = 1. In this case, (8) becomes

$$m_1 = m_1' - \binom{1}{1} m_1' = 0.$$

We have already stated this in (6). Now, if we put r = 2 in (8), we get

$$m_2 = m_2' - \binom{2}{1}(m_1')^2 + \binom{2}{2}(m_1')^2.$$

This gives us

$$m_2 = m_2' - (m_1')^2. \qquad (9)$$

Putting r = 3 in (8) gives us

$$m_3 = m_3' - 3 m_2' m_1' + 2 (m_1')^3. \qquad (10)$$

Check that if you put r = 4 in (8), you get

$$m_4 = m_4' - 4 m_3' m_1' + 6 m_2' (m_1')^2 - 3 (m_1')^4. \qquad (11)$$

Further, we have

$$\bar{x} = A + m_1'. \qquad (12)$$

We can use these relationships between the central moments and the moments about any A for simplifying the calculations involved in the computation of central moments. Here is an example to show how this is done.

Example 1: Let's evaluate the mean and central moments of the milk yield data (in litres) of a dairy farm first used in an example in Unit 2. Let us take A = 200 litres and first obtain the moments about A. The values of u = x − A and of the squares, cubes and fourth powers of u are shown in the table below. The last column is taken to serve as a check on the calculations, as you will see later. We choose A to be a value near the centre of the range of values of x.

[Table: columns $u_i$, $u_i^2$, $u_i^3$, $u_i^4$ and $(u_i + 1)^4$ for each observation, with the column totals in the last row.]

The last row of the table shows that n = 10, along with the totals of the other columns.

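Now to check these calculations, we make use of the last column. Its total gives $\sum (u_i + 1)^4$ directly. But $\sum (u_i + 1)^4$ also equals

$$\sum u_i^4 + 4\sum u_i^3 + 6\sum u_i^2 + 4\sum u_i + n.$$

Since we get the same value for $\sum (u_i + 1)^4$ by these two different methods, we can be sure that the computations are correct. Now

$$m_1' = \frac{10.7}{10} = 1.07,$$

and $m_2'$, $m_3'$ and $m_4'$ are obtained in the same way, by dividing the corresponding column totals by n. Hence

$\bar{x} = 200 + m_1' = 201.07$ litres (from (12)),
$m_2 = m_2' - (m_1')^2$ (from (9)), in (litre)$^2$,
$m_3 = m_3' - 3 m_2' m_1' + 2 (m_1')^3$ (from (10)), in (litre)$^3$, and
$m_4 = m_4' - 4 m_3' m_1' + 6 m_2' (m_1')^2 - 3 (m_1')^4$ (from (11)), in (litre)$^4$.

Try to do this exercise now. The result in this exercise also helps to simplify the computation of central moments.

E1) Suppose the data are subjected to a change of both origin and scale, i.e., let $u_i = (x_i - A)/c$, where $c > 0$. If $v_r' = \frac{1}{n}\sum_i f_i u_i^r$, show that
$\bar{x} = A + c\,v_1'$,
$m_2 = c^2 [v_2' - (v_1')^2]$,
$m_3 = c^3 [v_3' - 3 v_2' v_1' + 2 (v_1')^3]$,
$m_4 = c^4 [v_4' - 4 v_3' v_1' + 6 v_2' (v_1')^2 - 3 (v_1')^4]$.

Notice how we have used this result in simplifying the computation of the mean and central moments in our next example.

Example 2: For the frequency table of petiole length of leaves of a pipal tree, let us take u = (x − 4.95)/0.8. The table below shows the steps to be followed in obtaining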

$v_1'$, $v_2'$, $v_3'$ and $v_4'$. Here, again, we take the last column to provide a check on the computations.

We have $\sum_{i=1}^{k} f_i (u_i + 1)^4 = 2320$, both from the last column and from the other column totals. Hence, we are sure that the column totals are free of errors. Dividing the column totals by n, we now have $v_1'$, $v_2'$, $v_3'$ and $v_4'$. The mean and central moments of petiole length are

$\bar{x} = 4.95 + 0.8\,v_1'$ cm,
$m_2 = (0.8)^2 [v_2' - (v_1')^2]$ (cm)$^2$,
$m_3 = (0.8)^3 [v_3' - 3 v_2' v_1' + 2 (v_1')^3]$ (cm)$^3$,
$m_4 = (0.8)^4 [v_4' - 4 v_3' v_1' + 6 v_2' (v_1')^2 - 3 (v_1')^4]$ (cm)$^4$.

We may also encounter an exactly opposite situation, that is, given the mean and central moments of a variable x, we may like to express the moments about some other origin, say A, in terms of these quantities. So, we would like to have some formulas which express each $m_r'$ in terms of the central moments. We believe you are now in a position to derive the required formulas.

E2) Prove that
$$m_r' = m_r + \binom{r}{1} m_{r-1} d + \binom{r}{2} m_{r-2} d^2 + \cdots + \binom{r}{r-2} m_2 d^{r-2} + d^r,$$
where $d = \bar{x} - A$.
Hint: Use the fact that $x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$.
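The relations between moments about A and central moments can be checked numerically. The following is a minimal Python sketch, not part of the unit itself: the observations are hypothetical (they are not the milk yield data of Example 1), and the function names `moment_about` and `central_from_raw` are illustrative choices. It applies the general relation $m_r = \sum_{k=0}^{r} (-1)^k \binom{r}{k} m_{r-k}' (m_1')^k$ and compares the result with central moments computed directly.

```python
from math import comb, isclose


def moment_about(xs, a, r):
    """r-th moment of raw data xs about a: the mean of (x - a)**r."""
    return sum((x - a) ** r for x in xs) / len(xs)


def central_from_raw(m_raw):
    """Convert moments about A into central moments.

    m_raw[k] is the k-th moment about A (so m_raw[0] == 1).
    Uses m_r = sum_k (-1)^k C(r, k) m'_{r-k} (m'_1)^k.
    """
    m1 = m_raw[1]
    return [
        sum((-1) ** k * comb(r, k) * m_raw[r - k] * m1 ** k for k in range(r + 1))
        for r in range(len(m_raw))
    ]


# Hypothetical observations, chosen only for illustration.
xs = [196.0, 198.5, 199.0, 200.0, 201.5, 202.0, 203.0, 204.5, 206.0, 207.5]
A = 200.0  # working origin near the centre of the data

raw = [moment_about(xs, A, r) for r in range(5)]      # m'_0 .. m'_4
central = central_from_raw(raw)                       # m_0 .. m_4
mean = A + raw[1]                                     # mean = A + m'_1
direct = [moment_about(xs, mean, r) for r in range(5)]

# Change of origin and scale: with u = (x - A) / c, m_r(x) = c**r * m_r(u).
c = 0.5
us = [(x - A) / c for x in xs]
u_mean = sum(us) / len(us)
scaled = [c ** r * moment_about(us, u_mean, r) for r in range(5)]

for r in range(5):
    print(r, round(central[r], 6), round(direct[r], 6), round(scaled[r], 6))
```

The three printed columns should agree up to rounding, which is the same kind of check the last column of the table provides in the worked examples.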

E3) In Unit 2 (see, for instance, Section 2.4.4), you have seen formulas for the composite mean and composite variance in terms of the group means and group variances, when several groups of data on a variable are taken together. Obtain similar formulas for the composite third and fourth central moments, using E2.

Let us now turn our attention to quantiles.

Quantiles of a Frequency Distribution

Remember how we define the median of a given set of data? It is a value $\tilde{x}$ such that at most half of the observations are below $\tilde{x}$ and at most half are above $\tilde{x}$. Now, instead of half, if we take a proportion p, 0 < p < 1, then we get a p-quantile. Thus, by the p-quantile (or p-fractile, or quantile of order p) of a variable x, we mean a value, say $z_p$, of the variable such that at most a proportion p of the observations is below $z_p$ and at most a proportion (1 − p) is above $z_p$. So, the median is the quantile of order 1/2. With p = 1/4, 1/2 and 3/4 we have the three quartiles $z_{1/4}$, $z_{1/2}$ and $z_{3/4}$ (which are also denoted by $q_1$, $q_2$ and $q_3$). Taking p = 0.1, 0.2, ..., 0.9, we get the nine deciles and with p = 0.01, 0.02, ..., 0.99, we get the ninety-nine percentiles. Like the median, the p-quantile for a set of observations may not be unique.

Now let's see how to compute this p-quantile. When a frequency table for a continuous variable is given, we first decide which of the class intervals contains $z_p$. If $x_l$ and $x_u$ are its lower and upper boundaries and $F_l$ and $F_u$ the corresponding cumulative frequencies, then the p-quantile $z_p$ may be determined approximately by using the formula

$$\frac{z_p - x_l}{x_u - x_l} = \frac{np - F_l}{F_u - F_l}, \quad\text{i.e.,}\quad z_p = x_l + \frac{np - F_l}{f_0}\,c, \qquad (13)$$

where $c = x_u - x_l$ is the width of the interval and $f_0 = F_u - F_l$ is its frequency. In (13), if we put p = 1/2, we get the formula for the median.

Now let's use Formula (13) to compute the quartiles in an example.

Example 3: For the frequency distribution of petiole length of 198 leaves of a pipal tree, the median (i.e., the second quartile) was evaluated in Example 5 of Unit 2. We'll now find the first and third quartiles.

Here n/4 = 49.5 and 3n/4 = 148.5. On going through the cumulative frequency table (Table 8, Unit 1), we find the class interval in which $q_1$ lies and the one in which $q_3$ lies. So, after putting the appropriate values from Tables 7 and 8 of Unit 1 in (13), we get $q_1$ = … cm and $q_3$ = 6.58 cm.

See if you can solve this exercise now.

E4) Find the three quartiles for the age distribution of the Indian population according to the 1981 census (given in Unit 2). Check that $q_2 - q_1 < q_3 - q_2$.

If we are given the first few moments or a small set of quantiles of a frequency distribution, we can get a fairly good idea about the distribution. In fact, for most purposes, it will be enough to state the values of $\bar{x}$, $m_2$, $m_3$ and $m_4$, or those of the three quartiles (or of the nine deciles).
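The interpolation rule for a p-quantile of grouped data can be sketched in Python as follows. This is only an illustration under stated assumptions: the frequency table is hypothetical (it is not Table 7 or 8 of Unit 1), the function name `grouped_quantile` is an illustrative choice, and the class containing the quantile is taken to be the first one whose cumulative frequency reaches np.

```python
def grouped_quantile(boundaries, freqs, p):
    """Approximate p-quantile of a grouped frequency distribution.

    boundaries: increasing class boundaries, one more than the number of classes
    freqs:      class frequencies
    Uses z_p = x_l + (n*p - F_l) / f * c within the class containing z_p,
    where F_l is the cumulative frequency below the class, f its frequency
    and c its width.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    n = sum(freqs)
    target = n * p
    cum = 0  # cumulative frequency below the current class (F_l)
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("frequency table is empty")


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40 with n = 50.
boundaries = [0, 10, 20, 30, 40]
freqs = [5, 15, 20, 10]

q1 = grouped_quantile(boundaries, freqs, 0.25)  # 15.0
q2 = grouped_quantile(boundaries, freqs, 0.50)  # 22.5
q3 = grouped_quantile(boundaries, freqs, 0.75)  # 28.75
print(q1, q2, q3)
```

With p = 0.5 the function reduces to the usual interpolation formula for the median.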

We'll come back to this later. Now we introduce a very useful concept, that of weighted mean.

Weighted Mean

Consider this situation. Students are admitted to a B.Sc. course in Statistics on the basis of their performance in the Higher Secondary, or an equivalent examination. Then don't you think that their scores on the mathematics papers should be considered more important than those on the physics papers? Similarly, shouldn't the scores on language papers be considered least important? It is necessary in such a situation to take into account the relative importance (or weight) of the different observations while evaluating the mean.

Suppose $w_i \geq 0$ is the weight attached to the value $x_i$ (i.e., to the value of x for the ith individual). Then the appropriate mean would be

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}.$$

This measure is called a weighted mean of x. This concept is particularly useful in economic studies, in the construction of a price index number. This will become clear from the next example.

Example 4: The price increases from 1985 to 1989 for five food items have been (in percentage terms) as follows: … If the figures given below indicate the relative importance of these items in a typical citizen's diet, …, then the average price increase for these items should be taken to be the weighted mean of the five increases, with these figures as weights.

Observe that the formula that we used to compute the mean of x from grouped data is nothing but the weighted mean of $x_1, \ldots, x_k$, the respective frequencies now serving as the weights.

Try this exercise now:

E5) A student gets 85, 76 and 82 marks in the three tests for the course MTE. She gets 79 marks in the final examination. What are her average marks if the weightage given to the tests and the final examination are 10, 10, 10 and 70, respectively?

In the next two sections, we'll see how moments and quantiles lead to measures of skewness and kurtosis of a frequency distribution.

3.3 SKEWNESS

In Unit 1, you saw that frequency distributions may be classified as symmetrical and skewed (or asymmetrical).
Skewed distributions can again be classified as positively skewed or negatively skewed, according as the longer tail of the distribution is towards the higher or the lower values of the variable (see Fig. 1).

Fig. 1: (i) Symmetrical, (ii) positively skewed and (iii) negatively skewed distributions.

Now, the degree of skewness is the extent to which the given distribution departs from symmetry. A good measure of the degree of skewness has to fulfil the following criteria:
i) It should be a pure number, i.e., should be free of the units in which the variable is measured.
ii) It should be zero, positive and negative for a symmetrical distribution, a positively skew distribution and a negatively skew distribution, respectively.
iii) It should vary between two definite limits, say, −k and +k, as the nature of a distribution changes from extreme negative asymmetry to extreme positive asymmetry.

Here are some commonly used measures (assuming s > 0):

$$Sk_1 = \frac{\bar{x} - \hat{x}}{s}, \quad Sk_2 = \frac{3(\bar{x} - \tilde{x})}{s}, \quad Sk_3 = \frac{(q_3 - q_2) - (q_2 - q_1)}{q_3 - q_1} \quad\text{and}\quad Sk_4 = \frac{m_3}{s^3}.$$

As usual, $\bar{x}$ denotes the mean, $\tilde{x}$ the median, $\hat{x}$ the mode, $q_i$ the ith quartile, $m_3$ the third moment about $\bar{x}$ and s the standard deviation.

$Sk_1$ may not be defined, since the mode may not be defined. To get over this difficulty, we use the empirical relation $\bar{x} - \hat{x} \approx 3(\bar{x} - \tilde{x})$ to get the measure $Sk_2$. $Sk_2$ and $Sk_3$, too, may not be unique, since the median and the first and third quartiles may not be unique. The square of $Sk_4$, $Sk_4^2 = m_3^2 / m_2^3$, is called Pearson's coefficient and is denoted by $b_1$.

All the four measures above are free of units. Secondly, for a symmetrical (unimodal) distribution,

mean = median = mode and $q_3 - q_2 = q_2 - q_1$.

Further, for most of the distributions which we see in practice, we have the following two observations:
1) For a positively skew distribution, mean > median > mode, and $q_3 - q_2 > q_2 - q_1$.
2) For a negatively skew distribution, mean < median < mode, and $q_3 - q_2 < q_2 - q_1$.
See Fig. 2(a) and (b).

Fig. 2: (a) and (b) (horizontal axis: variable value).

For a symmetrical distribution, $m_3$ (or for that matter, any odd-order central moment) is zero. $m_3 > 0$ for a positively skew distribution, and $m_3 < 0$ for a negatively skew distribution. Hence, all the four measures meet the second criterion. As to the third criterion, we have the following results:
i) For any distribution with s > 0, $-3 \leq Sk_2 \leq +3$.
ii) For any distribution, $-1 \leq Sk_3 \leq +1$.
Thus, $Sk_2$ and $Sk_3$ meet the third criterion. Since we have the empirical relation

mean − mode = 3(mean − median),

we can say that $Sk_1$, too, roughly meets this criterion. However, $Sk_4$ may take any value between $-\infty$ and $+\infty$ and hence is inferior to the other measures in this respect.

Let's now calculate these four measures for the data on petiole length.

Example 5: For the frequency distribution of petiole length, we have $\bar{x}$ = 5.28 cm, $\tilde{x}$ (= $q_2$) = 5.359 cm, $\hat{x}$ = … cm. Also, for this distribution, the first and third quartiles are $q_1$ = … cm, $q_3$ = … cm, while the standard deviation s and the third central moment $m_3$ are as obtained earlier. Substituting these values in the four formulas gives $Sk_1$, $Sk_2$, $Sk_3$ and $Sk_4 = m_3/s^3$.

All these values indicate that the distribution is only slightly asymmetric, and that it is a case of negative asymmetry. This is also apparent from the histogram of the distribution (see Fig. 4 in Unit 1).

Now here is an exercise for you.

E6) Show that, for a distribution symmetrical about a, the mean as well as the median is a and the odd-order central moments are all equal to zero.
Hint: You may take the values of x to be, say, $a \pm h_1, a \pm h_2, \ldots, a \pm h_k$ ($h_i \geq 0$ for each i), with frequency $f_i$ for $a - h_i$ and also for $a + h_i$.
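The measures $Sk_2$, $Sk_3$ and $Sk_4$ can be tried out on raw data with a short Python sketch. Several points here are assumptions rather than part of the unit: the data sets are hypothetical, the quartiles are taken from Python's `statistics.quantiles` with the "inclusive" method (the unit notes that quartiles need not be unique, so other conventions give slightly different values), and the mode-based $Sk_1$ is omitted because a mode may not be defined for raw data.

```python
import statistics as st


def skewness_measures(xs):
    """Return Sk2, Sk3 and Sk4 for raw data xs (requires s > 0 and q3 > q1)."""
    n = len(xs)
    mean = st.fmean(xs)
    s = st.pstdev(xs)  # standard deviation with divisor n, as in the unit
    q1, q2, q3 = st.quantiles(xs, n=4, method="inclusive")
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return {
        "Sk2": 3 * (mean - q2) / s,
        "Sk3": ((q3 - q2) - (q2 - q1)) / (q3 - q1),
        "Sk4": m3 / s ** 3,
    }


# Hypothetical data sets: symmetric, long right tail, long left tail.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 4, 10]
left_tail = [-x for x in right_tail]

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(name, skewness_measures(data))
```

For the symmetric set all three measures are zero; for the other two sets the signs of the measures match the direction of the longer tail.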

Now that we have seen how to measure the skewness of a frequency distribution, let us talk about its kurtosis.

3.4 KURTOSIS

We now focus attention on another feature of a frequency distribution that determines the shape of the distribution. It is the degree of steepness or pointedness of the distribution or, to use a Greek word, the kurtosis of the distribution. Some distributions are flat-topped; some are highly peaked; most distributions will be in between these two extreme types, not too peaked and not too flat-topped either. In Fig. 3, we have the frequency curves of a distribution that is highly peaked, one that is of moderate kurtosis and a third one which is rather flat-topped.

Fig. 3: Three symmetrical distributions with same mean and s.d. but of varying kurtosis.

It has been observed that for two distributions having the same dispersion and the same degree of skewness, the one with higher kurtosis usually has higher fourth powers of deviations from the mean and hence a higher value of $m_4$. This observation is used to define a measure of kurtosis (under the assumption that s > 0) as follows:

$$b_2 = \frac{m_4}{s^4} = \frac{m_4}{m_2^2}.$$

(Meso means moderate, lepto means thin and platy means flat.)

The division by $s^4$ makes the measure free of units. It also ensures that the measure takes into account that part of the peakedness of the distribution which is independent of (or is in addition to) the part that is due to the variance. For a normal distribution (about which you will learn in Block 3), $b_2 = 3$. This value is taken as a standard against which the kurtosis of other distributions is judged. Any distribution with $b_2 = 3$ is called mesokurtic (i.e., of moderate kurtosis); one with $b_2 > 3$ is said to be leptokurtic, while one with $b_2 < 3$ is said to be platykurtic. Thus, in Fig. 3, (i) is leptokurtic, (ii) is mesokurtic, while (iii) is platykurtic.

For any univariate distribution with s > 0, we have $b_2 \geq 1$. Let's prove this.

Proof: Let $u_i = (x_i - \bar{x})/s$ for each i. Then $\sum u_i = \frac{1}{s}\sum (x_i - \bar{x}) = 0$

and $\sum u_i^2 = \frac{1}{s^2}\sum (x_i - \bar{x})^2 = n$.

Also, we must have $\sum (u_i^2 - 1)^2 \geq 0$, since the L.H.S. is a sum of squares. This means

$$\sum u_i^4 - 2\sum u_i^2 + n \geq 0, \quad\text{or}\quad \frac{n\,m_4}{s^4} - 2n + n \geq 0, \quad\text{or}\quad n(b_2 - 1) \geq 0.$$

Since n > 0, this implies that $b_2 - 1 \geq 0$, so that the result is established.

Note that we have $b_2 = 1$ iff $n(b_2 - 1) = 0$. So we can also say that $b_2 = 1$ iff $\sum (u_i^2 - 1)^2 = 0$, i.e., iff $u_i^2 = 1$ for each i, i.e., iff $x_i = \bar{x} \pm s$ for each i. Thus, the coefficient of kurtosis $b_2 = 1$ iff the variable x can assume just two values with equal frequencies (so that the mean may be exactly midway between the two values).

Now let's calculate the coefficient of kurtosis for our favourite distribution.

Example 6: For the frequency distribution of petiole length, we take $m_2$ (in (cm)$^2$) and $m_4$ (in (cm)$^4$) from Example 2. The kurtosis coefficient for this distribution is then given by $b_2 = m_4 / m_2^2$, which comes out slightly greater than 3. So, we find that this distribution is slightly leptokurtic.

Try to do this exercise now.

E7) Comment on the skewness and kurtosis of the age distribution of the Indian population (1981 census) given in Unit 2, by evaluating appropriate coefficients and also by considering the histogram of the distribution.

Let us now summarise the points covered in this unit.

3.5 SUMMARY

In this unit, you have seen
1) what is meant by moments and quantiles of different orders; the rth moment about A is
$m_r' = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r$ for data in the form of a frequency distribution, and
$m_r' = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r$ for raw data.

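2) how central moments, i.e., moments about $\bar{x}$, are related to moments about any A:
$$m_r = m_r' - \binom{r}{1} m_{r-1}' m_1' + \binom{r}{2} m_{r-2}' (m_1')^2 - \cdots + (-1)^r (m_1')^r,$$
3) when weighting of the observations would be appropriate for the computation of the mean,
4) what is meant by skewness and kurtosis: skewness is the departure from symmetry and kurtosis gives the degree of flatness of a frequency distribution,
5) how to compute measures of skewness and kurtosis of a frequency distribution:

Measures of skewness:
$$Sk_1 = \frac{\bar{x} - \hat{x}}{s}, \quad Sk_2 = \frac{3(\bar{x} - \tilde{x})}{s}, \quad Sk_3 = \frac{q_3 - 2q_2 + q_1}{q_3 - q_1}, \quad Sk_4 = \frac{m_3}{s^3}, \quad\text{provided } s \neq 0.$$

Measure of kurtosis:
$$b_2 = \frac{m_4}{s^4} \quad (s \neq 0).$$

3.6 SOLUTIONS AND ANSWERS

E1) Since $x_i = A + c\,u_i$, we have $\bar{x} = A + c\,v_1'$, so that $x_i - \bar{x} = c(u_i - v_1')$. Hence
$$m_2 = \frac{c^2}{n}\sum_i f_i (u_i - v_1')^2 = c^2 [v_2' - (v_1')^2].$$
Similarly, solve for $m_3$ and $m_4$.

E2) $x_i - A = (x_i - \bar{x}) + (\bar{x} - A) = (x_i - \bar{x}) + d$, say.
$$\therefore (x_i - A)^r = (x_i - \bar{x})^r + \binom{r}{1}(x_i - \bar{x})^{r-1} d + \binom{r}{2}(x_i - \bar{x})^{r-2} d^2 + \cdots + d^r.$$
On summing over i and dividing the result by n, and noting that $m_1 = 0$, we get
$$m_r' = m_r + \binom{r}{1} m_{r-1} d + \binom{r}{2} m_{r-2} d^2 + \cdots + \binom{r}{r-2} m_2 d^{r-2} + d^r,$$
where $d = \bar{x} - A$.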

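E3) Suppose there are k sets of data, the jth set having $n_j$ observations, mean $\bar{x}_j$ and central moments $m_{2j}$, $m_{3j}$, $m_{4j}$, j = 1, 2, ..., k. Let $n = \sum_j n_j$, let $\bar{x}$ be the composite mean and let $d_j = \bar{x}_j - \bar{x}$. Applying E2 to each set with $A = \bar{x}$, and using $\sum_i (x_{ij} - \bar{x}_j) = 0$ within each set, we get
$$m_3 = \frac{1}{n}\sum_{j=1}^{k} n_j \left[ m_{3j} + 3 m_{2j} d_j + d_j^3 \right].$$
Similarly,
$$m_4 = \frac{1}{n}\sum_{j=1}^{k} n_j \left[ m_{4j} + 4 m_{3j} d_j + 6 m_{2j} d_j^2 + d_j^4 \right].$$

E4) $q_1$ = 9.4 years, $q_2$ = … years, $q_3$ = … years.

E5) Weighted mean $= \dfrac{10 \times 85 + 10 \times 76 + 10 \times 82 + 70 \times 79}{100} = 79.6$.

E6) The first moment about a is
$$m_1' = \frac{1}{n}\left[ \sum_{i=1}^{k} \{(a + h_i) - a\} f_i + \sum_{i=1}^{k} \{(a - h_i) - a\} f_i \right] = \frac{1}{n}\left[ \sum_{i=1}^{k} (h_i - h_i) f_i \right] = 0.$$
Hence $\bar{x} = a + m_1' = a$. In the same way, for every odd r the terms $h_i^r$ and $(-h_i)^r$ cancel, so every odd-order central moment is zero.
If a is not a possible value of x, then the total frequency of values less than a equals the total frequency of values exceeding a. Hence a is the median. If a is a possible value (occurring with frequency $f_0$, say), then also the total frequency below a equals the total frequency above a; hence a is again the median.

E7) The moments about 27.5 (years) are $m_1'$ = … × 5, $m_2'$ = … × 5$^2$, $m_3'$ = … × 5$^3$, $m_4'$ = … × 5$^4$. Hence $\bar{x}$ = 25.06, $m_2$ = … × 5$^2$, $m_3$ = … × 5$^3$, $m_4$ = … × 5$^4$, $\sqrt{b_1}$ = 0.743, $b_2$ = …
The distribution is moderately skew and platykurtic. This is also indicated by the histogram.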
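The moment-based coefficients used in Examples 5 and 6 and in E7 can be computed from a frequency table with a few lines of Python. This is a minimal sketch with hypothetical tables (not the petiole length or census data); the function names `central_moment`, `root_b1` and `b2` are illustrative. It also illustrates the result $b_2 \geq 1$, with equality when the variable takes just two values with equal frequencies.

```python
def central_moment(values, freqs, r):
    """r-th central moment of a frequency distribution (divisor n)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n


def root_b1(values, freqs):
    """Moment measure of skewness, m3 / s**3 (requires s > 0)."""
    m2 = central_moment(values, freqs, 2)
    if m2 == 0:
        raise ValueError("standard deviation is zero")
    return central_moment(values, freqs, 3) / m2 ** 1.5


def b2(values, freqs):
    """Coefficient of kurtosis, m4 / s**4 (requires s > 0)."""
    m2 = central_moment(values, freqs, 2)
    if m2 == 0:
        raise ValueError("standard deviation is zero")
    return central_moment(values, freqs, 4) / m2 ** 2


# Hypothetical symmetric table: m2 = 1 and m4 = 2.5, so b2 = 2.5 (platykurtic).
values = [1, 2, 3, 4, 5]
freqs = [1, 4, 6, 4, 1]
print(root_b1(values, freqs), b2(values, freqs))

# Two values with equal frequencies: the lower bound b2 = 1 is attained.
print(b2([2, 8], [5, 5]))
```

A value of `b2` below 3 is read as platykurtic and a value above 3 as leptokurtic, as in the unit.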
