SKEWNESS AND KURTOSIS
- Marilynn Katrina Walton
- 6 years ago
Transcription
UNIT 3 SKEWNESS AND KURTOSIS

Structure
3.1 Introduction
Objectives
3.2 Moments and Quantiles
Moments of a Frequency Distribution
Quantiles of a Frequency Distribution
3.3 Skewness
3.4 Kurtosis
3.5 Summary
3.6 Solutions and Answers

3.1 INTRODUCTION

In Unit 2, we talked about the central tendency and dispersion of frequency distributions. We have also seen how to compute some measures of central tendency and dispersion. Now, in this unit, we shall discuss two additional features of frequency distributions: skewness and kurtosis. A measure of skewness tells us how far the frequency curve of the given frequency distribution deviates from a symmetric one. A measure of kurtosis, on the other hand, gives us some information about the degree of flatness (or peakedness) of the frequency curve. So, these two features, along with the two discussed in the previous unit, should give us a good idea about the given frequency distribution.

The measures of skewness and kurtosis that we are going to discuss here make use of moments and quantiles. So, we shall first introduce these in Section 3.2.

While studying this unit, you will need to look back at the tables of data given in Unit 1. You will also need your calculator. Check all the calculations in the solved examples, so that you don't have any difficulty in solving the exercises later.

With this unit, we end our discussion of the descriptive measures of univariate data. In the next unit, Unit 4, we'll talk about bivariate data, i.e., data concerning two variables.

Objectives

After reading this unit, you should be able to:
- calculate the moments and the quantiles of a given frequency distribution,
- compute some measures of skewness and kurtosis,
- discuss the relative advantages and disadvantages of these measures.

3.2 MOMENTS AND QUANTILES

As we have mentioned in the Introduction, using moments and quantiles, we can define some measures of skewness and kurtosis.
So we can say that moments and quantiles give us some information about the nature of a given frequency distribution. You will see that the mean of a frequency distribution can also be considered as its moment, whereas the median can be considered as its quantile. Now let's study these one by one.

Moments of a Frequency Distribution

If you have studied a little physics, you would have come across the word "moment". Moment, in physics, measures the tendency of a force to produce rotation. If there
are n forces $f_1, f_2, \ldots, f_n$ acting at distances $x_1, x_2, \ldots, x_n$ from the origin, then the moment of the total force is

$$\frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}. \qquad (1)$$

Now isn't this a familiar expression? If in (1), we take $f_i$ to be the frequency of $x_i$, $i = 1, 2, \ldots, n$, then (1) also gives us the mean of the distribution of x-values. It is because of this similarity that the term "moment" has found its way into Statistics.

Now let's try to understand this term in the context of Statistics. You know that the mean and the variance of data on a variable x are given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - 0) \quad\text{and}\quad s^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2,$$

respectively, where $n = \sum_{i=1}^{k} f_i$. Now, taking a cue from this, if A is any number, we define the rth moment of x about A to be the mean of the rth power of the deviations of x from A. We denote this by $m'_r(A)$, or simply $m'_r$, if there is no confusion about the origin chosen. Thus,

$$m'_r(A) = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r.$$

For raw data, we can write

$$m'_r(A) = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r.$$

If we take $A = \bar{x}$, then we get what are called the central moments. The rth central moment is denoted by $m_r$. Thus

$$m_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^r.$$

For raw data, $m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$.

We are sure you would agree with the following:

$$m'_0 = m_0 = 1 \qquad (4)$$
$$m'_1(0) = \bar{x} \qquad (5)$$
$$m_1 = 0 \qquad (6)$$
$$m_2 = s^2 \qquad (7)$$

(5) and (7) say that the mean of a variable is its first moment about zero, while the variance is its second central moment. Recall that we have already established (6) in an earlier section.

Now let us try to establish a connection between central moments and moments about any value A. For simplicity we consider raw data only. For every i, we have
$$x_i - \bar{x} = (x_i - A) - (\bar{x} - A) = (x_i - A) - m'_1.$$

Therefore, using the binomial theorem, we get

$$(x_i - \bar{x})^r = (x_i - A)^r - \binom{r}{1}(x_i - A)^{r-1} m'_1 + \binom{r}{2}(x_i - A)^{r-2}(m'_1)^2 - \cdots + (-1)^r (m'_1)^r.$$

If we sum both sides over all i, i = 1, 2, ..., n, and divide the result by n, we get

$$m_r = m'_r - \binom{r}{1} m'_{r-1}\, m'_1 + \binom{r}{2} m'_{r-2}\,(m'_1)^2 - \cdots + (-1)^r (m'_1)^r. \qquad (8)$$

Now let us take r = 1. In this case, (8) becomes

$$m_1 = m'_1 - \binom{1}{1} m'_1 = 0.$$

We have already stated this in (6). Now, if we put r = 2 in (8), we get

$$m_2 = m'_2 - \binom{2}{1}(m'_1)^2 + \binom{2}{2}(m'_1)^2.$$

This gives us

$$m_2 = m'_2 - (m'_1)^2. \qquad (9)$$

Putting r = 3 in (8) gives us

$$m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3. \qquad (10)$$

Check that if you put r = 4 in (8), you get

$$m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4. \qquad (11)$$

Further, we have

$$\bar{x} = A + m'_1. \qquad (12)$$

We can use these relationships between the central moments and the moments about any A to simplify the calculations involved in the computation of central moments. Here is an example to show how this is done.

Example 1: Let's evaluate the mean and central moments of the milk yield data (in litres) of a dairy farm first used in Example 1 in Unit 2.

Let us take A = 200 litres and first obtain the moments about A. The values of $u = x - A$ and of the squares, cubes and fourth powers of u are shown in the table below. The last column is taken to serve as a check on the calculations, as you will see later. We choose A to be a value near the centre of the range of values of x.

[Table: columns $x$, $u$, $u^2$, $u^3$, $u^4$ and $(u + 1)^4$, with a final row of totals; the entries did not survive transcription.]

The last row of the table shows that n = 10, together with the column totals $\sum u_i$, $\sum u_i^2$, $\sum u_i^3$ and $\sum u_i^4$.
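Now to check these calculations, we make use of the last column, which gives $\sum (u_i + 1)^4$ directly. But $\sum (u_i + 1)^4$ also equals

$$\sum u_i^4 + 4\sum u_i^3 + 6\sum u_i^2 + 4\sum u_i + n.$$

Since we get the same value for $\sum (u_i + 1)^4$ by these two different methods, we can be sure that the computations are correct. Now

$$m'_1 = \frac{10.7}{10} = 1.07,$$

and $m'_2$, $m'_3$ and $m'_4$ are obtained in the same way from the other column totals. Hence

$$\bar{x} = 200 + m'_1 = 201.07 \text{ litres (from (12))},$$
$$m_2 = m'_2 - (m'_1)^2 = \ldots \text{ (litre)}^2 \text{ (from (9))},$$
$$m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3 = \ldots \text{ (litre)}^3 \text{ (from (10))}$$

and

$$m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4 = \ldots \text{ (litre)}^4 \text{ (from (11))}.$$

Try to do this exercise now. The result in this exercise also helps to simplify the computation of central moments.

E1) Suppose the data are subjected to a change of both origin and scale, i.e., let $u = (x - A)/c$. If $v'_r = \frac{1}{n}\sum_{i=1}^{n} u_i^r$, show that

$$\bar{x} = A + c\,v'_1, \qquad m_2 = c^2\left[v'_2 - (v'_1)^2\right],$$
$$m_3 = c^3\left[v'_3 - 3 v'_2 v'_1 + 2 (v'_1)^3\right], \qquad m_4 = c^4\left[v'_4 - 4 v'_3 v'_1 + 6 v'_2 (v'_1)^2 - 3 (v'_1)^4\right].$$

Notice how we have used this result in simplifying the computation of the mean and central moments in our next example.

Example 2: For the frequency table of petiole length of leaves of a pipal tree, let us take $u = (x - 4.95)/0.8$. The table below shows the steps to be followed in obtaining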
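$v'_1$, $v'_2$, $v'_3$ and $v'_4$. Here, again, we take the last column to provide a check on the computations.

[Table: class mid-values $x$, frequencies $f$, $u$, $fu$, $fu^2$, $fu^3$, $fu^4$ and $f(u + 1)^4$, with a final row of totals; the entries did not survive transcription.]

The total $\sum f_i (u_i + 1)^4$ obtained from the last column agrees with the value computed from the other column totals. Hence, we are sure that the column totals are free of errors. We now have

$$v'_1 = \frac{82}{198} = 0.4141, \quad v'_2 = \ldots, \quad v'_3 = \ldots, \quad v'_4 = \ldots$$

The mean and central moments of petiole length are

$$\bar{x} = 4.95 + 0.8 \times 0.4141 = 5.281 \text{ cm},$$
$$m_2 = (0.8)^2\left[v'_2 - (v'_1)^2\right] = \ldots \text{ (cm)}^2,$$
$$m_3 = (0.8)^3\left[v'_3 - 3 v'_2 v'_1 + 2 (v'_1)^3\right] = \ldots \text{ (cm)}^3,$$
$$m_4 = (0.8)^4\left[v'_4 - 4 v'_3 v'_1 + 6 v'_2 (v'_1)^2 - 3 (v'_1)^4\right] = \ldots \text{ (cm)}^4.$$

We may also encounter an exactly opposite situation, that is, given the mean and central moments of a variable x, we may like to express the moments about some other origin, say A, in terms of these quantities. So, we would like to have some formulas which express each $m'_r$ in terms of the central moments. We believe you are now in a position to derive the required formulas.

E2) Prove that

$$m'_r = m_r + \binom{r}{1} m_{r-1}\, d + \cdots + \binom{r}{r-2} m_2\, d^{r-2} + d^r,$$

where $d = \bar{x} - A$. (The term in $d^{r-1}$ is absent because $m_1 = 0$.)

Hint: Use the fact that $x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$.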
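Relations (8) to (12) and the result of E2 lend themselves to a quick numerical check. The following is a minimal Python sketch and not part of the unit itself: the function names `moment_about`, `central_from_raw` and `raw_from_central`, and the six data values, are hypothetical and chosen only for illustration.

```python
from math import comb

def moment_about(xs, A, r):
    """r-th moment of the raw data xs about the origin A."""
    return sum((x - A) ** r for x in xs) / len(xs)

def central_from_raw(raw):
    """Relation (8): central moments m_0..m_R from raw = [m'_0, ..., m'_R]."""
    m1 = raw[1]
    return [sum((-1) ** j * comb(r, j) * raw[r - j] * m1 ** j
                for j in range(r + 1))
            for r in range(len(raw))]

def raw_from_central(central, d):
    """E2: moments about A from the central moments, with d = mean - A."""
    return [sum(comb(r, j) * central[r - j] * d ** j for j in range(r + 1))
            for r in range(len(central))]

xs = [2, 4, 4, 5, 7, 8]   # hypothetical raw data, not the milk yield data
A = 4                     # arbitrary origin near the centre of the data

raw = [moment_about(xs, A, r) for r in range(5)]         # m'_0 .. m'_4
xbar = A + raw[1]                                        # relation (12)
central = central_from_raw(raw)                          # via (8)
direct = [moment_about(xs, xbar, r) for r in range(5)]   # straight from the definition
back = raw_from_central(central, xbar - A)               # E2, should reproduce raw

print("mean:", xbar)
print("central moments via (8):", central)
print("central moments directly:", direct)
```

The two printed lists of central moments should agree, and `back` should reproduce `raw`, which is the content of E2.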
E3) In Unit 2 (see, for instance, Section 2.4.4), you have seen formulas for the composite mean and composite variance in terms of the group means and group variances, when several groups of data on a variable are taken together. Obtain similar formulas for the composite third and fourth central moments, using E2.

Let us now turn our attention to quantiles.

Quantiles of a Frequency Distribution

Remember how we define the median of a given set of data? It is a value $\tilde{x}$ such that at most half of the observations are below $\tilde{x}$, and at most half are above $\tilde{x}$. Now, instead of half, if we take a proportion p, 0 < p < 1, then we get a p-quantile. Thus, by the p-quantile (or p-fractile, or quantile of order p) of a variable x, we mean a value, say $z_p$, of the variable such that at most a proportion p of the observations is below $z_p$ and at most a proportion (1 - p) is above $z_p$. So, the median is the quantile of order 1/2. With p = 1/4, 1/2 and 3/4 we have the three quartiles $z_{1/4}$, $z_{1/2}$ and $z_{3/4}$ (which are also denoted by $q_1$, $q_2$ and $q_3$). Taking p = 0.1, 0.2, ..., 0.9, we get the nine deciles and with p = 0.01, 0.02, ..., 0.99, we get the ninety-nine percentiles. Like the median, the p-quantile for a set of observations may not be unique.

Now let's see how to compute this p-quantile. When a frequency table for a continuous variable is given, we first decide which of the class-intervals contains $z_p$. If $x_l$ and $x_u$ are its lower and upper boundaries and $F_l$ and $F_u$ the corresponding cumulative frequencies, then the p-quantile $z_p$ may be determined approximately by using the formula

$$\frac{z_p - x_l}{x_u - x_l} = \frac{np - F_l}{F_u - F_l}, \quad\text{i.e.,}\quad z_p = x_l + \frac{np - F_l}{f_0} \times c, \qquad (13)$$

where c is the width of the interval and $f_0$ is its frequency. In (13), if we put p = 1/2, we get the formula for the median. Now let's use Formula (13) to compute the quartiles in an example.

Example 3: For the frequency distribution of petiole length of 198 leaves of a pipal tree, the median (i.e., the second quartile) was evaluated in Example 5 of Unit 2. We'll now find the first and third quartiles.
Here n/4 = 49.5 and 3n/4 = 148.5. On going through the cumulative frequency table (Table 8, Unit 1), we find the class-interval in which $q_1$ lies and the one in which $q_3$ lies. So, after putting the appropriate values from Tables 7 and 8 of Unit 1 in (13), with c = 0.8 cm, we get

$$q_1 = \ldots \text{ cm}, \qquad q_3 = 6.58 \text{ cm}.$$

See if you can solve this exercise now.

E4) Find the three quartiles for the age-distribution of the Indian population according to the 1981 census (given in Unit 2). Check that $q_2 - q_1 < q_3 - q_2$.

If we are given the first few moments or a small set of quantiles of a frequency distribution, we can get a fairly good idea about the distribution. In fact, for most purposes, it will be enough to state the values of $\bar{x}$, $m_2$, $m_3$ and $m_4$, or those of the three quartiles (or of the nine deciles).
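Formula (13) amounts to linear interpolation inside the class that contains the quantile. A minimal Python sketch of that step is given below; the function name `grouped_quantile` and the small frequency table are hypothetical, and are not the petiole-length or census data used in this unit.

```python
def grouped_quantile(boundaries, freqs, p):
    """Approximate p-quantile of grouped data, following formula (13).

    boundaries: class boundaries, one more entry than there are classes.
    freqs: class frequencies, in the same order as the classes.
    """
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    n = sum(freqs)
    target = n * p            # the quantity np in (13)
    below = 0                 # cumulative frequency F_l below the current class
    for i, f in enumerate(freqs):
        if f > 0 and below + f >= target:
            lower, upper = boundaries[i], boundaries[i + 1]
            # x_l + (np - F_l) / f_0 * c, with c = upper - lower
            return lower + (target - below) / f * (upper - lower)
        below += f
    raise ValueError("frequencies must not all be zero")

# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40.
boundaries = [0, 10, 20, 30, 40]
freqs = [4, 6, 8, 2]

q1 = grouped_quantile(boundaries, freqs, 0.25)
q2 = grouped_quantile(boundaries, freqs, 0.50)
q3 = grouped_quantile(boundaries, freqs, 0.75)
print(q1, q2, q3)
```

When np falls exactly on a class boundary, the sketch returns that boundary; as noted above, the quantile need not be unique in such a case.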
We shall come back to this use of moments and quantiles later. Now we introduce a very useful concept, that of the weighted mean.

Weighted Mean

Consider this situation. Students are admitted to a B.Sc. course in Statistics on the basis of their performance in the Higher Secondary, or an equivalent examination. Then don't you think that their scores on the mathematics papers should be considered more important than those on the physics papers? Similarly, shouldn't the scores on language papers be considered least important? It is necessary in such a situation to take into account the relative importance (or weight) of the different observations while evaluating the mean.

Suppose $w_i \geq 0$ is the weight attached to the value $x_i$ (i.e., to the value of x for the ith individual). Then the appropriate mean would be

$$\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}.$$

This measure is called a weighted mean of x. The concept is particularly useful in economic studies, for instance in the construction of a price index number. This will become clear from the next example.

Example 4: The price increases from 1985 to 1989 for five food items are given in percentage terms, along with figures indicating the relative importance of these items in a typical citizen's diet. (The two rows of figures did not survive transcription.) The average price increase for these items should then be taken to be the weighted mean of the five percentage increases, with the importance figures serving as the weights.

Observe that the formula that we used to compute the mean of x from grouped data is nothing but the weighted mean of $x_1, \ldots, x_k$, the respective frequencies now serving as the weights.

Try this exercise now:

E5) A student gets 85, 76 and 82 marks in the three tests for the course MTE. She gets 79 marks in the final examination. What are her average marks if the weightages given to the tests and the final examination are 10, 10, 10 and 70, respectively?

In the next two sections, we'll see how moments and quantiles lead to measures of skewness and kurtosis of a frequency distribution.

3.3 SKEWNESS

In Unit 1, you saw that frequency distributions may be classified as symmetrical and skewed (or asymmetrical).
Skewed distributions can again be classified as positively skewed or negatively skewed, according as the longer tail of the distribution is towards the higher or the lower values of the variable (see Fig. 1).
Fig. 1: (i) Symmetrical, (ii) positively skewed and (iii) negatively skewed distributions.

Now, the degree of skewness is the extent to which the given distribution departs from symmetry. A good measure of the degree of skewness has to fulfil the following criteria:

i) It should be a pure number, i.e., should be free of the units in which the variable is measured.
ii) It should be zero, positive and negative for a symmetrical distribution, a positively skew distribution and a negatively skew distribution, respectively.
iii) It should vary between two definite limits, say, -k and +k, as the nature of a distribution changes from extreme negative asymmetry to extreme positive asymmetry.

Here are some commonly used measures (assuming s > 0):

$$\mathrm{Sk}_1 = \frac{\bar{x} - \hat{x}}{s}, \qquad \mathrm{Sk}_2 = \frac{3(\bar{x} - \tilde{x})}{s}, \qquad \mathrm{Sk}_3 = \frac{(q_3 - q_2) - (q_2 - q_1)}{q_3 - q_1} \qquad\text{and}\qquad \mathrm{Sk}_4 = \frac{m_3}{s^3}.$$

As usual, $\bar{x}$ denotes the mean, $\tilde{x}$ the median, $\hat{x}$ the mode, $q_i$ the ith quartile, $m_3$ the third moment about $\bar{x}$ and s the standard deviation.

$\mathrm{Sk}_1$ may not be defined, since the mode may not be defined. To get over this difficulty, we use the empirical relation $\bar{x} - \hat{x} \approx 3(\bar{x} - \tilde{x})$ to get the measure $\mathrm{Sk}_2$. $\mathrm{Sk}_2$ and $\mathrm{Sk}_3$, too, may not be unique since the median and the first and third quartiles may not be unique. The square of $\mathrm{Sk}_4$, $\mathrm{Sk}_4^2 = m_3^2 / m_2^3$, is called Pearson's coefficient and is denoted by $b_1$.

All the four measures above are free of units. Secondly, for a symmetrical (unimodal) distribution,

mean = median = mode and $q_3 - q_2 = q_2 - q_1$.

Further, for most of the distributions which we see in practice, we have the following two observations:

1) For a positively skew distribution, mean > median > mode, and $q_3 - q_2 > q_2 - q_1$.
2) For a negatively skew distribution, mean < median < mode, and $q_3 - q_2 < q_2 - q_1$.

See Fig. 2(a) and (b).
[Fig. 2: panels (a) and (b); horizontal axis: variable value.]

For a symmetrical distribution, $m_3$ (or for that matter, any odd-order central moment) is zero. $m_3 > 0$ for a positively skew distribution, and $m_3 < 0$ for a negatively skew distribution. Hence, all the four measures meet the second criterion.

As to the third criterion, we have the following results:

i) For any distribution with s > 0, $-3 \leq \mathrm{Sk}_2 \leq +3$.
ii) For any distribution with $q_3 > q_1$, $-1 \leq \mathrm{Sk}_3 \leq +1$.

Thus, $\mathrm{Sk}_2$ and $\mathrm{Sk}_3$ meet the third criterion. Since we have the empirical relation

mean - mode = 3(mean - median),

we can say that $\mathrm{Sk}_1$, too, roughly meets this criterion. However, $\mathrm{Sk}_4 = m_3 / s^3$ may take any value between $-\infty$ and $+\infty$ and hence, in this respect, is inferior to the other measures.

Let's now calculate these four measures for the data on petiole length.

Example 5: For the frequency distribution of petiole length, we have the mean $\bar{x} = 5.28$ cm, together with the median $\tilde{x}$ (= $q_2$) and the mode $\hat{x}$. Also, for this distribution, we have the first and third quartiles $q_1$ and $q_3$, the standard deviation s and the third central moment $m_3$. Substituting these in the four formulas gives the values of $\mathrm{Sk}_1$, $\mathrm{Sk}_2$, $\mathrm{Sk}_3$ and $\mathrm{Sk}_4 = m_3 / s^3$. (The individual numerical values did not survive transcription.)

All these values indicate that the distribution is only slightly asymmetric, and that it is a case of negative asymmetry. This is also apparent from the histogram of the distribution (see Fig. 4 in Unit 1).

Now here is an exercise for you.

E6) Show that, for a distribution symmetrical about a, the mean as well as the median is a and the odd-order central moments are all equal to zero.

Hint: You may take the values of x to be, say, $a \pm h_1, a \pm h_2, \ldots, a \pm h_k$ ($h_i \geq 0$ for each i), with frequency $f_i$ for $a - h_i$ and also for $a + h_i$.
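The measures $\mathrm{Sk}_2$, $\mathrm{Sk}_3$ and $\mathrm{Sk}_4$ can be computed for raw data along the following lines. This is only a sketch under stated assumptions: the function name `skewness_measures` and the data are hypothetical, and the quartiles are taken from Python's `statistics.quantiles` with `method="inclusive"`, an interpolation rule that need not coincide with the convention of this unit. $\mathrm{Sk}_1$ is left out because the mode may not be defined.

```python
from statistics import quantiles

def skewness_measures(xs):
    """Return (Sk2, Sk3, Sk4) for raw data xs, using divisor n throughout."""
    n = len(xs)
    xbar = sum(xs) / n
    m2 = sum((x - xbar) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - xbar) ** 3 for x in xs) / n   # third central moment
    s = m2 ** 0.5
    if s == 0:
        raise ValueError("the measures need s > 0")
    # Assumption: quartiles by linear interpolation between order statistics.
    q1, q2, q3 = quantiles(xs, n=4, method="inclusive")
    sk2 = 3 * (xbar - q2) / s
    sk3 = ((q3 - q2) - (q2 - q1)) / (q3 - q1)   # fails if q3 == q1
    sk4 = m3 / s ** 3
    return sk2, sk3, sk4

# Hypothetical data with a long tail towards the higher values.
xs = [1, 2, 3, 4, 5, 7, 10, 16]
sk2, sk3, sk4 = skewness_measures(xs)
print(sk2, sk3, sk4)
```

For these data all three measures come out positive, in line with the second criterion, and $\mathrm{Sk}_2$ and $\mathrm{Sk}_3$ stay within the limits stated above.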
Now that we have seen how to measure the skewness of a frequency distribution, let us talk about its kurtosis.

3.4 KURTOSIS

We now focus attention on another feature of a frequency distribution that determines the shape of the distribution. It is the degree of steepness or pointedness of the distribution or, to use a Greek word, the kurtosis of the distribution. Some distributions are flat-topped; some are highly peaked; most distributions will be in between these two extreme types, not too peaked and not too flat-topped either. In Fig. 3, we have the frequency curves of a distribution that is highly peaked, one that is of moderate kurtosis and a third one which is rather flat-topped.

Fig. 3: Three symmetrical distributions with the same mean and s.d. but of varying kurtosis.

It has been observed that for two distributions having the same dispersion and the same degree of skewness, the one with higher kurtosis usually has higher fourth powers of deviations from the mean and hence a higher value of $m_4$. This observation is used to define a measure of kurtosis (under the assumption that s > 0) as follows:

$$b_2 = \frac{m_4}{s^4} = \frac{m_4}{m_2^2}.$$

The division by $s^4$ makes the measure free of units. It also ensures that the measure takes into account that part of the peakedness of the distribution which is independent of (or is in addition to) the part that is due to the variance.

For a normal distribution (about which you will learn in Block 3), $b_2 = 3$. This value is taken as a standard against which the kurtosis of other distributions is judged. Any distribution with $b_2 = 3$ is called mesokurtic (i.e., of moderate kurtosis); one with $b_2 > 3$ is said to be leptokurtic, while one with $b_2 < 3$ is said to be platykurtic. (Meso means moderate, lepto means thin and platy means flat.) Thus, in Fig. 3, (i) is leptokurtic, (ii) is mesokurtic, while (iii) is platykurtic.

For any univariate distribution with s > 0, we have $b_2 \geq 1$. Let's prove this.

Proof: Let $u_i = \dfrac{x_i - \bar{x}}{s}$ for each i. Then

$$\sum_{i=1}^{n} u_i = \frac{1}{s}\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
and

$$\sum_{i=1}^{n} u_i^2 = \frac{1}{s^2}\sum_{i=1}^{n} (x_i - \bar{x})^2 = n.$$

Also, we must have

$$\sum_{i=1}^{n} (u_i^2 - 1)^2 \geq 0,$$

since the L.H.S. is a sum of squares. This means

$$\sum u_i^4 - 2\sum u_i^2 + n \geq 0, \quad\text{or}\quad \frac{n\,m_4}{s^4} - 2n + n \geq 0, \quad\text{or}\quad n(b_2 - 1) \geq 0.$$

Since n > 0, this implies that $b_2 - 1 \geq 0$, so that the result is established.

Note that we have $b_2 = 1$ iff $n(b_2 - 1) = 0$. So we can also say that $b_2 = 1$ iff $\sum (u_i^2 - 1)^2 = 0$, i.e., iff $u_i^2 = 1$ for each i, i.e., iff $x_i = \bar{x} \pm s$ for each i. Thus, the coefficient of kurtosis $b_2 = 1$ iff the variable x can assume just two values with equal frequencies (so that the mean may be exactly midway between the two values).

Now let's calculate the coefficient of kurtosis for our favourite distribution.

Example 6: For the frequency distribution of petiole length, we have the values of $m_2$ and $m_4$ from Example 2. Hence, the kurtosis coefficient $b_2$ for this distribution is given by $b_2 = m_4 / m_2^2$. (The numerical values did not survive transcription.) We find that this distribution is slightly leptokurtic.

Try to do this exercise now.

E7) Comment on the skewness and kurtosis of the age-distribution of the Indian population (1981 census) given in E3) in Unit 2, by evaluating appropriate coefficients and also by considering the histogram of the distribution.

Let us now summarise the points covered in this unit.

3.5 SUMMARY

In this unit, you have seen

1) what is meant by moments and quantiles of different orders; the rth moment about A is

$$m'_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r$$

for data in the form of a frequency distribution, and

$$m'_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r$$

for raw data.
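2) how central moments, i.e., moments about $\bar{x}$, are related to moments about any A:

$$m_r = m'_r - \binom{r}{1} m'_{r-1}\, m'_1 + \binom{r}{2} m'_{r-2}\,(m'_1)^2 - \cdots + (-1)^r (m'_1)^r;$$

3) when weighting of the observations would be appropriate for the computation of the mean;

4) what is meant by skewness and kurtosis: skewness is the departure from symmetry and kurtosis gives the degree of flatness of a frequency distribution;

5) how to compute measures of skewness and kurtosis of a frequency distribution:

Measures of skewness:

$$\mathrm{Sk}_1 = \frac{\bar{x} - \hat{x}}{s}, \qquad \mathrm{Sk}_2 = \frac{3(\bar{x} - \tilde{x})}{s}, \qquad \mathrm{Sk}_3 = \frac{q_3 - 2q_2 + q_1}{q_3 - q_1}, \qquad \mathrm{Sk}_4 = \frac{m_3}{s^3},$$

provided $s \neq 0$.

Measure of kurtosis:

$$b_2 = \frac{m_4}{s^4} \quad (s \neq 0).$$

3.6 SOLUTIONS AND ANSWERS

E1) $v'_1 = \dfrac{1}{n}\sum u_i = \dfrac{1}{n}\sum \dfrac{x_i - A}{c} = \dfrac{\bar{x} - A}{c}$, so that $\bar{x} = A + c\,v'_1$. Further, $x_i - \bar{x} = c\,(u_i - v'_1)$, and therefore

$$m_2 = \frac{c^2}{n}\sum (u_i - v'_1)^2 = c^2\left[v'_2 - (v'_1)^2\right].$$

Similarly, solve for $m_3$ and $m_4$.

E2) $x_i - A = (x_i - \bar{x}) + (\bar{x} - A) = (x_i - \bar{x}) + d$, say.

$$\therefore\ (x_i - A)^r = (x_i - \bar{x})^r + \binom{r}{1}(x_i - \bar{x})^{r-1} d + \binom{r}{2}(x_i - \bar{x})^{r-2} d^2 + \cdots + d^r.$$

On summing over i and dividing the result by n, we get

$$m'_r = m_r + \binom{r}{1} m_{r-1}\, d + \binom{r}{2} m_{r-2}\, d^2 + \cdots + d^r,$$

where $d = \bar{x} - A$.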
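E3) Suppose there are k sets of data, the jth having $n_j$ observations $x_{ij}$, mean $\bar{x}_j$ and central moments $m_{2j}$, $m_{3j}$, $m_{4j}$, j = 1, 2, ..., k. Let $n = \sum_j n_j$, let $\bar{x}$ be the composite mean and let $d_j = \bar{x}_j - \bar{x}$. Then

$$\sum_i (x_{ij} - \bar{x})^3 = \sum_i (x_{ij} - \bar{x}_j)^3 + 3 d_j \sum_i (x_{ij} - \bar{x}_j)^2 + 3 d_j^2 \sum_i (x_{ij} - \bar{x}_j) + n_j d_j^3,$$

and since $\sum_i (x_{ij} - \bar{x}_j) = 0$, summing over j gives

$$m_3 = \frac{1}{n}\sum_{j=1}^{k} n_j\left(m_{3j} + 3 m_{2j} d_j + d_j^3\right).$$

Similarly,

$$m_4 = \frac{1}{n}\sum_{j=1}^{k} n_j\left(m_{4j} + 4 m_{3j} d_j + 6 m_{2j} d_j^2 + d_j^4\right).$$

E4) $q_1$ = 9.4 years; the values of $q_2$ and $q_3$ did not survive transcription.

E5) Weighted mean $= \dfrac{10 \times 85 + 10 \times 76 + 10 \times 82 + 70 \times 79}{100} = 79.6$.

E6) The first moment about a is

$$m'_1 = \frac{1}{n}\left[\sum_{i=1}^{k} \left[(a + h_i) - a\right] f_i + \sum_{i=1}^{k} \left[(a - h_i) - a\right] f_i\right] = \frac{1}{n}\left[\sum_{i=1}^{k} (h_i - h_i) f_i\right] = 0.$$

Hence the mean is $a + m'_1 = a$. In the same way, for any odd r, the terms $h_i^r f_i$ and $(-h_i)^r f_i$ cancel in pairs, so that every odd-order central moment is zero.

If a is not a possible value of x, then the total frequency of values less than a equals the total frequency of values exceeding a. Hence a is the median. If a is a possible value (occurring with frequency $f_0$, say), then also the total frequency below a equals the total frequency above a; hence a is again the median.

E7) The moments $m'_1$, $m'_2$, $m'_3$ and $m'_4$ are first computed about 27.5 (years), in units of 5, $5^2$, $5^3$ and $5^4$ respectively. Hence $\bar{x} = 25.06$, and $m_2$, $m_3$ and $m_4$ follow from (9), (10) and (11). These give $\mathrm{Sk}_4 = 0.743$, together with the value of $b_2$. (The remaining numerical values did not survive transcription.) The distribution is moderately skew and platykurtic. This is also indicated by the histogram.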
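The kurtosis coefficient $b_2$ asked for in E7, and the bound $b_2 \geq 1$ proved in Section 3.4, can also be checked numerically. The sketch below is an illustration only: the function name `kurtosis_b2` and both small data sets are hypothetical and unrelated to the census data.

```python
def kurtosis_b2(xs):
    """Kurtosis coefficient b2 = m4 / m2**2 for raw data xs (divisor n)."""
    n = len(xs)
    xbar = sum(xs) / n
    m2 = sum((x - xbar) ** 2 for x in xs) / n   # second central moment
    m4 = sum((x - xbar) ** 4 for x in xs) / n   # fourth central moment
    if m2 == 0:
        raise ValueError("b2 is undefined when s = 0")
    return m4 / m2 ** 2

# Two values with equal frequencies: the boundary case b2 = 1.
two_point = [3, 3, 7, 7]
# Five equally spaced values: a flat-topped (platykurtic) case.
flat = [1, 2, 3, 4, 5]

b2_two_point = kurtosis_b2(two_point)
b2_flat = kurtosis_b2(flat)
print(b2_two_point, b2_flat)
```

The first data set attains the lower limit of $b_2$, as the proof predicts for a variable taking just two values with equal frequencies; the second gives a value between 1 and 3.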
Moments and Measures of Skewness and Kurtosis
Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values
More informationEngineering Mathematics III. Moments
Moments Mean and median Mean value (centre of gravity) f(x) x f (x) x dx Median value (50th percentile) F(x med ) 1 2 P(x x med ) P(x x med ) 1 0 F(x) x med 1/2 x x Variance and standard deviation
More informationMeasures of Central tendency
Elementary Statistics Measures of Central tendency By Prof. Mirza Manzoor Ahmad In statistics, a central tendency (or, more commonly, a measure of central tendency) is a central or typical value for a
More informationSimple Descriptive Statistics
Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency
More informationMeasures of Dispersion (Range, standard deviation, standard error) Introduction
Measures of Dispersion (Range, standard deviation, standard error) Introduction We have already learnt that frequency distribution table gives a rough idea of the distribution of the variables in a sample
More informationLectures delivered by Prof.K.K.Achary, YRC
Lectures delivered by Prof.K.K.Achary, YRC Given a data set, we say that it is symmetric about a central value if the observations are distributed symmetrically about the central value. In symmetrically
More informationTerms & Characteristics
NORMAL CURVE Knowledge that a variable is distributed normally can be helpful in drawing inferences as to how frequently certain observations are likely to occur. NORMAL CURVE A Normal distribution: Distribution
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc COUNSELLING PSYCHOLOGY (2011 Admission Onwards) II Semester Complementary Course PSYCHOLOGICAL STATISTICS QUESTION BANK 1. The process of grouping
More information1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:
1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11
More informationSome Characteristics of Data
Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key
More informationMEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda,
MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE Dr. Bijaya Bhusan Nanda, CONTENTS What is measures of dispersion? Why measures of dispersion? How measures of dispersions are calculated? Range Quartile
More information3.1 Measures of Central Tendency
3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent
More informationDescriptive Statistics
Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs
More informationModule Tag PSY_P2_M 7. PAPER No.2: QUANTITATIVE METHODS MODULE No.7: NORMAL DISTRIBUTION
Subject Paper No and Title Module No and Title Paper No.2: QUANTITATIVE METHODS Module No.7: NORMAL DISTRIBUTION Module Tag PSY_P2_M 7 TABLE OF CONTENTS 1. Learning Outcomes 2. Introduction 3. Properties
More informationDESCRIPTIVE STATISTICS
DESCRIPTIVE STATISTICS INTRODUCTION Numbers and quantification offer us a very special language which enables us to express ourselves in exact terms. This language is called Mathematics. We will now learn
More informationDot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.
Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,
More informationBasic Procedure for Histograms
Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that
More informationDavid Tenenbaum GEOG 090 UNC-CH Spring 2005
Simple Descriptive Statistics Review and Examples You will likely make use of all three measures of central tendency (mode, median, and mean), as well as some key measures of dispersion (standard deviation,
More information2 DESCRIPTIVE STATISTICS
Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled
More informationECON 214 Elements of Statistics for Economists
ECON 214 Elements of Statistics for Economists Session 3 Presentation of Data: Numerical Summary Measures Part 2 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh
More informationECON 214 Elements of Statistics for Economists 2016/2017
ECON 214 Elements of Statistics for Economists 2016/2017 Topic The Normal Distribution Lecturer: Dr. Bernardin Senadza, Dept. of Economics bsenadza@ug.edu.gh College of Education School of Continuing and
More informationThe Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012
The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re
More information9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives
Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical
More informationNCSS Statistical Software. Reference Intervals
Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and
More informationEstablishing a framework for statistical analysis via the Generalized Linear Model
PSY349: Lecture 1: INTRO & CORRELATION Establishing a framework for statistical analysis via the Generalized Linear Model GLM provides a unified framework that incorporates a number of statistical methods
More informationNumerical summary of data
Numerical summary of data Introduction to Statistics Measures of location: mode, median, mean, Measures of spread: range, interquartile range, standard deviation, Measures of form: skewness, kurtosis,
More informationStatistics 114 September 29, 2012
Statistics 114 September 29, 2012 Third Long Examination TGCapistrano I. TRUE OR FALSE. Write True if the statement is always true; otherwise, write False. 1. The fifth decile is equal to the 50 th percentile.
More informationFundamentals of Statistics
CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct
More informationDESCRIPTIVE STATISTICS II. Sorana D. Bolboacă
DESCRIPTIVE STATISTICS II Sorana D. Bolboacă OUTLINE Measures of centrality Measures of spread Measures of symmetry Measures of localization Mainly applied on quantitative variables 2 DESCRIPTIVE STATISTICS
More informationFrequency Distribution and Summary Statistics
Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary
More informationUNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION
1. George cantor is the School of Distance Education UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION General (Common) Course of BCom/BBA/BMMC (2014 Admn. onwards) III SEMESTER- CUCBCSS QUESTION BANK
More information2 Exploring Univariate Data
2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting
More informationDescriptive Statistics
Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations
More information3.5 Applying the Normal Distribution (Z-Scores)
3.5 Applying the Normal Distribution (Z-Scores) The Graph: Review of the Normal Distribution Properties: - it is symmetrical; the mean, median and mode are equal and fall at the line of symmetry - it is
More informationChapter 5: Summarizing Data: Measures of Variation