ECON 214 Elements of Statistics for Economists

ECON 214 Elements of Statistics for Economists Session 3 Presentation of Data: Numerical Summary Measures Part 2 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017

Session Overview A measure of average, such as the mean, only locates the centre of the data. But it is also important to know how the data are spread out. This is what measures of dispersion tell us: the spread in the data. This session discusses and illustrates the computation of the various measures of dispersion. Slide 2

Session Overview At the end of the session, the student will be able to: compute and interpret the range, variance and standard deviation from ungrouped data; compute and interpret the range, variance and standard deviation from grouped data; compute and interpret the coefficient of variation, percentiles, quartiles and deciles; compute and interpret interquartile and percentile ranges; describe a data set in terms of its skewness. Slide 3

Session Outline The key topics to be covered in the session are as follows: measures of dispersion for ungrouped data; measures of dispersion for grouped data; other measures of dispersion. Slide 4

Reading List Michael Barrow, Statistics for Economics, Accounting and Business Studies, 4th Edition, Pearson. R.D. Mason, D.A. Lind, and W.G. Marchal, Statistical Techniques in Business and Economics, 10th Edition, McGraw-Hill. Slide 5

Topic One MEASURES OF DISPERSION FOR UNGROUPED DATA Slide 6

Measures of Dispersion A measure of average, such as the mean, only locates the centre of the data. But it is also important to know how the data are spread out. This is what measures of dispersion tell us: the spread in the data. A small value for a measure of dispersion indicates that the data are clustered closely around the mean, whereas a large value indicates that the data are widely spread around the mean. We shall consider several measures of dispersion. Slide 7

The Range The range is the simplest measure of dispersion. It is calculated as the difference between the highest and the lowest values in the data: Range = highest value - lowest value. Consider the ECON 214 interim assessment results for the second semester of 2014/2015. The highest mark was 29 (out of 30) and the lowest mark was 2, so Range = 29 - 2 = 27. Slide 8

Population Variance The population variance, denoted $\sigma^2$ and pronounced sigma squared, for ungrouped data is the arithmetic mean of the squared deviations from the population mean $\mu$, and is given by the formula $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where $N$ is the number of observations in the population. Slide 9

Population Variance The ages of the Aproni family are 2, 18, 34, and 42 years. What is the population variance? First calculate the mean as $\mu = \sum X / N = (2 + 18 + 34 + 42)/4 = 96/4 = 24$. Then obtain the variance as $\sigma^2 = \sum (X - \mu)^2 / N = [(2 - 24)^2 + (18 - 24)^2 + (34 - 24)^2 + (42 - 24)^2]/4 = 944/4 = 236$. Slide 10

Population Variance An alternative formula for the population variance is $\sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$, or equivalently $\sigma^2 = \frac{\sum X^2}{N} - \mu^2$. Slide 11

Population Standard Deviation The population standard deviation ($\sigma$, called sigma) is the square root of the population variance. From the previous example, the population standard deviation is $\sigma = \sqrt{236} \approx 15.36$. Slide 12
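
The Aproni family calculation can be checked with a short Python sketch. The variable names are illustrative only; the data and formulas are the ones given on the slides.

```python
import math

# Ages of the Aproni family (the whole population in this example).
ages = [2, 18, 34, 42]
N = len(ages)

# Population mean: mu = sum(X) / N
mu = sum(ages) / N

# Definitional formula: mean of the squared deviations from mu.
var_definitional = sum((x - mu) ** 2 for x in ages) / N

# Alternative formula: sum(X^2) / N - mu^2
var_alternative = sum(x ** 2 for x in ages) / N - mu ** 2

# Population standard deviation: square root of the variance.
sigma = math.sqrt(var_definitional)

print(mu, var_definitional, var_alternative, round(sigma, 2))
```

Both formulas give a variance of 236, and the standard deviation rounds to 15.36, as on the slides.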

Sample Variance and Standard Deviation The sample variance estimates the population variance. Conceptual formula: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$. Computational formula: $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$. Note: unlike in the population variance formula, $n - 1$ is used in the denominator here so as to obtain an unbiased estimate. The reason is that the squared deviations of the observations from the sample mean tend to be smaller than their squared deviations from the population mean. Using $n - 1$ rather than $n$ compensates for this, and the result is unbiased. Slide 13

Sample Variance and Standard Deviation A sample of five hourly wages for various jobs on campus is: 7, 5, 11, 8, 6. Find the variance. The sample mean is $\bar{X} = \sum X / n = (7 + 5 + 11 + 8 + 6)/5 = 37/5 = 7.4$. The sample variance is $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + (5 - 7.4)^2 + (11 - 7.4)^2 + (8 - 7.4)^2 + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{4} = 5.3$. The sample standard deviation ($s$) is the square root of the sample variance, so $s = \sqrt{5.3} \approx 2.3$. Slide 14
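
As a check on the hourly-wage example, the sketch below computes the sample variance with both the conceptual and the computational formula, and compares them with Python's standard-library `statistics.variance`, which also divides by n - 1. The variable names are illustrative.

```python
import math
import statistics

# Sample of five hourly wages from the example.
wages = [7, 5, 11, 8, 6]
n = len(wages)
x_bar = sum(wages) / n  # sample mean

# Conceptual formula: sum of squared deviations divided by n - 1.
sum_sq_dev = sum((x - x_bar) ** 2 for x in wages)
s2_conceptual = sum_sq_dev / (n - 1)

# Computational formula: (sum(X^2) - (sum(X))^2 / n) / (n - 1).
s2_computational = (sum(x ** 2 for x in wages) - sum(wages) ** 2 / n) / (n - 1)

# Library check: statistics.variance uses the n - 1 denominator.
s2_library = statistics.variance(wages)

# Sample standard deviation.
s = math.sqrt(s2_conceptual)

print(x_bar, round(sum_sq_dev, 1), round(s2_conceptual, 1), round(s, 1))
```

The sum of squared deviations is 21.2, so all three routes give a variance of 5.3 and a standard deviation of about 2.3.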

Topic Two MEASURES OF DISPERSION FOR GROUPED DATA Slide 15

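Sample Variance and Standard Deviation The formula for the sample variance for grouped data, used as an estimator of the population variance, is $s^2 = \frac{\sum f (X - \bar{X})^2}{n - 1}$, or equivalently $s^2 = \frac{\sum f X^2 - \frac{(\sum f X)^2}{n}}{n - 1}$, where $f$ is the class frequency, $X$ is the class midpoint and $n = \sum f$ is the number of observations. Slide 16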
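
The grouped-data formulas can be sketched in Python as follows. The frequency table here (midpoints and frequencies) is hypothetical, made up purely for illustration; it does not come from the slides.

```python
import math

# Hypothetical grouped data (an assumption for illustration):
# class midpoints X and class frequencies f.
midpoints = [5, 15, 25, 35]
frequencies = [2, 5, 8, 5]

n = sum(frequencies)  # total number of observations
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
x_bar = sum_fx / n    # grouped mean

# Formula 1: sum of f * (X - mean)^2, divided by n - 1.
s2_deviation = sum(
    f * (x - x_bar) ** 2 for f, x in zip(frequencies, midpoints)
) / (n - 1)

# Formula 2: (sum(f * X^2) - (sum(f * X))^2 / n) / (n - 1).
sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))
s2_shortcut = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

s = math.sqrt(s2_deviation)  # grouped sample standard deviation

print(x_bar, round(s2_deviation, 2), round(s2_shortcut, 2), round(s, 2))
```

The two formulas agree, which is a useful check when working a grouped-data problem by hand.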

Sample Variance and Standard Deviation The sample variance gives an unbiased estimate of the population variance. The sample standard deviation is obtained by taking the square root of the sample variance. Slide 17

Topic Three OTHER MEASURES OF DISPERSION Slide 18

Coefficient of Variation The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage: $CV = \frac{s}{\bar{X}} (100)$. It is used to compare the relative dispersion in distributions measured in different units, or measured in the same units but with values that are far apart (for example, the incomes of top executives and unskilled workers). The relative dispersion measure is unitless and so enables direct comparison of the relative variation in the two distributions. Slide 19

Coefficient of Variation A study of the test scores in management principles and the years of service of the employees enrolled in the course resulted in the following statistics: Mean test score = 200; standard deviation = 40. Mean years of service = 20; standard deviation = 5. CV for test scores = (40/200)*100 = 20 percent. CV for years of service = (5/20)*100 = 25 percent. Hence, although test scores have the higher standard deviation, we cannot conclude that their distribution has higher variation than that of years of service. The CV shows that the relative variation is in fact higher in years of service. Slide 20
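
A minimal Python sketch of this comparison is given below. The helper name `coefficient_of_variation` is hypothetical, chosen for this illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


# Figures from the management principles example.
cv_test_scores = coefficient_of_variation(std_dev=40, mean=200)
cv_years_of_service = coefficient_of_variation(std_dev=5, mean=20)

print(cv_test_scores, cv_years_of_service)

# The larger CV identifies the distribution with more relative variation.
more_variable = (
    "years of service"
    if cv_years_of_service > cv_test_scores
    else "test scores"
)
print(more_variable)
```

Because the CV is unitless, the 20 percent and 25 percent figures can be compared directly even though one variable is measured in marks and the other in years.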

Percentiles, Quartiles and Deciles The median divides the data arranged in ascending order into two equal halves; it is also the value such that 50% of observations are below and 50% above. Percentiles divide a set of observations into 100 equal parts. A percentile is the value such that P% of observations are below and (100-P)% are above this value. Slide 21

Percentiles, Quartiles and Deciles For example, the 10th percentile is the value such that 10% of observations are below this value and 90% are above. We saw earlier that we can locate the position of the median using the formula $(n + 1)/2$. We could write it more generally as $(n + 1)(P/100)$, and since in this case $P = 50$, we get $(n + 1)(50/100) = (n + 1)/2$. Slide 22

Percentiles, Quartiles and Deciles Thus, to determine a given percentile in a distribution, we first locate its position using the formula $L_P = (n + 1)\frac{P}{100}$. Consider the following data: 37, 59, 71, 75, 78, 78, 81, 86, 88, 92, 95, 96. Assuming we want the 25th percentile, then $L_{25} = (12 + 1)(25/100) = 3.25$. Hence the 25th percentile is a quarter of the distance between the 3rd and 4th observations, which gives us 71 + 0.25(75 - 71) = 71 + 0.25(4) = 72. Slide 23

Percentiles, Quartiles and Deciles Quartiles divide a set of observations into 4 equal parts. Hence there are 3 quartiles: the 1st quartile (which is the same as the 25th percentile); the 2nd quartile (which is the same as the 50th percentile or the median); and the 3rd quartile (which is the same as the 75th percentile). So we just calculated the 1st quartile (= 25th percentile). Slide 24

Percentiles, Quartiles and Deciles Similarly, deciles divide a set of observations into 10 equal parts, so there are 9 deciles. The 1st decile is the same as the 10th percentile and the 5th decile is the same as the 50th percentile or the median. In the same vein, each data set has 99 percentiles, which divide the data set into 100 equal parts. The percentile formula described on the previous slides is applied in calculating quartiles as well as deciles. Slide 25

Percentiles, Quartiles and Deciles Assuming we wish to calculate the 90th percentile for the same data, then $L_{90} = (12 + 1)(90/100) = 11.7$. So the 90th percentile is 70% of the distance between the 11th and 12th observations, which gives 95 + 0.7(96 - 95) = 95.7. Slide 26
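
The locate-then-interpolate procedure can be sketched in Python as below. The function name `percentile` is hypothetical, and the handling of positions that fall before the first or after the last observation (returning the nearest end value) is an assumption of this sketch, since the slides do not cover that case.

```python
def percentile(sorted_data, p):
    """Percentile via the position formula L = (n + 1) * p / 100."""
    n = len(sorted_data)
    position = (n + 1) * p / 100   # 1-based location of the percentile
    whole = int(position)          # observation just below the location
    fraction = position - whole    # share of the gap to the next one

    # Assumption: clamp to the end values if the location is out of range.
    if whole < 1:
        return sorted_data[0]
    if whole >= n:
        return sorted_data[-1]

    below = sorted_data[whole - 1]  # convert 1-based position to 0-based index
    above = sorted_data[whole]
    return below + fraction * (above - below)


# Data from the example, already in ascending order.
scores = [37, 59, 71, 75, 78, 78, 81, 86, 88, 92, 95, 96]

p25 = percentile(scores, 25)
p90 = percentile(scores, 90)
print(round(p25, 2), round(p90, 2))
```

For the twelve observations this reproduces the 25th percentile of 72 and the 90th percentile of 95.7 worked out by hand.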

Percentiles, Quartiles and Deciles The first quartile is the value corresponding to the point below which 25% of the observations lie in an ordered data set. For grouped data the formula below is applied: $Q_1 = L + \frac{\frac{n}{4} - CF}{f}(i)$, where L = lower limit of the class containing Q1, CF = cumulative frequency preceding the class containing Q1, f = frequency of the class containing Q1, and i = size of the class containing Q1. Slide 27

Percentiles, Quartiles and Deciles The third quartile is the value corresponding to the point below which 75% of the observations lie in an ordered data set. The formula is: $Q_3 = L + \frac{\frac{3n}{4} - CF}{f}(i)$, where L = lower limit of the class containing Q3, CF = cumulative frequency preceding the class containing Q3, f = frequency of the class containing Q3, and i = size of the class containing Q3. Slide 28

Percentiles, Quartiles and Deciles In general, the percentile for grouped data is calculated using the formula: $P_p = L + \frac{\frac{pn}{100} - CF}{f}(i)$, where p = percentile we want to calculate, n = number of observations, L = lower limit of the class containing the percentile, CF = cumulative frequency preceding the class containing the percentile, f = frequency of the class containing the percentile, and i = size of the class containing the percentile. Slide 29
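
The general grouped-data formula can be sketched in Python as follows. The function name `grouped_percentile` and the frequency table are hypothetical, introduced only for illustration, and the sketch assumes all classes have the same size.

```python
def grouped_percentile(lower_limits, frequencies, class_size, p):
    """Percentile for grouped data: L + ((p*n/100 - CF) / f) * i."""
    n = sum(frequencies)
    target = p * n / 100  # number of observations below the percentile
    cumulative = 0        # CF: cumulative frequency of preceding classes

    for lower, f in zip(lower_limits, frequencies):
        # The first class whose cumulative frequency reaches the target
        # is the class containing the percentile.
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) * class_size / f
        cumulative += f

    raise ValueError("p must lie between 0 and 100")


# Hypothetical frequency table (an assumption for illustration):
# classes 0-10, 10-20, 20-30, 30-40 with frequencies 2, 5, 8, 5.
lower_limits = [0, 10, 20, 30]
frequencies = [2, 5, 8, 5]

q1 = grouped_percentile(lower_limits, frequencies, class_size=10, p=25)
median = grouped_percentile(lower_limits, frequencies, class_size=10, p=50)
q3 = grouped_percentile(lower_limits, frequencies, class_size=10, p=75)
print(q1, median, q3)
```

Setting p to 25 or 75 reduces the general formula to the Q1 and Q3 formulas, since pn/100 becomes n/4 or 3n/4.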

Inter-quartile and Percentile Ranges The Inter-quartile range is the distance between the third quartile Q3 and the first quartile Q1. Inter-quartile range = third quartile - first quartile = Q3 - Q1 The percentile range is the distance between two stated percentiles. The 10-to-90 percentile range is the distance between the 10th and 90th percentiles. Slide 30
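
As an illustration, the sketch below computes both ranges for the twelve observations used in the percentile examples. It relies on Python's standard-library `statistics.quantiles`, whose default `"exclusive"` method places the cut points at position (n + 1)(P/100), the same rule used in this session; the variable names are illustrative.

```python
import statistics

# Data from the percentile examples, in ascending order.
scores = [37, 59, 71, 75, 78, 78, 81, 86, 88, 92, 95, 96]

# n=4 returns the three quartiles Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="exclusive")
interquartile_range = q3 - q1

# n=10 returns the nine deciles; the 1st and 9th deciles are the
# 10th and 90th percentiles.
deciles = statistics.quantiles(scores, n=10, method="exclusive")
p10, p90 = deciles[0], deciles[8]
percentile_range_10_90 = p90 - p10

print(q1, q3, interquartile_range)
print(round(p10, 1), round(p90, 1), round(percentile_range_10_90, 1))
```

Here Q1 = 72 and Q3 = 91, so the inter-quartile range is 19; the 10th and 90th percentiles are 43.6 and 95.7, giving a 10-to-90 percentile range of 52.1.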

Skewness Skewness is the measure of the lack of symmetry of a distribution. A symmetrical frequency curve is one for which the right half of the curve is the mirror image of the left half. If one or more observations of a distribution are extremely large, the mean becomes greater than the median or mode, and the distribution is said to be positively (right) skewed. Slide 31

Skewness Conversely, if one or more extremely small values are present in the data, the mean becomes smaller than the median or mode, and the distribution is said to be negatively (left) skewed. The coefficient of skewness is computed from the following formula: Sk = 3(Mean - Median) / (Standard deviation) Slide 32

Symmetric Distribution Zero skewness: mode = median = mean. Slide 33

Right Skewed Distribution Positively skewed: the mean and median are to the right of the mode. Mode < Median < Mean. Slide 34

Left Skewed Distribution Negatively skewed: the mean and median are to the left of the mode. Mean < Median < Mode. Slide 35

Skewness Example: The lengths of stay on the cancer floor of Korle Bu Hospital were organized into a frequency distribution. The mean length of stay is 28 days, the median is 25 days and the modal length is 23 days. The standard deviation was computed to be 4.2 days. Is the distribution symmetrical, positively skewed or negatively skewed? Calculate the coefficient of skewness and interpret it. Slide 36

Skewness The distribution is positively skewed because the mean is the largest of the three measures of central tendency. Sk = 3(Mean - Median) / (Standard deviation), so Sk = 3(28 - 25)/4.2 = 2.14. The coefficient of skewness generally lies between -3 and +3. The coefficient of 2.14 indicates a substantial amount of skewness, and shows that a few cancer patients are staying in the hospital for a long time, causing the mean to be larger than the median or mode. Slide 37
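
The Korle Bu calculation can be reproduced with a short Python sketch. The function names `skewness_coefficient` and `describe_skew` are hypothetical, chosen for this illustration; the formula is the one given on the slides.

```python
def skewness_coefficient(mean, median, std_dev):
    """Sk = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


def describe_skew(sk):
    """Classify a distribution from the sign of its skewness coefficient."""
    if sk > 0:
        return "positively (right) skewed"
    if sk < 0:
        return "negatively (left) skewed"
    return "symmetrical"


# Length-of-stay figures from the example, in days.
sk = skewness_coefficient(mean=28, median=25, std_dev=4.2)
print(round(sk, 2), describe_skew(sk))
```

The sign of the coefficient follows the sign of (mean - median), which is why a mean above the median signals positive skewness.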
