SUMMARY STATISTICS EXAMPLES AND ACTIVITIES


Session 6 SUMMARY STATISTICS EXAMPLES AND ACTIVITIES

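Example 1.1
Expand the following:
1. $\sum_{i=1}^{N} X_i$
2. $\sum_{i=4}^{6} X_i$
3. $\sum_{i=3}^{5} X_i^2$
4. $\sum_{i=2}^{4} (X_i - 4)$

Solution
1. $\sum_{i=1}^{N} X_i = X_1 + X_2 + X_3 + \dots + X_N$
2. $\sum_{i=4}^{6} X_i = X_4 + X_5 + X_6$
3. $\sum_{i=3}^{5} X_i^2 = X_3^2 + X_4^2 + X_5^2$
4. $\sum_{i=2}^{4} (X_i - 4) = (X_2 - 4) + (X_3 - 4) + (X_4 - 4)$

Example 1.2
Write the following in symbols:
5. $X_1 + X_2 + \dots + X_N$
6. $X_4 + X_5 + X_6$
7. $(X_3 - 5) + (X_4 - 5) + (X_5 - 5)$

Solution
5. $X_1 + X_2 + \dots + X_N = \sum_{i=1}^{N} X_i$
6. $X_4 + X_5 + X_6 = \sum_{i=4}^{6} X_i$
7. $(X_3 - 5) + (X_4 - 5) + (X_5 - 5) = \sum_{i=3}^{5} (X_i - 5)$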

Activity 1.1
Expand the following:
1. $\sum X$
2. 4 6 X
3. 3 8 5 X

Write the following in sigma notation:
4. $X_2 + X_3 + \dots + X_N$
5. $X_5 + X_6 + X_7 + X_8$
6. $(X_5 - 3) + (X_6 - 3) + (X_7 - 3)$

Example 1.3 Raw Data
Find the mean of the values 4, 5, 2, 3, 10.

$\bar{X} = \frac{\sum X}{N} = \frac{4 + 5 + 2 + 3 + 10}{5} = \frac{24}{5} = 4.8$

Example 1.4 Grouped Frequency Data
Find the arithmetic mean of the following data: 2, 2, 2, 3, 3, 4, 4, 4, 5, 5

Solution
We may obtain the mean using formula (5.1):

$\bar{X} = \frac{\sum fx}{\sum f} = \frac{3(2) + 2(3) + 3(4) + 2(5)}{3 + 2 + 3 + 2} = \frac{34}{10} = 3.4$

Example 1.5
The table below is a grouped frequency distribution of the ages of 34 children. Use it to find the mean age of the children.

Age       f     x     fx
1-3       5     2     10
4-6       6     5     30
7-9      10     8     80
10-12     4    11     44
13-15     6    14     84
16-18     3    17     51
Total    34          299

$\bar{X} = \frac{\sum fx}{\sum f} = \frac{299}{34} = 8.8$

This is called the long method. When the class intervals contain large values, formula (5.1) makes the computation tedious. To avoid this, I will introduce two short methods: (i) the Assumed Mean method and (ii) the Coding method.

Activity 1.2
Use the following frequency distribution to find the arithmetic mean:

Marks     Frequency
10-19         2
20-29         3
30-39         8
40-49         5
50-59         2

Example 1.6
Using the data in Example 1.5 and an assumed mean A = 11, find the mean age of the children.

Solution

Age       f     x    d = x - A     fd
1-3       5     2       -9        -45
4-6       6     5       -6        -36
7-9      10     8       -3        -30
10-12     4    11        0          0
13-15     6    14       +3         18
16-18     3    17       +6         18
Total    34                       -75

Using (5.2), we have:

$\bar{X} = A + \frac{\sum fd}{\sum f} = 11 + \frac{-75}{34} = 11 - 2.2 = 8.8$
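The long method of Example 1.5 and the assumed mean method of Example 1.6 can be checked against each other with a short Python sketch. The function names `grouped_mean` and `assumed_mean` are illustrative choices, not part of the handout; the midpoints and frequencies are those of the age table.

```python
# Grouped ages from Example 1.5 as (class midpoint x, frequency f) pairs.
classes = [(2, 5), (5, 6), (8, 10), (11, 4), (14, 6), (17, 3)]


def grouped_mean(classes):
    """Long method: sum(f * x) / sum(f)."""
    total_f = sum(f for _, f in classes)
    total_fx = sum(x * f for x, f in classes)
    return total_fx / total_f


def assumed_mean(classes, a):
    """Assumed mean method: A + sum(f * d) / sum(f), where d = x - A."""
    total_f = sum(f for _, f in classes)
    total_fd = sum(f * (x - a) for x, f in classes)
    return a + total_fd / total_f


long_result = grouped_mean(classes)
short_result = assumed_mean(classes, a=11)
print(round(long_result, 1), round(short_result, 1))  # both round to 8.8
```

Any assumed mean A gives the same answer; choosing a class midpoint near the centre of the table simply keeps the deviations d small.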

Example 1.7
Use the coding formula to obtain the mean for the following data:

Exact Class Boundaries    Midpoint (x)    Frequency (f)
0-10                            5              12
10-20                          15              15
20-30                          25              28
30-40                          35              25
40-50                          45              20
Total                                         100

Solution
Here, c = 10 and we take A = 25. Therefore:

$u = \frac{x - 25}{10}$

So we have:

Midpoint (x)    Frequency (f)    u = (x - 25)/10     fu
 5                  12                -2            -24
15                  15                -1            -15
25                  28                 0              0
35                  25                 1             25
45                  20                 2             40
Total              100                               26

$\bar{u} = \frac{\sum fu}{\sum f} = \frac{26}{100} = 0.26$

To find the mean, we use formula (5.3):

$\bar{X} = A + c\bar{u} = 25 + 10(0.26) = 27.6$
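As a rough check on Example 1.7, the coding method can be sketched in a few lines of Python. The helper name `coded_mean` is hypothetical; A = 25 and c = 10 are the values chosen in the example.

```python
# Midpoints and frequencies from the table in Example 1.7.
midpoints = [5, 15, 25, 35, 45]
freqs = [12, 15, 28, 25, 20]


def coded_mean(midpoints, freqs, a, c):
    """Coding method: A + c * (sum(f * u) / sum(f)), where u = (x - A) / c."""
    total_f = sum(freqs)
    total_fu = sum(f * (x - a) / c for x, f in zip(midpoints, freqs))
    return a + c * total_fu / total_f


coded = coded_mean(midpoints, freqs, a=25, c=10)
print(round(coded, 1))  # 27.6
```

The coded values u are small integers only when every class has the same width c, which is why the method is tied to equal class intervals.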

Activity 1.3
(a) Using an assumed mean of 34.5, compute the arithmetic mean for the data in Activity 1.2.
(b) Use the coding method to obtain the mean for the data in Activity 1.2.

Activity 1.4
Determine the median for the following ungrouped data:
(a) 10, 5, 13, 12, 15, 17, 21, 7, 9
(b) 10, 8, 7, 15, 21, 22, 23, 14, 8, 20

Example 1.8
Find the median for the table of age distribution of 34 children below:

Age       Frequency (f)    Less-than cumulative frequency
1-3            5                    5
4-6            6                   11
7-9           10                   21
10-12          4                   25
13-15          6                   31
16-18          3                   34
              N = 34

Step 1
Find N/2 of cases = 34/2 = 17. This means we need to count 17 cases out of 34 to locate the class in which the median lies.

Step 2
The lowest class (1-3) has 5 cases, so the median does not lie in that class. The next class (4-6) has 6 cases; added to the 5 in the previous class, this gives 11 cases cumulatively, so the median does not lie in the (4-6) class either. The next class (7-9) has 10 cases. Adding these 10 cases to the 11 we have gives 21, which is more than the 17 cases we are looking for, so we cannot take all 10 cases in this class. The median therefore lies in the (7-9) class, which becomes the median class.

Step 3
Having determined the median class, we can now compute the median using the following quantities:
$L_1$ = Lower class boundary of the class in which the median falls = 6.5
$N$ = Total frequency = 34
$\sum f_1$ = Sum of frequencies in classes below the median class = 11
$f_{med}$ = Frequency of the median class = 10
$c$ = Class width for the table = 3

Step 4
Substituting in the expression, we have:

$\text{Median} = L_1 + \left(\frac{N/2 - \sum f_1}{f_{med}}\right) c = 6.5 + \left(\frac{34/2 - 11}{10}\right) 3 = 6.5 + \left(\frac{6}{10}\right) 3 = 6.5 + 1.8 = 8.3$
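Steps 1 to 4 above can be sketched as a small Python function. This is only an illustration: the name `grouped_median` and the list of lower class boundaries (0.5, 3.5, ...) are assumptions derived from the age table, where each class is 3 units wide.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median = L1 + ((N/2 - cumulative f below) / f_med) * c for grouped data."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cumulative = 0  # frequencies in the classes below the current one
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= half:
            # First class whose cumulative frequency reaches N/2: the median class.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


# Age distribution of 34 children (classes 1-3, 4-6, ..., 16-18).
lower_boundaries = [0.5, 3.5, 6.5, 9.5, 12.5, 15.5]
freqs = [5, 6, 10, 4, 6, 3]
median_age = grouped_median(lower_boundaries, freqs, width=3)
print(round(median_age, 1))  # 8.3
```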

Activity 1.5
List three examples of data you can compute the median from.
(a) -------------------------------------------
(b) -------------------------------------------
(c) -------------------------------------------

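Example 1.9
From the table of age distribution of 34 children, the following values are computed for the modal class (7-9):
$L_1 = 6.5$, $\Delta_1 = 4$, $\Delta_2 = 6$, $c = 3$

Here $\Delta_1 = 10 - 6$ is the difference between the frequency of the modal class and that of the class before it, and $\Delta_2 = 10 - 4$ is the difference between the frequency of the modal class and that of the class after it.

Age Distribution of 34 Children

Age       f
1-3       5
4-6       6
7-9      10
10-12     4
13-15     6
16-18     3
Total    34

$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c = 6.5 + \left(\frac{4}{4 + 6}\right) 3 = 6.5 + 1.2 = 7.7$

Example 1.10
Given that for a particular distribution the mean = 125.7 and the median = 124.9, find the mode.

$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean} = 3(124.9) - 2(125.7) = 123.3$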
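The two mode calculations in Examples 1.9 and 1.10 can be reproduced with the following Python sketch. The function names are illustrative, and treating a missing neighbouring class as having frequency 0 is an assumption made here for edge classes, not something stated in the handout.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode = L1 + (d1 / (d1 + d2)) * c, assuming a single modal class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    # Assumption: a modal class at either end has a neighbour of frequency 0.
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    return lower_boundaries[i] + d1 / (d1 + d2) * width


def empirical_mode(mean, median):
    """Empirical relation used in Example 1.10: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


lower_boundaries = [0.5, 3.5, 6.5, 9.5, 12.5, 15.5]
freqs = [5, 6, 10, 4, 6, 3]
mode_age = grouped_mode(lower_boundaries, freqs, width=3)
mode_from_relation = empirical_mode(mean=125.7, median=124.9)
print(round(mode_age, 1), round(mode_from_relation, 1))  # 7.7 123.3
```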

Example 1.11
Find $Q_1$ using the sampled data in an ordered array: 11 12 13 16 16 17 18 21 22

Solution
First, note that n = 9. $Q_1$ is in the (9 + 1)/4 = 2.5 ranked position of the ranked data, so use the value halfway between the 2nd and 3rd ranked values: $Q_1 = 12.5$.

Activity 1.5
Using the Example, compute $Q_2$ and $Q_3$.

Example 2.1
Compute the MD for the following data: 72, 81, 86, 69, 57

Solution
$\sum X = 365$, $\bar{X} = \frac{365}{5} = 73$

X        $|X - \bar{X}|$
72             1
81             8
86            13
69             4
57            16
Total         42

$MD = \frac{\sum |X - \bar{X}|}{N} = \frac{42}{5} = 8.4$

This means that, on average, the values differ from the mean in absolute terms by 8.4.

Example 2.2
Using the raw data constituting the population of values 72, 81, 86, 69, 57, the population standard deviation is computed as follows:

X        $X - \mu$     $(X - \mu)^2$
72          -1              1
81           8             64
86          13            169
69          -4             16
57         -16            256
Total                     506

$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{506}{5}} = \sqrt{101.2} = 10.06$

Example 2.3
The population standard deviation σ for the data set 72, 81, 86, 69, 57 is computed as follows:

X         $X^2$
72        5,184
81        6,561
86        7,396
69        4,761
57        3,249
365      27,151

$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{27{,}151}{5} - \left(\frac{365}{5}\right)^2} = \sqrt{101.2} = 10.059$
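Examples 2.1 to 2.3 all use the same five values, so a single Python sketch can confirm the mean deviation and show that the definitional and shortcut formulas for the population standard deviation agree. The variable names are illustrative only.

```python
from math import sqrt

values = [72, 81, 86, 69, 57]
n = len(values)
mean = sum(values) / n  # 365 / 5 = 73.0

# Example 2.1: mean deviation, the average absolute distance from the mean.
mean_deviation = sum(abs(x - mean) for x in values) / n

# Example 2.2: population standard deviation from squared deviations.
sigma_definition = sqrt(sum((x - mean) ** 2 for x in values) / n)

# Example 2.3: shortcut form using sum(X^2) and sum(X) only.
sigma_shortcut = sqrt(sum(x * x for x in values) / n - mean ** 2)

print(mean_deviation)              # 8.4
print(round(sigma_definition, 2))  # 10.06
print(round(sigma_shortcut, 2))    # 10.06
```

The shortcut form avoids computing each deviation, which is why it was popular for hand calculation; with floating-point arithmetic the definitional form is usually the safer choice.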

Example 2.4

Marks     Frequency (f)    Class Midpoint (X)    $X^2$      fX        $fX^2$
60-62           5                 61             3,721      305      18,605
63-65          18                 64             4,096    1,152      73,728
66-68          42                 67             4,489    2,814     188,538
69-71          27                 70             4,900    1,890     132,300
72-74           8                 73             5,329      584      42,632
Total         100                                         6,745     455,803

$\sigma = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2} = \sqrt{\frac{455{,}803}{100} - \left(\frac{6{,}745}{100}\right)^2} = \sqrt{8.5275} = 2.92$

Example 2.5
Using the data from the table in Example 2.4 and an assumed mean A = 67, we develop the following working sheet:

Marks     Frequency (f)    Class Midpoint (X)    d = X - A     fd      $fd^2$
60-62           5                 61                -6         -30      180
63-65          18                 64                -3         -54      162
66-68          42                 67                 0           0        0
69-71          27                 70                 3          81      243
72-74           8                 73                 6          48      288
Total         100                                               45      873

From the table above, we have the following sums: $\sum f = 100$, $\sum fd = 45$, $\sum fd^2 = 873$

$s = \sqrt{\frac{\sum fd^2 - \frac{(\sum fd)^2}{\sum f}}{\sum f - 1}} = \sqrt{\frac{873 - \frac{45^2}{100}}{100 - 1}} = \sqrt{8.6136} = 2.93$
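The small difference between 2.92 in Example 2.4 and 2.93 in Example 2.5 comes from dividing by Σf in one case and by Σf - 1 in the other. A Python sketch, with hypothetical function names, makes this explicit for the marks table:

```python
from math import sqrt

# Class midpoints and frequencies from the marks table in Example 2.4.
midpoints = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]


def grouped_population_sd(midpoints, freqs):
    """Example 2.4: sqrt(sum(f*X^2)/N - (sum(f*X)/N)^2), dividing by N."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


def grouped_sample_sd(midpoints, freqs, a):
    """Example 2.5: assumed mean form with d = X - A, dividing by N - 1."""
    n = sum(freqs)
    sum_fd = sum(f * (x - a) for x, f in zip(midpoints, freqs))
    sum_fd2 = sum(f * (x - a) ** 2 for x, f in zip(midpoints, freqs))
    return sqrt((sum_fd2 - sum_fd ** 2 / n) / (n - 1))


sigma = grouped_population_sd(midpoints, freqs)
s = grouped_sample_sd(midpoints, freqs, a=67)
print(round(sigma, 2), round(s, 2))  # 2.92 2.93
```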

Example 2.6
This method is recommended for grouped data when class interval sizes are equal. It is called the coding method.

Marks     Frequency (f)    Class Midpoint (X)    u = (X - 67)/3     fu     $fu^2$
60-62           5                 61                  -2            -10      20
63-65          18                 64                  -1            -18      18
66-68          42                 67                   0              0       0
69-71          27                 70                   1             27      27
72-74           8                 73                   2             16      32
Total         100                                                    15      97

From the table above, we have the following sums: c = 3, $\sum f = 100$, $\sum fu = 15$ and $\sum fu^2 = 97$

$s = c\sqrt{\frac{\sum fu^2 - \frac{(\sum fu)^2}{\sum f}}{\sum f - 1}} = 3\sqrt{\frac{97 - \frac{15^2}{100}}{100 - 1}} = 3\sqrt{0.9571} = 2.93$

Activity 2.1
1. Find the standard deviation of the following sets of data:
(a) 3, 6, 2, 1, 7, 5
(b) 3.2, 4.6, 2.8, 5.2, 4.4
2. The following is the age distribution of 40 students from the University of Ghana:

Age distribution of 40 Students

Age       (f)
18-22      8
23-27      7
28-32     10
33-37      5
38-42      7
43-47      3
Total     40

(a) Find the range and the mean deviation for the table.
(b) Compute the standard deviation for the age distribution of the students.
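For the coding method of Example 2.6, a minimal Python sketch might look like the following. The function name `coded_sample_sd` is hypothetical; A = 67 and c = 3 are the values used in the example's working table.

```python
from math import sqrt

# Class midpoints and frequencies from the marks table in Example 2.6.
midpoints = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]


def coded_sample_sd(midpoints, freqs, a, c):
    """Coding method: s = c * sqrt((sum(f*u^2) - sum(f*u)^2 / N) / (N - 1))."""
    n = sum(freqs)
    codes = [(x - a) / c for x in midpoints]  # u = (X - A) / c
    sum_fu = sum(f * u for f, u in zip(freqs, codes))
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, codes))
    return c * sqrt((sum_fu2 - sum_fu ** 2 / n) / (n - 1))


s_coded = coded_sample_sd(midpoints, freqs, a=67, c=3)
print(round(s_coded, 2))  # 2.93
```

Multiplying by c at the end undoes the division by c inside the codes, so the result matches the assumed mean calculation on the same table.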

Example 3.1
Suppose that two stocks, A and B, have the following summary statistics:
Stock A: Average price last year = GHC 50, Standard deviation = GHC 5
Stock B: Average price last year = GHC 100, Standard deviation = GHC 5
Which of these two stocks' prices is less variable?

Solution
$CV_A = \frac{s}{\bar{X}} \times 100\% = \frac{\text{GHC } 5}{\text{GHC } 50} \times 100\% = 10\%$

$CV_B = \frac{s}{\bar{X}} \times 100\% = \frac{\text{GHC } 5}{\text{GHC } 100} \times 100\% = 5\%$

From the two CVs computed, you can observe that the two stocks have the same standard deviation, but stock B is less variable relative to its price.

Activity 3.1
The points of two football teams in ten league seasons are given below. Which of the two teams is more consistent?
Team A: 32 28 47 63 71 39 10 60 96 14
Team B: 19 31 48 53 67 90 10 62 40 80
[Hint: Compute the CV of each team and compare the results]

Example 3.2
Suppose the mean math SAT score is 490, with a standard deviation of 100. Compute the z-score for a test score of 620.

Solution
$Z = \frac{X - \bar{X}}{s} = \frac{620 - 490}{100} = \frac{130}{100} = 1.3$

A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
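The coefficient of variation and the z-score are both one-line formulas, so Examples 3.1 and 3.2 can be verified with a very small Python sketch. The function names are illustrative.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: (s / mean) * 100."""
    return 100 * sd / mean


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


cv_a = coefficient_of_variation(sd=5, mean=50)   # Stock A
cv_b = coefficient_of_variation(sd=5, mean=100)  # Stock B
z = z_score(620, mean=490, sd=100)

print(cv_a, cv_b)  # 10.0 5.0
print(z)           # 1.3
```

Because the CV is unit-free, it allows a comparison like the one in Example 3.1, where equal standard deviations sit on very different average prices.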

Activity 3.2
The average performance in mathematics in school A is 80, with a standard deviation of 10. In school B the average is 70, with a standard deviation of 15. Adei, who is in school A, scored 75, while his friend Obonto, in school B, scored 65. Compare their relative performance in mathematics.

Example 4.1
If, for a given data set, it is found that $\bar{x} = 10$, Mode = 8 and $s = 4$, then Pearson's coefficient of skewness is:

$PS_k = \frac{\bar{x} - \text{Mode}}{s} = \frac{10 - 8}{4} = 0.5$

(i) Bowley's Coefficient of Skewness

$BS_k = \frac{Q_1 - 2Q_2 + Q_3}{Q_3 - Q_1}$

Example 4.2
If, for a given data set, it is found that $Q_1 = 20$, $Q_2 = 50$, $Q_3 = 80$, then we have:

$BS_k = \frac{Q_1 - 2Q_2 + Q_3}{Q_3 - Q_1} = \frac{20 - 2(50) + 80}{80 - 20} = 0$

Note that a skewness measure of zero indicates that the underlying distribution is symmetric.

Activity 4.1
The following is the age distribution of 40 students from the University of Ghana:

Age distribution of 40 Students

Age       (f)
18-22      8
23-27      7
28-32     10
33-37      5
38-42      7
43-47      3
Total     40

(a) Compute a measure of skewness and comment on your result.
(b) Compute a measure of kurtosis and comment on your results.

Activity 5.1
The following are age distributions for a group of pupils: 14, 8, 9, 5, 17, 12, 20, 7, 6, 2, 4, 7, 8, 9, 8, 17.
(a) Compute the quartile values for the distribution.
(b) Obtain a box-and-whisker plot for the distribution and use it to describe the shape of the distribution.
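The quartile rule used in Example 1.11 (take the value at ranked position p(n + 1), interpolating between neighbours) and the two skewness coefficients of Examples 4.1 and 4.2 can be sketched in Python as follows. The function names are hypothetical, and clamping positions that fall outside the data to the smallest or largest value is an assumption added here for completeness.

```python
import math


def quartile(sorted_values, p):
    """Value at ranked position p * (n + 1), interpolating between neighbours."""
    n = len(sorted_values)
    position = p * (n + 1)
    lower_rank = math.floor(position)
    fraction = position - lower_rank
    # Assumption: positions outside 1..n are clamped to the end values.
    if lower_rank < 1:
        return sorted_values[0]
    if lower_rank >= n:
        return sorted_values[-1]
    below = sorted_values[lower_rank - 1]  # ranks are 1-based, lists 0-based
    above = sorted_values[lower_rank]
    return below + fraction * (above - below)


def pearson_skewness(mean, mode, sd):
    """Pearson's coefficient: (mean - mode) / s."""
    return (mean - mode) / sd


def bowley_skewness(q1, q2, q3):
    """Bowley's coefficient: (Q1 - 2*Q2 + Q3) / (Q3 - Q1)."""
    return (q1 - 2 * q2 + q3) / (q3 - q1)


ordered = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # data from Example 1.11
q1, q2, q3 = (quartile(ordered, p) for p in (0.25, 0.5, 0.75))
print(q1)                            # 12.5, as in Example 1.11
print(pearson_skewness(10, 8, 4))    # 0.5, as in Example 4.1
print(bowley_skewness(20, 50, 80))   # 0.0, as in Example 4.2
```

Other quartile conventions exist and can give slightly different values on small samples, so results from software may not match the p(n + 1) rule exactly.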

Section 6: Exploratory Data Analysis using Excel

Introduction
Welcome to Section 6. Here we illustrate the computation of the summary measures discussed in Unit 5: the mean, median, mode, standard deviation, variance, the quartiles, coefficient of skewness, coefficient of kurtosis and the box-and-whisker plot.

Objectives
At the end of this section, you will be able to use Excel to obtain summary measures to describe the centre, spread and shape of a distribution.

General Descriptive Statistics Using Microsoft Excel
We illustrate how to obtain general descriptive statistics of a distribution using Microsoft Excel.

Example
The following are the marks obtained by 50 students in an examination:

83 64 84 76 84 54 75 59 70 61
63 80 84 73 68 52 65 90 52 77
95 36 78 61 59 84 95 47 87 60
54 79 47 64 36 45 51 24 78 26
33 60 18 68 35 58 48 53 55 45

Solution
1. Enter the data into one column of cells as shown.
2. Select Data.
3. Select Data Analysis.
4. Select Descriptive Statistics and click OK.
5. Select the cell range that contains the data and put it into the Input Range: box.
6. Check Labels in first row.
7. Select New Worksheet Ply: to store the output in a new worksheet.
8. Select Summary statistics.
9. Click OK.

[Figure: Microsoft Excel descriptive statistics output]
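For readers without Excel, several of the same summary measures can be obtained for the 50 marks with Python's standard `statistics` module. This is a sketch for cross-checking only, not a reproduction of the Excel output; the dictionary keys are illustrative labels, and `stdev` and `variance` are the sample (n - 1) versions.

```python
import statistics

# Marks obtained by the 50 students in the example above.
marks = [
    83, 64, 84, 76, 84, 54, 75, 59, 70, 61,
    63, 80, 84, 73, 68, 52, 65, 90, 52, 77,
    95, 36, 78, 61, 59, 84, 95, 47, 87, 60,
    54, 79, 47, 64, 36, 45, 51, 24, 78, 26,
    33, 60, 18, 68, 35, 58, 48, 53, 55, 45,
]

summary = {
    "count": len(marks),
    "sum": sum(marks),
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "mode": statistics.mode(marks),
    "sample standard deviation": statistics.stdev(marks),
    "sample variance": statistics.variance(marks),
    "minimum": min(marks),
    "maximum": max(marks),
    "range": max(marks) - min(marks),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```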

Activity 5.1
Collect data on the ages of a sample of 50 students in your class. Enter the data in Excel and perform an exploratory data analysis to obtain summary statistical measures. Write a brief report of your findings in not more than one page.