Measures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms


Stat 13D, UCLA, Ivo Dinov

Measures of Central Tendency: Ungrouped Data
Measures of central tendency yield information about particular places or locations in a group of numbers.
Common measures of location: mode, median, mean, percentiles, quartiles.

Mode
The most frequently occurring value in a data set.
Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio).
Bimodal -- data sets that have two modes.
Multimodal -- data sets that contain more than two modes.

Mode -- Example
The mode is 44. There are more 44s than any other value.

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48

Median
Middle value in an ordered array of numbers.
Applicable for ordinal, interval, and ratio data.
Not applicable for nominal data.
Unaffected by extremely large and extremely small values.

Median: Computational Procedure
First procedure: arrange the observations in an ordered array. If there is an odd number of terms, the median is the middle term of the ordered array. If there is an even number of terms, the median is the average of the middle two terms.
Second procedure: the median's position in an ordered array is given by $(n+1)/2$.

Median: Example with an Odd Number of Terms
Ordered array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
There are 17 terms in the ordered array.
Position of median $= (n+1)/2 = (17+1)/2 = 9$.
The median is the 9th term, 15.
If the 22 is replaced by 100, the median is 15.
If the 3 is replaced by -103, the median is 15.
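The first computational procedure for the median can be sketched in C++ as follows. This is a minimal illustration rather than part of the original slides; the function name `median` is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of ungrouped data (hypothetical helper, not from the slides):
// sort a copy of the values, then take the middle term (odd n) or the
// average of the two middle terms (even n).
double median(std::vector<double> values) {
    assert(!values.empty());               // the median of no data is undefined
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) {
        return values[n / 2];              // 1-based position (n + 1) / 2
    }
    return (values[n / 2 - 1] + values[n / 2]) / 2.0;
}
```

Because only the middle of the ordered array is inspected, replacing the largest or smallest value with a more extreme one leaves the result unchanged, which is the property the median slides emphasize.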

Median: Example with an Even Number of Terms
Ordered array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
There are 16 terms in the ordered array.
Position of median $= (n+1)/2 = (16+1)/2 = 8.5$.
The median is between the 8th and 9th terms, 14.5.
If the 21 is replaced by 100, the median is 14.5.
If the 3 is replaced by -88, the median is 14.5.

Arithmetic Mean
Commonly called the mean; it is the average of a group of numbers.
Applicable for interval and ratio data.
Not applicable for nominal or ordinal data.
Affected by each value in the data set, including extreme values.
Computed by summing all values in the data set and dividing the sum by the number of values in the data set.

Population Mean
$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$

Sample Mean
$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{57 + 86 + 42 + 38 + 90 + 66}{6} = \frac{379}{6} = 63.167$

Percentiles
Measures of central tendency that divide a group of data into 100 parts.
At least n% of the data lie below the nth percentile, and at most (100 - n)% of the data lie above the nth percentile.
Example: the 90th percentile indicates that at least 90% of the data lie below it, and at most 10% of the data lie above it.
The median and the 50th percentile are the same.
Applicable for ordinal, interval, and ratio data.
Not applicable for nominal data.

Percentiles: Computational Procedure
Organize the data into an ascending ordered array.
Calculate the percentile location: $i = \frac{P}{100}(n)$, where P is the percentile of interest and n is the number of values.
Determine the percentile's location and its value.
If i is a whole number, the percentile is the average of the values at the i and (i+1) positions.
If i is not a whole number, the percentile is at the whole-number part of the (i+1) position in the ordered array.
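The percentile procedure above can be sketched in C++ as follows. This is an illustrative sketch under the stated location rule only (other textbooks and software use different interpolation rules); the function name `percentile` and its parameter names are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// P-th percentile using the location index i = (P / 100) * n
// (hypothetical helper following the procedure described in the slides).
double percentile(std::vector<double> values, double p) {
    assert(!values.empty());
    assert(p > 0.0 && p < 100.0);
    std::sort(values.begin(), values.end());    // ascending ordered array
    const double n = static_cast<double>(values.size());
    const double i = p * n / 100.0;              // location index
    const double whole = std::floor(i);
    const std::size_t k = static_cast<std::size_t>(whole);
    if (i == whole) {
        // i is a whole number: average the values at 1-based positions i and i + 1.
        return (values[k - 1] + values[k]) / 2.0;
    }
    // i is not a whole number: take the 1-based position floor(i) + 1.
    return values[k];
}
```

For the eight values 14, 12, 19, 23, 5, 13, 28, 17, calling `percentile(data, 30)` gives the location index 2.4 and therefore returns the value in the 3rd position of the ordered array. Quartiles follow from the same function with P set to 25, 50, and 75.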

Percentiles: Example
Raw data: 14, 12, 19, 23, 5, 13, 28, 17
Ordered array: 5, 12, 13, 14, 17, 19, 23, 28
Location of the 30th percentile: $i = \frac{30}{100}(8) = 2.4$
The location index, i, is not a whole number; i + 1 = 2.4 + 1 = 3.4; the whole-number portion is 3; the 30th percentile is at the 3rd location of the array; the 30th percentile is 13.

Quartiles
Measures of central tendency that divide a group of data into four subgroups.
$Q_1$: 25% of the data set is below the first quartile.
$Q_2$: 50% of the data set is below the second quartile.
$Q_3$: 75% of the data set is below the third quartile.
$Q_1$ is equal to the 25th percentile.
$Q_2$ is located at the 50th percentile and equals the median.
$Q_3$ is equal to the 75th percentile.
Quartile values are not necessarily members of the data set.

Quartiles
[Figure: $Q_1$, $Q_2$, and $Q_3$ divide the ordered data into four parts of 25% each.]

Quartiles: Example
Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
$Q_1$: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
$Q_2$: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
$Q_3$: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$

Variability
[Figure: two panels plotted against Time, "No Variability in Cash Flow" and "Variability in Cash Flow".]

Measures of Variability: Ungrouped Data
Measures of variability describe the spread or the dispersion of a set of data.
Common measures of variability: range, interquartile range, mean absolute deviation, variance, standard deviation, Z scores, coefficient of variation.

Range
The difference between the largest and the smallest values in a set of data.
Simple to compute.
Ignores all data points except the two extremes.
Example: Range = Largest - Smallest = 48 - 35 = 13

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48

Interquartile Range
Range of values between the first and third quartiles.
Range of the middle half.
Less influenced by extremes.
Interquartile Range $= Q_3 - Q_1$

Deviation from the Mean
Data set: 5, 9, 16, 17, 18
Mean: $\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$
Deviations from the mean: -8, -4, 3, 4, 5
[Figure: number line showing each deviation (-8, -4, +3, +4, +5) from $\mu$.]

Mean Absolute Deviation
Average of the absolute deviations from the mean.

X      X - µ    |X - µ|
5      -8       8
9      -4       4
16     +3       3
17     +4       4
18     +5       5
Sum     0       24

$M.A.D. = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$

Population Variance
Average of the squared deviations from the arithmetic mean.

X      X - µ    (X - µ)^2
5      -8       64
9      -4       16
16     +3       9
17     +4       16
18     +5       25
Sum     0       130

$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
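The deviation-based measures above (mean absolute deviation, population variance, and its square root) can be computed in one pass over the deviations. The sketch below is illustrative only; the struct `Dispersion` and the function `populationDispersion` are hypothetical names, not part of the slides.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Population dispersion measures for ungrouped data (hypothetical helper).
struct Dispersion {
    double mean;      // mu
    double mad;       // mean absolute deviation
    double variance;  // sigma^2, dividing by N
    double stddev;    // sigma
};

Dispersion populationDispersion(const std::vector<double>& x) {
    assert(!x.empty());
    const double n = static_cast<double>(x.size());
    double sum = 0.0;
    for (double v : x) sum += v;
    const double mean = sum / n;

    double absSum = 0.0;  // sum of |X - mu|
    double sqSum = 0.0;   // sum of (X - mu)^2
    for (double v : x) {
        const double d = v - mean;
        absSum += std::fabs(d);
        sqSum += d * d;
    }
    const double variance = sqSum / n;
    return {mean, absSum / n, variance, std::sqrt(variance)};
}
```

For the data set 5, 9, 16, 17, 18 the deviations sum to zero, which is why the absolute or squared deviations are averaged instead of the raw ones.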

Population Standard Deviation
Square root of the variance.
For the data set 5, 9, 16, 17, 18 with $\mu = 13$, the squared deviations 64, 16, 9, 16, 25 sum to 130.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} = 5.1$

Sample Variance
Average of the squared deviations from the arithmetic mean, computed with n - 1 in the denominator.

X        X - X̄    (X - X̄)^2
2,398    625      390,625
1,844    71       5,041
1,539    -234     54,756
1,311    -462     213,444
Sum 7,092  0      663,866

$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$

Sample Standard Deviation
Square root of the sample variance.
$S = \sqrt{S^2} = \sqrt{221{,}288.67} = 470.41$

Uses of Standard Deviation
Indicator of financial risk.
Quality control: construction of quality control charts, process capability studies.
Comparing populations: household incomes in two cities, employee absenteeism at two companies.

Standard Deviation as an Indicator of Financial Risk

Financial Security    Annualized Rate of Return: µ    σ
A                     15%                             3%
B                     15%                             7%

Empirical Rule
Data are normally distributed (or approximately normal).

Distance from the Mean    Percentage of Values Falling Within Distance
µ ± 1σ                    68
µ ± 2σ                    95
µ ± 3σ                    99.7
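The sample variance and sample standard deviation differ from the population versions only in the n - 1 denominator. A minimal C++ sketch, with the hypothetical function names `sampleVariance` and `sampleStdDev`:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sample variance S^2: squared deviations from the sample mean,
// divided by n - 1 (hypothetical helper, not from the slides).
double sampleVariance(const std::vector<double>& x) {
    assert(x.size() >= 2);                 // n - 1 must be positive
    const double n = static_cast<double>(x.size());
    double sum = 0.0;
    for (double v : x) sum += v;
    const double mean = sum / n;

    double sqSum = 0.0;
    for (double v : x) sqSum += (v - mean) * (v - mean);
    return sqSum / (n - 1.0);
}

// Sample standard deviation S: square root of the sample variance.
double sampleStdDev(const std::vector<double>& x) {
    return std::sqrt(sampleVariance(x));
}
```

Applied to the four values 2,398, 1,844, 1,539, and 1,311, these functions reproduce the figures in the sample variance and sample standard deviation slides to two decimal places.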

Chebyshev's Theorem
Applies to all distributions.
$P(\mu - k\sigma < X < \mu + k\sigma) \geq 1 - \frac{1}{k^2}$ for $k > 1$

Chebyshev's Theorem
Applies to all distributions.

Number of Standard Deviations    Distance from the Mean    Minimum Proportion of Values Falling Within Distance
K = 2                            µ ± 2σ                    1 - (1/2)^2 = .75
K = 3                            µ ± 3σ                    1 - (1/3)^2 = .89
K = 4                            µ ± 4σ                    1 - (1/4)^2 = .94

Coefficient of Variation
Ratio of the standard deviation to the mean, expressed as a percentage.
Measurement of relative dispersion.
$C.V. = \frac{\sigma}{\mu}(100)$

Coefficient of Variation: Example
$\mu_1 = 29$, $\sigma_1 = 4.6$: $C.V._1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$
$\mu_2 = 84$, $\sigma_2 = 10$: $C.V._2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$

Measures of Central Tendency and Variability: Grouped Data
Measures of central tendency: mean, median, mode.
Measures of variability: variance, standard deviation, mean absolute deviation.

Mean of Grouped Data
Weighted average of class midpoints.
Class frequencies are the weights.
$\mu = \frac{\sum fM}{\sum f} = \frac{f_1 M_1 + f_2 M_2 + f_3 M_3 + \dots + f_i M_i}{f_1 + f_2 + f_3 + \dots + f_i}$
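Chebyshev's bound and the coefficient of variation are both one-line formulas, so a short sketch may help make the table values concrete. The function names `chebyshevLowerBound` and `coefficientOfVariation` are assumptions made for this illustration.

```cpp
#include <cassert>

// Minimum proportion of values within k standard deviations of the mean,
// for any distribution (Chebyshev's theorem, valid for k > 1).
double chebyshevLowerBound(double k) {
    assert(k > 1.0);
    return 1.0 - 1.0 / (k * k);
}

// Coefficient of variation: standard deviation relative to the mean,
// expressed as a percentage.
double coefficientOfVariation(double stddev, double mean) {
    assert(mean != 0.0);
    return stddev / mean * 100.0;
}
```

Because the coefficient of variation is unit-free, it allows the spread of two data sets with very different means to be compared directly, which is the point of the two-population example.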

Calculation of Grouped Mean

Class Interval    Frequency (f)    Class Midpoint (M)    fM
20-under 30       6                25                    150
30-under 40       18               35                    630
40-under 50       11               45                    495
50-under 60       11               55                    605
60-under 70       3                65                    195
70-under 80       1                75                    75
Total             50                                     2,150

$\mu = \frac{\sum fM}{\sum f} = \frac{2{,}150}{50} = 43.0$

Median of Grouped Data
$Median = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W)$
Where:
L = the lower limit of the median class
$cf_p$ = cumulative frequency of the class preceding the median class
$f_{med}$ = frequency of the median class
W = width of the median class
N = total of frequencies

Median of Grouped Data -- Example

Class Interval    Frequency    Cumulative Frequency
20-under 30       6            6
30-under 40       18           24
40-under 50       11           35
50-under 60       11           46
60-under 70       3            49
70-under 80       1            50
N = 50

$Md = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W) = 40 + \frac{\frac{50}{2} - 24}{11}(10) = 40.909$

Mode of Grouped Data
Midpoint of the modal class.
Modal class has the greatest frequency.
In the frequency table above, the modal class is 30-under 40, with frequency 18.
$Mode = \frac{30 + 40}{2} = 35$

Variance and Standard Deviation of Grouped Data
Population: $\sigma^2 = \frac{\sum f(M - \mu)^2}{N}$, $\sigma = \sqrt{\sigma^2}$
Sample: $S^2 = \frac{\sum f(M - \bar{X})^2}{n - 1}$, $S = \sqrt{S^2}$

Population Variance and Standard Deviation of Grouped Data

Class Interval    f     M     fM       M - µ    (M - µ)^2    f(M - µ)^2
20-under 30       6     25    150      -18      324          1,944
30-under 40       18    35    630      -8       64           1,152
40-under 50       11    45    495      2        4            44
50-under 60       11    55    605      12       144          1,584
60-under 70       3     65    195      22       484          1,452
70-under 80       1     75    75       32       1,024        1,024
Total             50          2,150                          7,200

$\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{7{,}200}{50} = 144$
$\sigma = \sqrt{\sigma^2} = \sqrt{144} = 12$
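The grouped-data formulas above all operate on the same frequency table, so they can share one small data structure. The sketch below is an illustration under that assumption; `ClassInterval`, `groupedMean`, `groupedMedian`, and `groupedPopulationVariance` are hypothetical names, and classes are assumed to be listed in ascending order.

```cpp
#include <cassert>
#include <vector>

// One row of a frequency table (hypothetical type for this sketch).
struct ClassInterval {
    double lower;      // lower class limit
    double upper;      // upper class limit ("under" this value)
    double frequency;  // f
};

double totalFrequency(const std::vector<ClassInterval>& classes) {
    double total = 0.0;
    for (const ClassInterval& c : classes) total += c.frequency;
    return total;
}

// mu = sum(f * M) / sum(f), where M is the class midpoint.
double groupedMean(const std::vector<ClassInterval>& classes) {
    assert(!classes.empty() && totalFrequency(classes) > 0.0);
    double fm = 0.0;
    for (const ClassInterval& c : classes) {
        fm += c.frequency * (c.lower + c.upper) / 2.0;
    }
    return fm / totalFrequency(classes);
}

// Median = L + ((N / 2 - cf_p) / f_med) * W.
double groupedMedian(const std::vector<ClassInterval>& classes) {
    assert(!classes.empty() && totalFrequency(classes) > 0.0);
    const double half = totalFrequency(classes) / 2.0;
    double cumulative = 0.0;  // cf_p: cumulative frequency before this class
    for (const ClassInterval& c : classes) {
        if (c.frequency > 0.0 && cumulative + c.frequency >= half) {
            return c.lower + (half - cumulative) / c.frequency * (c.upper - c.lower);
        }
        cumulative += c.frequency;
    }
    return classes.back().upper;  // not reached when frequencies are positive
}

// sigma^2 = sum(f * (M - mu)^2) / N.
double groupedPopulationVariance(const std::vector<ClassInterval>& classes) {
    const double mu = groupedMean(classes);
    double weighted = 0.0;
    for (const ClassInterval& c : classes) {
        const double m = (c.lower + c.upper) / 2.0;
        weighted += c.frequency * (m - mu) * (m - mu);
    }
    return weighted / totalFrequency(classes);
}
```

Each class is treated as if all of its observations sat at the midpoint, so these results are approximations of the values that the raw, ungrouped data would give.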

Measures of Shape
Skewness: absence of symmetry; extreme values in one side of a distribution.
Kurtosis: peakedness of a distribution. Leptokurtic: high and thin. Mesokurtic: normal shape. Platykurtic: flat and spread out.
Box and whisker plots: graphic display of a distribution; reveals skewness.

Skewness
[Figure: three distributions labeled Negatively Skewed (Left), Symmetric (Not Skewed), and Positively Skewed (Right).]

Skewness
[Figure: relative positions of the mean, median, and mode for negatively skewed, symmetric (not skewed), and positively skewed distributions.]

Coefficient of Skewness
Summary measure for skewness.
If S < 0, the distribution is negatively skewed (skewed to the left).
If S = 0, the distribution is symmetric (not skewed).
If S > 0, the distribution is positively skewed (skewed to the right).

Skewness & Kurtosis
What do we mean by symmetry and positive and negative skewness? Kurtosis? Properties?
$Skewness = \frac{\sum_{k=1}^{N} (Y_k - \bar{Y})^3}{(N - 1)\,SD^3}$; $Kurtosis = \frac{\sum_{k=1}^{N} (Y_k - \bar{Y})^4}{(N - 1)\,SD^4}$
Skewness is linearly invariant: Sk(aX+b) = Sk(X) for a > 0.
Skewness is a measure of asymmetry.
Kurtosis is a measure of flatness.
Both are used to quantify departures from the standard normal distribution: Skewness(StdNorm) = 0; Kurtosis(StdNorm) = 3.

Kurtosis
Peakedness of a distribution.
Leptokurtic: high and thin.
Mesokurtic: normal in shape.
Platykurtic: flat and spread out.
[Figure: leptokurtic, mesokurtic, and platykurtic curves.]
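The moment-based skewness and kurtosis formulas above can be sketched as follows. The slides do not say which standard deviation SD denotes; this sketch assumes the sample standard deviation with an N - 1 denominator, and the names `Shape`, `centralPowerSum`, and `shapeMeasures` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of (y_k - mean)^power over the data.
double centralPowerSum(const std::vector<double>& y, double mean, int power) {
    double total = 0.0;
    for (double v : y) total += std::pow(v - mean, power);
    return total;
}

struct Shape {
    double skewness;  // sum (Y_k - Ybar)^3 / ((N - 1) * SD^3)
    double kurtosis;  // sum (Y_k - Ybar)^4 / ((N - 1) * SD^4)
};

Shape shapeMeasures(const std::vector<double>& y) {
    assert(y.size() >= 2);
    const double n = static_cast<double>(y.size());
    double sum = 0.0;
    for (double v : y) sum += v;
    const double mean = sum / n;

    // Assumption: SD is the sample standard deviation (N - 1 denominator).
    const double sd = std::sqrt(centralPowerSum(y, mean, 2) / (n - 1.0));
    assert(sd > 0.0);  // undefined when all values are equal

    return {centralPowerSum(y, mean, 3) / ((n - 1.0) * std::pow(sd, 3)),
            centralPowerSum(y, mean, 4) / ((n - 1.0) * std::pow(sd, 4))};
}
```

A symmetric data set such as 1, 2, 3, 4, 5 yields a skewness of zero, while a set with one large value on the right, such as 1, 1, 1, 10, yields a positive skewness. Rescaling the data as aX + b with a > 0 leaves the skewness unchanged.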

Box and Whisker Plot
Five specific values are used:
Median, $Q_2$
First quartile, $Q_1$
Third quartile, $Q_3$
Minimum value in the data set
Maximum value in the data set
Inner fences: $IQR = Q_3 - Q_1$; lower inner fence $= Q_1 - 1.5\,IQR$; upper inner fence $= Q_3 + 1.5\,IQR$
Outer fences: lower outer fence $= Q_1 - 3.0\,IQR$; upper outer fence $= Q_3 + 3.0\,IQR$

Box and Whisker Plot
[Figure: box and whisker plot marking the Minimum, $Q_1$, $Q_2$, $Q_3$, and Maximum.]

Skewness: Box and Whisker Plots, and Coefficient of Skewness
[Figure: box and whisker plots for S < 0 (Negatively Skewed), S = 0 (Symmetric, Not Skewed), and S > 0 (Positively Skewed).]

Pearson Product-Moment Correlation Coefficient
$r = \frac{SSXY}{\sqrt{(SSX)(SSY)}} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}}$
$-1 \leq r \leq 1$

Three Degrees of Correlation
[Figure: three scatter plots illustrating r < 0, r > 0, and r = 0.]

Computation of r for the Economics Example (Part 1)

Day    Interest X    Futures Index Y    X^2        Y^2        XY
1      7.43          221                55.205     48,841     1,642.03
2      7.48          222                55.950     49,284     1,660.56
3      8.00          226                64.000     51,076     1,808.00
4      7.75          225                60.063     50,625     1,743.75
5      7.60          224                57.760     50,176     1,702.40
6      7.63          223                58.217     49,729     1,701.49
7      7.68          223                58.982     49,729     1,712.64
8      7.67          226                58.829     51,076     1,733.42
9      7.59          226                57.608     51,076     1,715.34
10     8.07          235                65.125     55,225     1,896.45
11     8.03          233                64.481     54,289     1,870.99
12     8.00          241                64.000     58,081     1,928.00
Summations  92.93    2,725              720.220    619,207    21,115.07
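The computational form of r needs only five running sums, which is what the columns of the table accumulate. A minimal C++ sketch of that form, with the hypothetical function name `pearsonR`:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson product-moment correlation via the computational formula
// r = SSXY / sqrt(SSX * SSY) (hypothetical helper, not from the slides).
double pearsonR(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    const double ssxy = sxy - sx * sy / n;  // sum XY - (sum X)(sum Y) / n
    const double ssx = sxx - sx * sx / n;   // sum X^2 - (sum X)^2 / n
    const double ssy = syy - sy * sy / n;   // sum Y^2 - (sum Y)^2 / n
    assert(ssx > 0.0 && ssy > 0.0);         // r is undefined for constant data
    return ssxy / std::sqrt(ssx * ssy);
}
```

The computational formula subtracts two large, nearly equal quantities, so for data with a large mean and small spread the deviation form (summing products of deviations from the means) is numerically safer. This sketch keeps the computational form because it mirrors the table.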

Computation of r for the Economics Example (Part 2)
$r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}} = \frac{21{,}115.07 - \frac{(92.93)(2{,}725)}{12}}{\sqrt{\left[720.22 - \frac{(92.93)^2}{12}\right]\left[619{,}207 - \frac{(2{,}725)^2}{12}\right]}} = .815$

Scatter Plot and Correlation Matrix for the Economics Example
[Figure: scatter plot of Futures Index (vertical axis) against Interest (horizontal axis).]

                 Interest    Futures Index
Interest         1           0.815
Futures Index    0.815       1

    // Palindrome Testing Program (1 of 5)
    // Tests for the palindrome property. C-strings vs. strings!
    #include <iostream>
    #include <string>
    #include <cctype>
    using namespace std;

    void swap(char& lhs, char& rhs);
    // swaps char args corresponding to parameters lhs and rhs

    string reverse(const string& str);
    // returns a copy of arg corresponding to parameter
    // str with characters in reverse order.

    string removepunct(const string& src, const string& punct);
    // returns copy of string src with characters
    // in string punct removed

    string makelower(const string& s);
    // returns a copy of parameter s that has all upper case
    // characters forced to lower case, other characters unchanged.
    // Uses <cctype>, which provides tolower

    bool ispal(const string& this_string);
    // uses makelower, removepunct.
    // if this_string is a palindrome,
    //     return true;
    // else
    //     return false;

    // Palindrome Testing Program (2 of 5)
    int main()
    {
        string str;
        cout << "Enter a candidate for palindrome test "
             << "\nfollowed by pressing return.\n";
        getline(cin, str);
        if (ispal(str))
            cout << "\"" << str + "\" is a palindrome ";
        else
            cout << "\"" << str + "\" is not a palindrome ";
        cout << endl;
        return 0;
    }

    void swap(char& lhs, char& rhs)
    {
        char tmp = lhs;
        lhs = rhs;
        rhs = tmp;
    }

    // Palindrome Testing Program (3 of 5)
    string reverse(const string& str)
    {
        int start = 0;
        int end = str.length();
        string tmp(str);
        while (start < end)
        {
            end--;
            swap(tmp[start], tmp[end]);
            start++;
        }
        return tmp;
    }

    // Returns a copy of s that has all upper case characters forced to
    // lower case, other characters unchanged.
    // makelower uses <cctype>, which provides tolower
    string makelower(const string& s)
    {
        string temp(s); // This creates a working copy of s
        for (string::size_type i = 0; i < s.length(); i++)
            // cast through unsigned char: tolower is undefined for negative char values
            temp[i] = static_cast<char>(tolower(static_cast<unsigned char>(s[i])));
        return temp;
    }

    // Palindrome Testing Program (4 of 5)
    // returns a copy of src with characters in punct removed
    string removepunct(const string& src, const string& punct)
    {
        string no_punct;
        int src_len = src.length();
        for (int i = 0; i < src_len; i++)
        {
            string achar = src.substr(i, 1);
            // find location of successive characters of src in punct
            string::size_type location = punct.find(achar, 0);
            if (location == string::npos)
                no_punct = no_punct + achar; // achar not in punct -- keep it
        }
        return no_punct;
    }

    // Palindrome Testing Program (5 of 5)
    // uses functions makelower, removepunct.
    // Returned value:
    // if this_string is a palindrome,
    //     return true;
    // else
    //     return false;
    bool ispal(const string& this_string)
    {
        string punctuation(",;:.?!'\" "); // includes a blank
        string str(this_string);
        str = makelower(str);
        string lowerstr = removepunct(str, punctuation);
        return lowerstr == reverse(lowerstr);
    }