Center and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers.


Chapter 3
Section 3-2: Measures of Center
Section 3-3: Measures of Variation
Section 3-4: Measures of Relative Standing
Section 3-5: Exploratory Data Analysis

Describing Distributions with Numbers

The overall pattern of the distribution of a quantitative variable is described by its shape, center, and spread. By inspecting the histogram we can describe the shape of the distribution, but, as we saw, we can get only a rough estimate of the center and spread. We need a more precise numerical description of the center and spread of the distribution.

Center and Spread

[Figure: distributions with about the same center, one with a wide spread and one with a narrow spread]

Measures of Center and Spread

CENTER: Mean, Median, Mode
SPREAD: Range, Inter-quartile range (IQR), Variance/Standard deviation

Mean: the balance point

Mean ($\bar{x}$, "x-bar"): the sum of the observations divided by the number of observations. If the n observations are $x_1, x_2, x_3, \ldots, x_n$, then

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Example: Mean

$x_1 = 5,\ x_2 = 7,\ x_3 = 3,\ x_4 = 38,\ x_5 = 7$

The mean is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{5} x_i = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{60}{5} = 12 \text{ hours}$$
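The definition of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the variable `hours` are hypothetical, and the data are the five observations from the mean example. The last lines check the "balance point" idea, namely that the deviations from the mean sum to zero.

```python
def mean(observations):
    """Return the arithmetic mean: the sum of the observations divided by n."""
    n = len(observations)
    if n == 0:
        raise ValueError("mean requires at least one observation")
    return sum(observations) / n


# The five observations from the mean example (in hours).
hours = [5, 7, 3, 38, 7]
x_bar = mean(hours)
print(x_bar)  # 12.0

# Balance point: the deviations from the mean add up to zero.
deviations = [x - x_bar for x in hours]
print(sum(deviations))  # 0.0
```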

Median

The median M is the midpoint of the distribution (like the median strip in a road). It is the number such that half of the observations fall above it and half fall below it.

How to find the median?

1. Order the data from smallest to largest.
2. If n is odd, the median M is the center observation in the ordered list. This observation is the one "sitting" in the (n+1)/2 spot in the ordered list. If n is even, the median M is the mean of the two center observations in the ordered list. These two observations are the ones "sitting" in the n/2 and n/2 + 1 spots in the ordered list.

Example 1: Median

$x_1 = 5,\ x_2 = 7,\ x_3 = 3,\ x_4 = 38,\ x_5 = 7$

Step 1: order the data: 3, 5, 7, 7, 38
Step 2: n = 5 is odd, so the median is the observation in spot (5+1)/2 = 3: Median = 7

Example 2: Median

If the data are $x_1 = 5,\ x_2 = 7,\ x_3 = 3,\ x_4 = \ldots$ (n = 4, even):

Step 1: order the data.
Step 2: the median is the mean of the two center observations: Median = (5 + 7)/2 = 6

Example: outlier

Data set A: the median is 68 and the mean is 68.1.
Data set B (data set A with an outlier): the median is still 68, but the mean is pulled away from it by the outlier.

The mean is very sensitive to outliers, while the median is resistant to outliers.
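The two-step recipe for the median can be sketched as follows. This is a minimal illustration, not the slides' own code: the function name `median` is hypothetical, and the four-value data set in the second call is an assumption chosen only to show the even case.

```python
def median(observations):
    """Return the median: order the data, then take the center value(s)."""
    ordered = sorted(observations)          # Step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the center observation, spot (n+1)/2 when counting from 1
        return ordered[mid]
    # n even: mean of the observations in spots n/2 and n/2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 7, 3, 38, 7]))   # 7   (odd n, data from Example 1)
print(median([5, 7, 3, 38]))      # 6.0 (even n, assumed data)

# Resistance: making the largest value far more extreme leaves the median at 7.
print(median([5, 7, 3, 380, 7]))  # 7
```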

Comparing the mean and the median

Mean: describes the center as an average value, where the actual values of the data points play an important role. Sensitive to outliers.
Median: locates the middle value as the center, and the order of the data is the key to finding it. Not sensitive to outliers.

Symmetric distributions with no outliers: Mean ≈ Median
Left-skewed distributions: the tail pulls the mean, so Mean < Median
Right-skewed distributions: the tail pulls the mean, so Median < Mean

Which measure of center to use?

We will therefore use the mean as a measure of center for symmetric distributions with no outliers. Otherwise, the median will be a more appropriate measure of the center of our data.

Another measure of center: Mode

The mode is the most frequent value in the data set.

Data: 2, 4, 4, 4, 5, 5, 6, 7, 8, 10, 15. Mode = 4
Data: 2, 4, 4, 4, 5, 5, 5, 6, 7, 8, 10, 15. Mode = 4, 5
Data: 2, 4, 5, 6, 7, 8, 10, 15. No mode
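The three mode examples can be reproduced with a short sketch. The function name `modes` is hypothetical, and the leading value 2 in each data set is an assumption (the transcription lost that digit). The sketch follows the convention used in the examples: a data set in which every value occurs once has no mode.

```python
from collections import Counter


def modes(observations):
    """Return the most frequent value(s); an empty list means 'no mode'."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once: no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 4, 4, 4, 5, 5, 6, 7, 8, 10, 15]))     # [4]
print(modes([2, 4, 4, 4, 5, 5, 5, 6, 7, 8, 10, 15]))  # [4, 5]
print(modes([2, 4, 5, 6, 7, 8, 10, 15]))              # []
```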

Summary: Measures of Center

The three main numerical measures for the center of a distribution are the mean $\bar{x}$, the median M, and the mode.
The mean is the average value, while the median is the middle value. The mode is the most frequent value.
The mean is very sensitive to outliers, while the median is resistant to outliers.
The mean is an appropriate measure of center only for symmetric distributions with no outliers. In all other cases, the median should be used to describe the center of the distribution.

Spread

Spread: how far from the center the data tend to range. If all the data points are identical, there is no spread at all; numerically, the spread is zero.

Ex.: Center: 5, Spread: 0

Measures of Spread

Range, Inter-quartile range (IQR), Variance / Standard deviation

These measures provide different ways to quantify the variability of the distribution.

Range: Range = max. value − min. value
Inter-quartile range (IQR): the IQR gives the range covered by the MIDDLE 50% of the ordered data.

Example:

Data: 2, 4, 4, 4, 5, 5, 6, 7, 8, 10, 15
Range = max. − min. = 15 − 2 = 13

How to find the IQR?

Step 1: Arrange the data in increasing order.
Step 2: Find the median.
Step 3: Find the median of the lower 50% of the data. This is called the first quartile of the distribution and is denoted by Q1.
Step 4: Repeat this for the top 50% of the data: find the median of the top 50% of the data. This is called the third quartile of the distribution and is denoted by Q3.

IQR

The middle 50% of the data falls between Q1 and Q3, and therefore:

IQR = Q3 − Q1

IQR Example

Weights of 10 students: 102, 118, 120, 136, 138, 149, 157, 157, 161, ...

M = (138 + 149)/2 = 143.5
Q1 = 120, Q3 = 157
IQR = 157 − 120 = 37
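The four steps for the IQR can be sketched as below. The helper names are hypothetical. The quartile convention is the one described in the steps: the median of the lower half and the median of the upper half, with the overall median left out of both halves when n is odd. The weights used here are an assumption about the garbled data: 102 and 120 restore digits lost in transcription, and 180 is taken as the largest value.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(observations):
    """Return (Q1, M, Q3) using the median-of-halves method."""
    ordered = sorted(observations)                     # Step 1
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]                             # lower 50%
    # When n is odd, skip the median itself so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (median_of_sorted(lower),                   # Step 3: Q1
            median_of_sorted(ordered),                 # Step 2: M
            median_of_sorted(upper))                   # Step 4: Q3


# Assumed reconstruction of the weights of 10 students.
weights = [102, 118, 120, 136, 138, 149, 157, 157, 161, 180]
q1, m, q3 = quartiles(weights)
print(q1, m, q3)   # 120 143.5 157
print(q3 - q1)     # 37
```

Statistical packages use several slightly different quartile rules, so a library routine may not reproduce these values exactly.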

Note

The IQR should be used as a measure of spread of a distribution only when the median is used as a measure of center.

Using the IQR to detect outliers

The 1.5(IQR) Criterion for Outliers

An observation is considered a suspected outlier if it is below Q1 − 1.5(IQR) or above Q3 + 1.5(IQR).

Example 1

Weights of 10 students: 102, 118, 120, 136, 138, 149, 157, 157, 161, ...

M = (138 + 149)/2 = 143.5
IQR = 157 − 120 = 37
Q3 + 1.5(IQR) = 157 + 1.5(37) = 212.5

Anything above 212.5? YES. The largest weight IS an outlier.

Example 2

Data: -15, 8, 9, 12, 14, 19, 22, 23, 23, 45, 50

M = 19, Q1 = 9, Q3 = 23
IQR = 23 − 9 = 14
1.5(IQR) = 1.5(14) = 21

Anything below Q1 − 1.5(IQR) = 9 − 21 = −12? YES! (−15 is an outlier.)
Anything above Q3 + 1.5(IQR) = 23 + 21 = 44? YES! (45 and 50 are outliers.)

Five-number summary

To get a quick summary of both center and spread, we consider these five numbers:

Minimum value
Q1
Median
Q3
Maximum value
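The 1.5(IQR) criterion and the five-number summary can be combined in one short sketch. The function names are hypothetical, the quartiles are computed as medians of the lower and upper halves (excluding the overall median when n is odd), and the data are those of Example 2 with the lost digits restored as 12, 22, 23, 23, which is an assumption that is consistent with the quartiles and fences worked out in the example.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(observations):
    """Return (min, Q1, median, Q3, max)."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median_of_sorted(lower), median_of_sorted(ordered),
            median_of_sorted(upper), ordered[-1])


def suspected_outliers(observations):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(observations)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in observations if x < low_fence or x > high_fence]


data = [-15, 8, 9, 12, 14, 19, 22, 23, 23, 45, 50]
print(five_number_summary(data))  # (-15, 9, 19, 23, 50)
print(suspected_outliers(data))   # [-15, 45, 50]
```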

Boxplot

John Tukey invented another kind of display to show off the five-number summary. It's called a boxplot.

Example 1

Weights of 10 students: 102, 118, 120, 136, 138, 149, 157, 157, 161, 180

Min = 102, Q1 = 120, M = (138 + 149)/2 = 143.5, Q3 = 157, Max = 180

Example 2: Outlier

Weights of 10 students: 102, 118, 120, 136, 138, 149, 157, 157, 161, ...

Min = 102, Q1 = 120, M = (138 + 149)/2 = 143.5, Q3 = 157. The outlier is plotted separately as *, and the whisker ends at the new max, 161.

Example

[Figure: boxplot of years until death, Disease X, with its five-number summary]

Largest = max = 6.1
Q3 = third quartile = 4.35
M = median = 3.4
Q1 = first quartile = ...
Smallest = min = 0.6

Comparing distributions

[Figure: side-by-side boxplots, one with low variability and one with high variability]

Boxplots are best used for side-by-side comparison of more than one distribution.

Summary

Measures of the center of distributions: Mean, Median, Mode
Measures of spread of distributions: Range, IQR (using the IQR to detect outliers: the 1.5(IQR) rule; boxplots), Variance/Standard deviation

Variance and Standard Deviation: The idea

The standard deviation gives the average (or typical) distance between a data point and the mean, $\bar{x}$.

The formulas

We have n observations: $x_1, x_2, \ldots, x_n$

Variance:

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

Standard deviation:

$$s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$$

Example: Video Store Customers

The following are the numbers of customers who entered a video store in 8 consecutive hours: 7, 9, 5, 13, 3, 11, 15, 9

Hour:            1st  2nd  3rd  4th  5th  6th  7th  8th
# of customers:    7    9    5   13    3   11   15    9

Find the mean and the standard deviation of the distribution.

Dotplot and the Mean

$$\bar{x} = \frac{7 + 9 + 5 + 13 + 3 + 11 + 15 + 9}{8} = \frac{72}{8} = 9$$

Mean = 9

Standard deviation: Steps 1, 2, and 3

Observations x_i    Deviations x_i − x̄    Squared deviations (x_i − x̄)²
 7                   7 − 9 = −2            (−2)² = 4
 9                   9 − 9 = 0             (0)² = 0
 5                   5 − 9 = −4            (−4)² = 16
13                  13 − 9 = 4             (4)² = 16
 3                   3 − 9 = −6            (−6)² = 36
11                  11 − 9 = 2             (2)² = 4
15                  15 − 9 = 6             (6)² = 36
 9                   9 − 9 = 0             (0)² = 0
Mean: 9             Sum = 0 (always!)      Sum = 112

Variance s² and standard deviation s: Steps 4 and 5

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} = \frac{112}{8 - 1} = 16$$

$$s = \sqrt{\text{Variance}} = \sqrt{16} = 4$$

The typical distance from the mean is 4.

FAQ about the Standard Deviation

1. Why do we need to square the deviations? Because the sum of the deviations from the mean is ALWAYS 0!
2. Why do we divide by n−1 and not by n? Because we know (question 1) that the sum of the deviations is always 0, so that knowing n−1 of them determines the last one. Only n−1 of the squared deviations can vary freely. The number n−1 is called the degrees of freedom.
3. Why do we take the square root? s² = 16 is an average of the squared deviations, and therefore has different units of measurement. In this case 16 is measured in "squared customers", which obviously cannot be interpreted. We therefore take the square root in order to go back to the original units of measurement.

Facts about the Standard Deviation (s)

s measures the spread about the mean and should be used only when the mean is chosen as the measure of center, that is, when the distribution of the data is roughly symmetric with no outliers.
s is always zero or greater than 0.
s = 0 only when there is no spread, i.e., the data values are identical.
s gets larger as the spread increases.
s has the same units of measurement as the original observations.
Like the mean, s is not resistant. It is very sensitive to outliers.

Calculator (TI-83, TI-84)

Steps:

1. Enter your data into a list: STAT > EDIT > 1: Edit. Enter your data into L1.
2. Find the mean, median, standard deviation, and five-number summary: STAT > CALC > 1: 1-Var Stats. You see "1-Var Stats" in your window; complete it as "1-Var Stats L1" (L1 is entered with 2nd 1).
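The five steps of the video store example can be sketched in Python. The function names are hypothetical; the data are the eight hourly customer counts from the example. The sketch divides by n − 1, as in the formulas above.

```python
import math


def sample_variance(observations):
    """Variance s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    if n < 2:
        raise ValueError("variance requires at least two observations")
    x_bar = sum(observations) / n                        # Step 1: the mean
    deviations = [x - x_bar for x in observations]       # Step 2: deviations
    squared = [d ** 2 for d in deviations]               # Step 3: squared deviations
    return sum(squared) / (n - 1)                        # Step 4: divide by n - 1


def sample_std(observations):
    """Standard deviation s: the square root of the variance (Step 5)."""
    return math.sqrt(sample_variance(observations))


customers = [7, 9, 5, 13, 3, 11, 15, 9]
print(sample_variance(customers))  # 16.0
print(sample_std(customers))       # 4.0
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which also use the n − 1 divisor and can serve as a cross-check.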

Try it: enter a data set of n = 9 values and read the 1-Var Stats output:

x̄: mean
Σx: sum of the observations
Sx: standard deviation
n: number of entries
minX, Q1, Med, Q3, maxX: five-number summary

Measures of Relative Standing

We can compare values from different data sets using z-scores:

$$z = \frac{x - \text{mean}}{\text{s.d.}}$$

A z-score measures the number of standard deviations that a data value x is from the mean.

Ordinary values: −2 ≤ z-score ≤ 2
Unusual values: z-score < −2 or z-score > 2

Example

IQ scores have a mean of 100 and a standard deviation of 16. Albert Einstein reportedly had an IQ of 160. Is Einstein's IQ score unusual?

$$z = \frac{x - \text{mean}}{\text{s.d.}} = \frac{160 - 100}{16} = 3.75$$

Since the z-score is higher than 2, we can conclude that Einstein's IQ score is unusual.

Median

Find the median of the following 9 numbers: ...
a) 65
b) 64
c) 67
d) ...

Median

For the data in the previous question, suppose that the last data point is actually 115 instead of 85. What effect would this new maximum have on our value for the median of the dataset?
a) Increase the value of the median.
b) Decrease the value of the median.
c) Not change the value of the median.

Mean

For the data in the previous question, suppose that the last data point is actually 115 instead of 85. What effect would this new maximum have on our value for the mean of the dataset?
a) Increase the value of the mean.
b) Decrease the value of the mean.
c) Not change the value of the mean.
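The z-score rule from the Measures of Relative Standing slides can be sketched as follows. The function names `z_score` and `is_unusual` are hypothetical; the cutoff of 2 standard deviations is the one stated with the Einstein example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def is_unusual(z, cutoff=2.0):
    """A value is 'unusual' when its z-score is below -cutoff or above +cutoff."""
    return abs(z) > cutoff


z = z_score(160, mean=100, sd=16)  # Einstein's reported IQ
print(z)              # 3.75
print(is_unusual(z))  # True
```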

Mean vs. median

For the dataset "volumes of milk dispensed into 2-gallon milk cartons," should you use the mean or the median to describe the center?
a) Mean
b) Median

Mean vs. median

For the dataset "sales prices of homes in Los Angeles," should you use the mean or the median to describe the center?
a) Mean
b) Median

Mean vs. median

For the dataset "incomes for people in the United States," should you use the mean or the median to describe the center?
a) Mean
b) Median

Boxplots

You have a boxplot for the tar content of 5 different cigarettes. What is a plausible set of values for the five-number summary?
a) Min = 13, Q1 = 10, Median = 12.6, Q3 = 14, Max = 15
b) Min = 1, Q1 = 8.5, Median = 12.6, Q3 = 15, Max = 17
c) Min = 1, Q1 = 8.5, Median = 11.5, Q3 = 13, Max = 15
d) Min = 8.5, Q1 = 10, Median = 11.5, Q3 = 15, Max = ...

Boxplots

The shape of the boxplot below can be described as:
a) Bi-modal
b) Left-skewed
c) Right-skewed
d) Symmetric
e) Uniform

Side-by-side boxplots

Look at the following side-by-side boxplots and compare the female and male shoulder girth.
a) Females have a typically smaller shoulder girth than males.
b) Females have a typically larger shoulder girth than males.
c) Females and males have about the same shoulder girths.

Side-by-side boxplots

Look at the following side-by-side boxplots and compare the female and male thigh girth.
a) Females have a typically smaller thigh girth than males.
b) Females have a typically larger thigh girth than males.
c) Females and males have about the same thigh girth.

Comparing two histograms

Compare the centers of Distr. A (Female Shoulder Girth) and Distr. B (Male Shoulder Girth) shown below.
a) The center of Distr. A is greater than the center of Distr. B.
b) The center of Distr. A is less than the center of Distr. B.
c) The center of Distr. A is equal to the center of Distr. B.

Comparing two histograms

Compare the spreads of Distr. A (Female Shoulder Girth) and Distr. B (Male Shoulder Girth) shown below.
a) The spread of Distr. A is greater than the spread of Distr. B.
b) The spread of Distr. A is less than the spread of Distr. B.
c) The spread of Distr. A is equal to the spread of Distr. B.

Boxplots

What is the approximate range of the Male Wrist Girth dataset shown below?
a) 14.5 to 19.5
b) 16.5 to 17
c) 16.5 to 18
d) 17 to 19.5
e) 14.5 to 16.5 and 18 to 19.5

Boxplots

What is the approximate interquartile range of the Male Wrist Girth dataset shown below?
a) 14.5 to 19.5
b) 16.5 to 17
c) 16.5 to 18
d) 17 to 19.5
e) 14.5 to 16.5 and 18 to 19.5

Outliers

If a dataset contains outliers, which measure of spread is resistant?
a) Range
b) Interquartile range
c) Standard deviation
d) Variance

Standard deviation

Which of the following statements is TRUE?
a) Standard deviation has no unit of measurement.
b) Standard deviation is either positive or negative.
c) Standard deviation is inflated by outliers.
d) Standard deviation is used even when the mean is not an appropriate measure of center.

Center and spread

For the following distribution of major league baseball players' salaries in 1992, which measures of center and spread are more appropriate?
a) Mean and standard deviation
b) Median and interquartile range

Section3-2: Measures of Center

Section3-2: Measures of Center Chapter 3 Section3-: Measures of Center Notation Suppose we are making a series of observations, n of them, to be exact. Then we write x 1, x, x 3,K, x n as the values we observe. Thus n is the total number

More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s).

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). We will look the three common and useful measures of spread. The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). 1 Ameasure of the center

More information

Describing Data: One Quantitative Variable

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive

More information

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Mean, Median and Mode Copyright 2016, 2012, 2008 Pearson Education, Inc.

More information

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean Measure of Center Measures of Center The value at the center or middle of a data set 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) 1 2 Mean Notation The measure of center obtained by adding the values

More information

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

appstats5.notebook September 07, 2016 Chapter 5

appstats5.notebook September 07, 2016 Chapter 5 Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.

More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I ll announce this tomorrow

More information

2 Exploring Univariate Data

2 Exploring Univariate Data 2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting

More information

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows

More information

Measures of Variation. Section 2-5. Dotplots of Waiting Times. Waiting Times of Bank Customers at Different Banks in minutes. Bank of Providence

Measures of Variation. Section 2-5. Dotplots of Waiting Times. Waiting Times of Bank Customers at Different Banks in minutes. Bank of Providence Measures of Variation Section -5 1 Waiting Times of Bank Customers at Different Banks in minutes Jefferson Valley Bank 6.5 6.6 6.7 6.8 7.1 7.3 7.4 Bank of Providence 4. 5.4 5.8 6. 6.7 8.5 9.3 10.0 Mean

More information

Description of Data I

Description of Data I Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25 Handout 4 numerical descriptive measures part Calculating Mean for Grouped Data mf Mean for population data: µ mf Mean for sample data: x n where m is the midpoint and f is the frequency of a class. Example

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

Ti 83/84. Descriptive Statistics for a List of Numbers

Ti 83/84. Descriptive Statistics for a List of Numbers Ti 83/84 Descriptive Statistics for a List of Numbers Quiz scores in a (fictitious) class were 10.5, 13.5, 8, 12, 11.3, 9, 9.5, 5, 15, 2.5, 10.5, 7, 11.5, 10, and 10.5. It s hard to get much of a sense

More information

Lecture Week 4 Inspecting Data: Distributions

Lecture Week 4 Inspecting Data: Distributions Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013 2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your

More information

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data.

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data. -3: Measure of Central Tendency Chapter : Descriptive Statistics The value at the center or middle of a data set. It is a tool for analyzing data. Part 1: Basic concepts of Measures of Center Ex. Data

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

Unit 2 Statistics of One Variable

Unit 2 Statistics of One Variable Unit 2 Statistics of One Variable Day 6 Summarizing Quantitative Data Summarizing Quantitative Data We have discussed how to display quantitative data in a histogram It is useful to be able to describe

More information

Putting Things Together Part 1

Putting Things Together Part 1 Putting Things Together Part 1 These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for 1, 5, and 6 are in

More information

Empirical Rule (P148)

Empirical Rule (P148) Interpreting the Standard Deviation Numerical Descriptive Measures for Quantitative data III Dr. Tom Ilvento FREC 408 We can use the standard deviation to express the proportion of cases that might fall

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

Descriptive Statistics

Descriptive Statistics Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs

More information

SOLUTIONS TO THE LAB 1 ASSIGNMENT

SOLUTIONS TO THE LAB 1 ASSIGNMENT SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73

More information

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class

More information

Standard Deviation. Lecture 18 Section Robb T. Koether. Hampden-Sydney College. Mon, Sep 26, 2011

Standard Deviation. Lecture 18 Section Robb T. Koether. Hampden-Sydney College. Mon, Sep 26, 2011 Standard Deviation Lecture 18 Section 5.3.4 Robb T. Koether Hampden-Sydney College Mon, Sep 26, 2011 Robb T. Koether (Hampden-Sydney College) Standard Deviation Mon, Sep 26, 2011 1 / 42 Outline 1 Variability

More information

1.2 Describing Distributions with Numbers, Continued

1.2 Describing Distributions with Numbers, Continued 1.2 Describing Distributions with Numbers, Continued Ulrich Hoensch Thursday, September 6, 2012 Interquartile Range and 1.5 IQR Rule for Outliers The interquartile range IQR is the distance between the

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations

More information

Numerical Measurements

Numerical Measurements El-Shorouk Academy Acad. Year : 2013 / 2014 Higher Institute for Computer & Information Technology Term : Second Year : Second Department of Computer Science Statistics & Probabilities Section # 3 umerical

More information

Chapter 3: Probability Distributions and Statistics

Chapter 3: Probability Distributions and Statistics Chapter 3: Probability Distributions and Statistics Section 3.-3.3 3. Random Variables and Histograms A is a rule that assigns precisely one real number to each outcome of an experiment. We usually denote

More information

Measure of Variation

Measure of Variation Measure of Variation Variation is the spread of a data set. The simplest measure is the range. Range the difference between the maximum and minimum data entries in the set. To find the range, the data

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

Numerical Descriptions of Data

Numerical Descriptions of Data Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =

More information

22.2 Shape, Center, and Spread

22.2 Shape, Center, and Spread Name Class Date 22.2 Shape, Center, and Spread Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution? Eplore

More information

STAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model

STAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good

More information

Edexcel past paper questions

Edexcel past paper questions Edexcel past paper questions Statistics 1 Chapters 2-4 (Discrete) Statistics 1 Chapters 2-4 (Discrete) Page 1 Stem and leaf diagram Stem-and-leaf diagrams are used to represent data in its original form.

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

STAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model

STAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good

More information

Math Take Home Quiz on Chapter 2

Math Take Home Quiz on Chapter 2 Math 116 - Take Home Quiz on Chapter 2 Show the calculations that lead to the answer. Due date: Tuesday June 6th Name Time your class meets Provide an appropriate response. 1) A newspaper surveyed its

More information

4. DESCRIPTIVE STATISTICS

4. DESCRIPTIVE STATISTICS 4. DESCRIPTIVE STATISTICS Descriptive Statistics is a body of techniques for summarizing and presenting the essential information in a data set. Eg: Here are daily high temperatures for Jan 16, 2009 in

More information

Averages and Variability. Aplia (week 3 Measures of Central Tendency) Measures of central tendency (averages)

Averages and Variability. Aplia (week 3 Measures of Central Tendency) Measures of central tendency (averages) Chapter 4 Averages and Variability Aplia (week 3 Measures of Central Tendency) Chapter 5 (omit 5.2, 5.6, 5.8, 5.9) Aplia (week 4 Measures of Variability) Measures of central tendency (averages) Measures

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Stemplots (or Stem-and-leaf plots) Stemplot and Boxplot T -- leading digits are called stems T -- final digits are called leaves STAT 74 Descriptive Statistics 2 Example: (number

More information

Mini-Lecture 3.1 Measures of Central Tendency

Mini-Lecture 3.1 Measures of Central Tendency Mini-Lecture 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data 3. Explain what it means for a

More information

NOTES: Chapter 4 Describing Data

NOTES: Chapter 4 Describing Data NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three

More information

Lecture 18 Section Mon, Feb 16, 2009

Lecture 18 Section Mon, Feb 16, 2009 The s the Lecture 18 Section 5.3.4 Hampden-Sydney College Mon, Feb 16, 2009 Outline The s the 1 2 3 The 4 s 5 the 6 The s the Exercise 5.12, page 333. The five-number summary for the distribution of income

More information

Math146 - Chapter 3 Handouts. The Greek Alphabet.

The Greek Alphabet. Source: www.mathwords.com. Some Miscellaneous Tips on Calculations. Examples: Round to the nearest thousandth 0.92431 0.75693 CAUTION! Do not truncate numbers! Example: 1

Lecture 18 Section Mon, Sep 29, 2008

Lecture 18, Section 5.3.4, Hampden-Sydney College, Mon, Sep 29, 2008. Exercise 5.12, page 333. The five-number summary for the distribution of income

Math 140 Introductory Statistics. First midterm September

First midterm September 23 2010. Box Plots: Graphical display of 5 number summary Q1, Q2 (median), Q3, max, min. Outliers: If a value is more than 1.5 times the IQR from the

1) What is the range of the data shown in the box and whisker plot? 2) True or False: 75% of the data falls between 6 and 12.

DO NOW. May 21 7:19 AM 1. 3 2. 2 3. 1 4. 4 5. 2 6. Min = 17, Q1 = 18.5, Med

Categorical. A general name for non-numerical data; the data is separated into categories of some kind.

Chapter 5. Nominal data: Categorical data with no implied order. E.g. eye colours, favourite TV show,

Putting Things Together Part 2

These exercises blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in

9/17/2015. Basic Statistics for the Healthcare Professional. Relax... it won't be that bad! Purpose of Statistic. Objectives

Frank Cohen, MBB, MPA, Director of Analytics, Doctors Management, LLC. Purpose of Statistic: Provide a numerical

Today's plan: Section 4.1.4: Dispersion: Five-Number summary and Standard Deviation.

Once we know the central location of a data set, we want to know how close things are to the center.

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar

11.220 Lecture 5. Today's Content: Wrap-up from yesterday; Frequency Distributions; The Mean, Median and Mode; Levels of Measurement and Measures of Central

Measures of Dispersion (Range, standard deviation, standard error) Introduction

We have already learnt that a frequency distribution table gives a rough idea of the distribution of the variables in a sample

Some Characteristics of Data

Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data

Mark Lunt, Arthritis Research UK Epidemiology Unit, University of Manchester. Today we will consider: Different types of data; Appropriate ways to summarise these data. 17/10/2017

Chapter 3 Descriptive Statistics: Numerical Measures Part A

Slides prepared by John S. Loucks, St. Edward's University. Measures of Location; Measures of Variability. Measures of Location: Mean

Computing Statistics ID1050 Quantitative & Qualitative Reasoning

Single-variable Statistics: We will be considering six statistics of a data set. Three measures of the middle: Mean, median, and mode. Two measures

Numerical Descriptive Measures. Measures of Center: Mean and Median

Steve Sawin, Statistics. Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution are where

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.

For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of

Section Distributions of Random Variables

Section 8.1 - Distributions of Random Variables. Definition: A random variable is a rule that assigns a number to each outcome of an experiment. Example 1: Suppose we toss a coin three times. Then we could

Statistics vs. statistics

Question: What is Statistics (with a capital S)? Definition: Statistics is the science of collecting, organizing, summarizing and interpreting data. Note: There are 2 main ways

Statistics I Chapter 2: Analysis of univariate data

Numerical summary. Central tendency: mean, median. Location: quartiles, percentiles. Spread: range, interquartile range. Form: coeff. asymmetry, coeff. kurtosis

Percentiles, STATA, Box Plots, Standardizing, and Other Transformations

Lecture 3. Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go

We will also use this topic to help you see how the standard deviation might be useful for distributions which are normally distributed.

We will discuss the normal distribution in greater detail in our unit on probability. However, as it is often of use to use exploratory data analysis to determine if the sample seems reasonably normally

FINALS REVIEW BELL RINGER. Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/ /2 4

FINALS REVIEW BELL RINGER. Simplify the following expressions without using your calculator. 1) 6 - 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/3 + 7 + 1/2 4 4) 3 + 4 ( 7) + 3 + 4 ( 2) 1) 36/6 - 4/6 + 3/6 = 32/6 + 3/6 = 35/6

Math 120 Introduction to Statistics Mr. Toner's Lecture Notes. Standardizing normal distributions The Standard Normal Curve

6.1 The Standard Normal Curve. 6.2 Standardizing normal distributions. The "bell-shaped" curve, or normal curve, is a probability distribution that describes many real-life situations. Basic Properties 1.

Normal Model (Part 1)

Formulas. New Vocabulary. The Standard Deviation as a Ruler: The trick in comparing very different-looking values is to use standard deviations as our rulers. The standard deviation

DATA HANDLING Five-Number Summary

The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest

Variance, Standard Deviation Counting Techniques

Section 1.3 & 2.1. Cathy Poliak, Ph.D., cathy@math.uh.edu, Department of Mathematics, University of Houston. Outline: 1 Quartiles 2 The 1.5IQR Rule 3 Understanding

Week 7. Texas A&M University. Department of Mathematics Texas A&M University, College Station Section 3.2, 3.3 and 3.4

Oğuz Gezmiş (TAMU), Topics in Contemporary Mathematics II, Week 7

Descriptive Analysis

HERTANTO WAHYU SUBAGIO. Univariate Analysis: Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable

Chapter 6. Standardizing with z-scores.

Starter Ch. 6: A z-score Analysis. Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You're all set to drop

Name PID Section # (enrolled)

STT 200 - Lecture 2, Instructor: Aylin ALIN, 02/19/2014, Midterm #1 A. * The exam is closed book and 80 minutes. * You may use a calculator and the formula sheet that you brought

DESCRIPTIVE STATISTICS II. Sorana D. Bolboacă

OUTLINE: Measures of centrality; Measures of spread; Measures of symmetry; Measures of localization. Mainly applied on quantitative variables

Distributions and their Characteristics

1. A distribution of a variable is merely a list of all values the variable can take, and the corresponding frequencies. Distributions may be represented or displayed

Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12)

Descriptive statistics: - Measures of centrality (Mean, median, mode, trimmed mean) - Measures of spread (MAD, Standard deviation, variance) -

Chapter 4 Variability

PowerPoint Lecture Slides, Essentials of Statistics for the Behavioral Sciences, Seventh Edition, by Frederick J Gravetter and Larry B. Wallnau. Chapter 4 Learning Outcomes

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Lecture - 05: Normal Distribution. So far we have looked at discrete distributions

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics

Eric Zivot, Summer 2015. Eric Zivot (Copyright 2015). Outline

Simple Descriptive Statistics

These are ways to summarize a data set quickly and accurately. The most common way of describing a variable distribution is in terms of two of its properties: Central tendency

Test Bank Elementary Statistics 2nd Edition William Navidi

Completed downloadable package TEST BANK for Elementary Statistics 2nd Edition by William Navidi, Barry Monk: https://testbankreal.com/download/elementary-statistics-2nd-edition-test-banknavidi-monk/

2CORE. Summarising numerical data: the median, range, IQR and box plots

How can we describe a distribution with just one or two statistics? What is the median, how is it calculated and what

MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda,

CONTENTS: What is measures of dispersion? Why measures of dispersion? How measures of dispersions are calculated? Range Quartile

ECON 214 Elements of Statistics for Economists

Session 3: Presentation of Data: Numerical Summary Measures Part 2. Lecturer: Dr. Bernardin Senadza, Dept. of Economics. Contact Information: bsenadza@ug.edu.gh

How Wealthy Are Europeans?

Grades: 7, 8, 11, 12 (course specific). Description: Organization of data to examine measures of spread and measures of central tendency in examination of Gross Domestic Product

Descriptive Statistics

Chapter 3. Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences. STAB22H3 Statistics I Duration: 1 hour and 45 minutes

Last Name: First Name: Student number: Aids allowed: - One handwritten

The Not-So-Geeky World of Statistics

February 3-5, 2015 / The Hilton New York. Chris Emerson, Chris Sweet (a/k/a Chris 2). Who We Are: Chris Sweet, JPMorgan Chase, VP, Outside Counsel & Engagement Management

NCSS Statistical Software. Reference Intervals

Chapter 586. Introduction: A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and