The Standard Deviation as a Ruler and the Normal Model. Copyright 2009 Pearson Education, Inc.


The trick in comparing very different-looking values is to use standard deviations as our rulers. The standard deviation tells us how the whole collection of values varies, so it's a natural ruler for comparing an individual to a group. As the most common measure of variation, the standard deviation plays a crucial role in how we look at data.

We compare individual data values to their mean, relative to their standard deviation, using the following formula: $z = \frac{y - \bar{y}}{s}$. We call the resulting values standardized values, denoted as z. They can also be called z-scores.

a) Alex's score on a test was 84 points. The class average was 78 and the standard deviation was 6 points. What was her z-score? b) The average score on a psych test was 65 points with a standard deviation of 15 points. Joe's z-score was 2. How many points did he score?
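Both questions use the same formula, once forward and once inverted. A minimal sketch (the function names `z_score` and `value_from_z` are illustrative, not from the slides):

```python
def z_score(y, mean, sd):
    """Number of standard deviations that y lies from the mean."""
    return (y - mean) / sd


def value_from_z(z, mean, sd):
    """Invert the z-score formula: y = mean + z * sd."""
    return mean + z * sd


alex_z = z_score(84, mean=78, sd=6)            # (84 - 78) / 6
joe_points = value_from_z(2, mean=65, sd=15)   # 65 + 2 * 15
print(alex_z, joe_points)
```

Alex's score is one standard deviation above the class mean, and a z-score of 2 puts Joe two standard deviations (30 points) above his.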

z-scores measure the distance of each data value from the mean in standard deviations. A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.

Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean. Thus, we can compare values that are measured on different scales, with different units, or from different populations.

Shifting data: Adding (or subtracting) a constant to every data value adds (or subtracts) the same constant to measures of position. Adding (or subtracting) a constant to each value will increase (or decrease) measures of position (center, percentiles, max, or min) by the same constant. The distribution's shape and spread (range, IQR, standard deviation) remain unchanged.

Rescaling data: When we multiply (or divide) all the data values by any constant, all measures of position (such as the mean, median, and percentiles) and measures of spread (such as the range, the IQR, and the standard deviation) are multiplied (or divided) by that same constant.
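The effect of shifting and rescaling can be checked numerically. A minimal sketch using Python's standard `statistics` module; the data values, the shift of 10, and the factor of 3 are hypothetical, chosen only for illustration:

```python
import statistics as st

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]  # hypothetical values


def summary(values):
    """Return a few measures of position and spread."""
    q1, _, q3 = st.quantiles(values, n=4)
    return {
        "mean": st.mean(values),
        "median": st.median(values),
        "sd": st.stdev(values),
        "iqr": q3 - q1,
    }


shifted = [y + 10 for y in data]   # add a constant to every value
rescaled = [y * 3 for y in data]   # multiply every value by a constant

for name, values in [("original", data), ("shifted", shifted), ("rescaled", rescaled)]:
    print(name, summary(values))
```

In the shifted data the mean and median move by 10 while the SD and IQR stay put; in the rescaled data all four summaries are multiplied by 3.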

The men's weight data set measured weights in kilograms. If we want to think about these weights in pounds, we would rescale the data:

A company selling books on the Internet reports that the packages it ships have a median weight of 60 ounces and an IQR of 36 ounces. a) The company plans to include a catalog weighing 8 ounces in each package. What will the new median and IQR be? b) If the company records the shipping weights of the packages with the sales flyers included in pounds instead of ounces, what would the median and IQR be?
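One way to work through this exercise is to apply the shifting and rescaling rules directly to the two summaries. The sketch below assumes that part b) refers to the packages with the 8-ounce insert already added, and uses the conversion of 16 ounces per pound:

```python
OUNCES_PER_POUND = 16

median_oz, iqr_oz = 60, 36   # reported summaries, in ounces
catalog_oz = 8               # weight added to every package

# a) Shifting: the median moves by the constant, the IQR does not change.
new_median_oz = median_oz + catalog_oz
new_iqr_oz = iqr_oz

# b) Rescaling: both position and spread are divided by the same constant.
median_lb = new_median_oz / OUNCES_PER_POUND
iqr_lb = new_iqr_oz / OUNCES_PER_POUND

print(new_median_oz, new_iqr_oz, median_lb, iqr_lb)
```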

Standardizing data into z-scores shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation. Standardizing into z-scores does not change the shape of the distribution. Standardizing into z-scores changes the center by making the mean 0. Standardizing into z-scores changes the spread by making the standard deviation 1.
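These two properties are easy to confirm on any data set. A minimal sketch with hypothetical values:

```python
import statistics as st

data = [61.0, 64.5, 66.0, 70.0, 71.5, 75.0, 82.0]  # hypothetical values

mean = st.mean(data)
sd = st.stdev(data)

# Shift by the mean, then rescale by the standard deviation.
z = [(y - mean) / sd for y in data]

z_mean = st.mean(z)
z_sd = st.stdev(z)
print(z_mean, z_sd)  # mean is 0 and SD is 1, up to floating-point rounding
```

The order of the z-scores matches the order of the original values, which is why the shape of the distribution is unchanged.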

Susan and Phil took math exams in separate classes. Susan scored 82 on her exam. Overall, the student scores for her class had a mean of 79 and a standard deviation of 5. Phil scored 79 on his exam. Overall, the student scores in his class had a mean of 70 and a standard deviation of 12. Which student's overall performance was better?
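Comparing the two students means putting both scores on the same ruler. A short sketch of that comparison (variable names are illustrative):

```python
def z_score(y, mean, sd):
    """Distance of y from the mean, in standard deviations."""
    return (y - mean) / sd


susan_z = z_score(82, mean=79, sd=5)    # 3 / 5
phil_z = z_score(79, mean=70, sd=12)    # 9 / 12

better = "Phil" if phil_z > susan_z else "Susan"
print(round(susan_z, 2), round(phil_z, 2), better)
```

Although Susan's raw score is higher, Phil's score sits farther above his class mean in standard deviation units.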

A z-score gives us an indication of how unusual a value is because it tells us how far it is from the mean. Remember that a negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean. The larger a z-score is (negative or positive), the more unusual it is.

There is no universal standard for z-scores, but there is a model that shows up over and over in Statistics. This model is called the Normal model (you may have heard of "bell-shaped curves"). Normal models are appropriate for distributions whose shapes are unimodal and roughly symmetric. These distributions provide a measure of how extreme a z-score is.

The mean weight of a breed of dog is 45 pounds. Suppose that the weights of all such animals can be described by a Normal model with a standard deviation of 5 pounds. How many standard deviations from the mean would a dog weighing 38 pounds be? Which would be more unusual, a dog weighing 38 pounds or a dog weighing 55 pounds?

The mean weight of a watermelon is 68 ounces with a standard deviation of 15 ounces. Suppose that the weights of all such fruit can be described with a Normal model. Watermelon buyers hope that watermelons will weigh at least 50 ounces. To see how much over or under that goal the watermelons are, we could subtract 50 ounces from all the weights. What would the new mean and standard deviation be? Suppose each watermelon sells for 65 cents per ounce. Find the mean and standard deviation for the sale price of all the watermelons.
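This exercise combines a shift (subtracting the 50-ounce goal) with a rescale (multiplying by the price per ounce). A sketch of the arithmetic, keeping prices in whole cents to avoid rounding issues:

```python
mean_oz, sd_oz = 68, 15
goal_oz = 50
cents_per_oz = 65

# Shift: subtracting a constant moves the mean but leaves the SD alone.
mean_over_goal = mean_oz - goal_oz
sd_over_goal = sd_oz

# Rescale: multiplying by a constant multiplies both the mean and the SD.
mean_price_cents = cents_per_oz * mean_oz
sd_price_cents = cents_per_oz * sd_oz

print(mean_over_goal, sd_over_goal)
print(mean_price_cents / 100, sd_price_cents / 100)  # in dollars
```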

There is a Normal model for every possible combination of mean and standard deviation. We write N(μ,σ) to represent a Normal model with a mean of μ and a standard deviation of σ. We use Greek letters because this mean and standard deviation do not come from data; they are numbers (called parameters) that specify the model.

Summaries of data, like the sample mean and standard deviation, are written with Latin letters. Such summaries of data are called statistics. When we standardize Normal data, we still call the standardized value a z-score, and we write $z = \frac{y - \mu}{\sigma}$.

Once we have standardized, we need only one model: The N(0,1) model is called the standard Normal model (or the standard Normal distribution). Be careful: don't use a Normal model for just any data set, since standardizing does not change the shape of the distribution.

When we use the Normal model, we are assuming the distribution is Normal. We cannot check this assumption in practice, so we check the following condition: Nearly Normal Condition: The shape of the data's distribution is unimodal and symmetric. This condition can be checked by making a histogram or a Normal probability plot (to be explained later).

Normal models give us an idea of how extreme a value is by telling us how likely it is to find one that far from the mean. We will see how to find these numbers precisely, but until then we will use a simple rule that tells us a lot about the Normal model.

It turns out that in a Normal model: about 68% of the values fall within one standard deviation of the mean; about 95% of the values fall within two standard deviations of the mean; and, about 99.7% (almost all!) of the values fall within three standard deviations of the mean.
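These percentages can be verified numerically. For a Normal model, the proportion within k standard deviations of the mean equals erf(k/√2), where erf is the error function. A minimal sketch using Python's `math.erf` (the helper name `within_k_sd` is illustrative):

```python
import math


def within_k_sd(k):
    """Proportion of a Normal model within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 2), "%")
```

The computed values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68, 95, and 99.7.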

The following shows what the 68-95-99.7 Rule tells us:

Make a picture. Make a picture. Make a picture. And, when we have data, make a histogram to check the Nearly Normal Condition to make sure we can use the Normal model to model the distribution.

The EPA fuel economy estimates for automobile models tested recently predicted a mean of 24.8 mpg and a standard deviation of 6.2 mpg for highway driving. Assume that a Normal model can be applied. A) Draw a model for the auto fuel economy. Clearly label it, showing what the 68-95-99.7 Rule predicts about miles per gallon. B) In what interval would you expect the central 68% of the autos to be found? C) About what percent of autos should get more than 31 mpg? D) About what percent of cars should get between 31 and 37.2 mpg? E) Describe the gas mileage of the worst 2.5% of the cars.
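Parts B through E can be worked with the 68-95-99.7 Rule alone, because 31 and 37.2 mpg fall exactly one and two standard deviations above the mean. A sketch of that reasoning, using the rule's rounded percentages rather than exact Normal areas:

```python
mean, sd = 24.8, 6.2

# B) Central 68%: within one SD of the mean.
central_68 = (mean - sd, mean + sd)          # about 18.6 to 31.0 mpg

# C) More than 31 mpg (one SD above): half of what lies outside the central 68%.
pct_above_31 = (100 - 68) / 2

# D) Between 31 and 37.2 mpg (one to two SDs above the mean):
#    half of the difference between the central 95% and the central 68%.
pct_31_to_37_2 = (95 - 68) / 2

# E) Worst 2.5%: more than two SDs below the mean.
worst_cutoff = mean - 2 * sd                 # about 12.4 mpg

print(central_68, pct_above_31, pct_31_to_37_2, worst_cutoff)
```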

When a data value doesn't fall exactly 1, 2, or 3 standard deviations from the mean, we can look it up in a table of Normal percentiles. Table Z in Appendix D provides us with Normal percentiles, but many calculators and statistics computer packages provide these as well.

Table Z is the standard Normal table. We have to convert our data to z-scores before using the table. The figure shows us how to find the area to the left when we have a z-score of 1.80:
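The same left-tail area can also be computed in software. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` (Python 3.8 or later) as a stand-in for Table Z:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area to the left of z = 1.80 under the standard Normal curve.
area_left = standard_normal.cdf(1.80)
print(round(area_left, 4))
```

The result, about 0.9641, says that roughly 96.4% of a Normal model lies below a z-score of 1.80.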

The cholesterol levels of adults can be described by a Normal model with a mean of 200 mg/dl and a standard deviation of 15 mg/dl. What percent of adults do you expect to have cholesterol levels over 220 mg/dl? What percent of adults do you expect to have cholesterol levels between 175 and 185 mg/dl? Estimate the IQR of cholesterol levels.
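None of these cutoffs falls a whole number of standard deviations from the mean, so the 68-95-99.7 Rule is not enough. A sketch of the calculation with `statistics.NormalDist`, which handles the conversion to z-scores internally (a table lookup would give slightly rounded versions of the same answers):

```python
from statistics import NormalDist

cholesterol = NormalDist(mu=200, sigma=15)

# Proportion above 220 mg/dl: one minus the area to the left.
over_220 = 1 - cholesterol.cdf(220)

# Proportion between 175 and 185 mg/dl: difference of two left-tail areas.
between_175_185 = cholesterol.cdf(185) - cholesterol.cdf(175)

# IQR: distance between the 75th and 25th percentiles.
iqr = cholesterol.inv_cdf(0.75) - cholesterol.inv_cdf(0.25)

print(round(100 * over_220, 1), round(100 * between_175_185, 1), round(iqr, 1))
```

This gives roughly 9.1% above 220 mg/dl, about 11.1% between 175 and 185 mg/dl, and an IQR of about 20 mg/dl.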

Sometimes we start with areas and need to find the corresponding z-score or even the original data value. Example: What z-score represents the first quartile in a Normal model?

In a Normal model, what value(s) of z cut(s) off the region described? a) The highest 25% b) The highest 65% c) The lowest 65% d) The middle 90%

Look in Table Z for an area of 0.2500. The exact area is not there, but 0.2514 is pretty close. This figure is associated with z = -0.67, so the first quartile is 0.67 standard deviations below the mean.
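Software can run this lookup in reverse without the rounding that a printed table forces. A minimal sketch, again assuming `statistics.NormalDist` in place of Table Z:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# z-score with 25% of the area to its left (the first quartile).
z_q1 = standard_normal.inv_cdf(0.25)

# The table entry used above: area to the left of z = -0.67.
table_area = standard_normal.cdf(-0.67)

print(round(z_q1, 4), round(table_area, 4))
```

The exact cutoff is about -0.6745, which agrees with the table's z = -0.67 to two decimal places.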

In the Normal model for IQ scores N(100,16), what IQ score bounds a) The highest 5% of all IQs? b) The lowest 30% of the IQs? c) The middle 80% of the IQs?
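Each part asks for a data value given an area, so the steps are: find the z-score for the area, then convert back with y = μ + zσ. A sketch of those steps using `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

mu, sigma = 100, 16
standard_normal = NormalDist()


def iq_at_percentile(p):
    """IQ score with proportion p of the model below it."""
    z = standard_normal.inv_cdf(p)
    return mu + z * sigma


top_5_cutoff = iq_at_percentile(0.95)       # a) the highest 5% lie above this
bottom_30_cutoff = iq_at_percentile(0.30)   # b) the lowest 30% lie below this
middle_80 = (iq_at_percentile(0.10), iq_at_percentile(0.90))  # c)

print(round(top_5_cutoff, 1), round(bottom_30_cutoff, 1),
      tuple(round(v, 1) for v in middle_80))
```

The cutoffs come out near 126.3, 91.6, and the interval from about 79.5 to 120.5.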

What percent of a standard Normal model is found in each region? A) z > 2.1 B) z < 1.2 C) -0.35 < z < 2.05 D) z > 1.5

Only 10% of babies have learned to walk by the age of 9 months and 85% of babies are walking by 14 months of age. If the age at which babies develop the ability to walk can be described by a Normal model, find the parameters (mean and standard deviation).
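Two percentiles give two equations of the form y = μ + zσ, which is enough to solve for both parameters. A sketch of one way to set this up, using `statistics.NormalDist` for the z-scores:

```python
from statistics import NormalDist

standard_normal = NormalDist()

# z-scores that cut off the lowest 10% and the lowest 85%.
z_10 = standard_normal.inv_cdf(0.10)
z_85 = standard_normal.inv_cdf(0.85)

# 9  = mu + z_10 * sigma
# 14 = mu + z_85 * sigma
# Subtracting the first equation from the second eliminates mu.
sigma = (14 - 9) / (z_85 - z_10)
mu = 9 - z_10 * sigma

print(round(mu, 2), round(sigma, 2))
```

Under these assumptions the model has a mean of about 11.8 months and a standard deviation of about 2.2 months.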

When you actually have your own data, you must check to see whether a Normal model is reasonable. Looking at a histogram of the data is a good way to check that the underlying distribution is roughly unimodal and symmetric.

A more specialized graphical display that can help you decide whether a Normal model is appropriate is the Normal probability plot. If the distribution of the data is roughly Normal, the Normal probability plot approximates a diagonal straight line. Deviations from a straight line indicate that the distribution is not Normal.
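A Normal probability plot pairs each sorted data value with the z-score expected for its rank, so "straight enough" can be roughly summarized by the correlation between the two. The sketch below is one illustrative construction, not the textbook's procedure: the plotting positions (i - 0.5)/n are one common convention among several, and both data sets are hypothetical.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected z-scores for n sorted values, using positions (i - 0.5) / n."""
    std = NormalDist()
    return [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def straightness(data):
    """Correlation between the sorted data and their Normal scores."""
    xs, ys = normal_scores(len(data)), sorted(data)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


symmetric = [42, 45, 47, 48, 49, 50, 50, 51, 52, 53, 55, 58]  # hypothetical
skewed = [1, 1, 2, 2, 3, 3, 4, 5, 7, 10, 16, 30]              # hypothetical

r_symmetric = straightness(symmetric)
r_skewed = straightness(skewed)
print(round(r_symmetric, 3), round(r_skewed, 3))
```

A correlation close to 1 corresponds to a nearly straight plot; the skewed data give a noticeably lower value. Plotting the pairs is still the better check, since the picture shows how the points bend.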

Nearly Normal data have a histogram and a Normal probability plot that look somewhat like this example:

A skewed distribution might have a histogram and Normal probability plot like this:

Don't use a Normal model when the distribution is not unimodal and symmetric.

Don't use the mean and standard deviation when outliers are present; the mean and standard deviation can both be distorted by outliers. Don't round your results in the middle of a calculation. Don't worry about minor differences in results.

The story data can tell may be easier to understand after shifting or rescaling the data. Shifting data by adding or subtracting the same amount from each value affects measures of center and position but not measures of spread. Rescaling data by multiplying or dividing every value by a constant changes all the summary statistics: center, position, and spread.

We've learned the power of standardizing data. Standardizing uses the SD as a ruler to measure distance from the mean (z-scores). With z-scores, we can compare values from different distributions or values based on different units. z-scores can identify unusual or surprising values among data.

We've learned that the 68-95-99.7 Rule can be a useful rule of thumb for understanding distributions: For data that are unimodal and symmetric, about 68% fall within 1 SD of the mean, 95% fall within 2 SDs of the mean, and 99.7% fall within 3 SDs of the mean.

We see the importance of thinking about whether a method will work: Normality Assumption: We sometimes work with Normal tables (Table Z). These tables are based on the Normal model. Data can't be exactly Normal, so we check the Nearly Normal Condition by making a histogram (is it unimodal, symmetric, and free of outliers?) or a Normal probability plot (is it straight enough?).