MA 1125 Lecture 05 - Measures of Spread
Wednesday, September 6, 2017
Objectives: Introduce variance, standard deviation, range.


1. Measures of Spread

In Lecture 04, we looked at several measures of central tendency. In an attempt to describe the numbers in a set, we used the mean, the median, the mode, or the midrange as a single number that represents the entire data set. For example, if we know that the mean weight of a certain breed of dog is 17.5 pounds, then we know that we're talking about relatively small dogs. The mean doesn't tell us whether a dog of this breed weighing 5 pounds is unusual or not, however. I saw on the news that the median price of a house in the United States just went over $200,000. It's certainly clear that houses cost a lot now, but it does not tell us how many houses go for under $100,000, for example.

Given a data set, knowing the mean, the median, the mode, or the midrange gives us a good start on understanding how big the numbers in the set are. If we would like a little more information, knowing how spread out the numbers are would go a long way. Today, we'll look at some measures of spread.

2. The range

I'm mostly going to focus on the mean, but I'd like to look at one of the other measures of central tendency first as an alternative example. We looked at the midrange last time, and that is the number that is halfway between the smallest and largest values in the data set. Suppose we want to build a garage, but we have no idea whether we could afford it. To check this out, we ask for a number of quotes, and we get

\[ \$16{,}000 \quad \$22{,}000 \quad \$17{,}000 \quad \$19{,}000 \quad \$21{,}000 \quad \$20{,}000 \tag{1} \]

It's easy to compute the midrange: we just find the average of the smallest and largest numbers.

\[ \text{midrange} = \frac{\$16{,}000 + \$22{,}000}{2} = \$19{,}000. \tag{2} \]

The range is the difference between the smallest and largest numbers.

\[ \text{range} = \$22{,}000 - \$16{,}000 = \$6{,}000. \tag{3} \]
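The midrange and range calculations in (2) and (3) can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for this example, and the six quotes are the garage quotes from (1), with the amounts taken to be $16,000, $22,000, $17,000, $19,000, $21,000, and $20,000.

```python
# Garage quotes from (1), in dollars.
quotes = [16_000, 22_000, 17_000, 19_000, 21_000, 20_000]

smallest = min(quotes)
largest = max(quotes)

# Midrange: the number halfway between the smallest and largest values.
midrange = (smallest + largest) / 2

# Range: the difference between the largest and smallest values.
data_range = largest - smallest

print(f"midrange = ${midrange:,.0f}")
print(f"range    = ${data_range:,.0f}")
```

Both numbers depend only on the two extreme quotes, which is why a single unusual quote can change them a lot.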

If someone were to ask us how much garages cost, we could say, "Well, according to my research, the midrange is about $19,000 and the range is about $6,000. If you would like this kind of garage, you could plan on spending fairly close to $19,000."

3. Deviations from the mean

The midrange and range are easy to compute, but they can easily be misleading. Of the measures of central tendency, the mean lends itself well to mathematical analysis, and there is a lot we can do with it. Most of the rest of the class will be devoted to extending the information given by the mean. Right now, we're interested in measures of spread. Since the mean is, in some sense, in the middle, we're going to look at how far all the other numbers are from the mean, or in fancy statistical language, at the deviations from the mean. If \( x \) represents a number in a sample, its deviation from the mean is

\[ \text{deviation from the mean} = x - \bar{x}. \tag{4} \]

In a population, we have a different symbol for the mean, so the deviation from the mean in a population is

\[ \text{deviation from the mean} = x - \mu. \tag{5} \]

If we know the mean for a sample and all the deviations from the mean, we can actually figure out all the numbers in the data set. This can be a useful way of looking at the data set, but it's not really much simpler. We'd like a single number that describes how big the deviations are. One idea is to take the average of all the deviations from the mean. That is, take the mean of the deviations from the mean.

4. The variance

Whenever you take the average of the deviations from the mean, you will always get zero. As a result, the mean deviation from the mean is not a useful measure of spread. We could, if we wanted to, just make all the deviations from the mean positive by using absolute values. We could then take the average of these numbers. That's a great idea, but I've never seen it used. I'm not exactly sure why, but I know that absolute values can be awkward mathematically, and I think the way we're going to solve this problem contains more information.
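The claim at the start of this section, that the deviations from the mean always average to zero, follows directly from the definition of the mean. Here is a short derivation, writing \( n \) for the number of values in the sample and using \( \bar{x} = \frac{\sum x}{n} \):

\[ \sum (x - \bar{x}) = \sum x - n\bar{x} = \sum x - n \cdot \frac{\sum x}{n} = 0, \qquad \text{so} \qquad \frac{\sum (x - \bar{x})}{n} = 0. \]

As a quick check on a made-up sample \( \{1, 2, 6\} \) (not one of the lecture's data sets), the mean is 3, the deviations are \( -2, -1, 3 \), and they add up to zero. The positive and negative deviations always cancel, no matter how spread out the numbers are.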

We need to get rid of the negative signs in the deviations from the mean somehow, and we're going to do that by squaring them. That may sound odd, but it ends up working quite well. To combine the deviations from the mean into a single number, we're going to do the following. We'll do it for a sample first, then for an entire population.

Compute the deviations from the mean

\[ x - \bar{x}. \tag{6} \]

Square the deviations to make them positive (or zero)

\[ (x - \bar{x})^2. \tag{7} \]

Then we're going to find the mean for the deviations squared, which means add them up and divide by how many there are

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}. \tag{8} \]

Two things should look odd. First, the \( s^2 \). This is the symbol for the sample variance. Second, we're dividing by \( n - 1 \) instead of \( n \). Here's my explanation: the mean, on average, is one of the numbers in the data set, so one of the deviations from the mean is zero, on average, so we're really computing the average squared deviation for the other numbers. That may or may not make sense, but equation (8) is the standard formula for a sample variance.

Let's compute the variance for the sample \( \{8, 8, 12, 11, 11\} \). I strongly suggest working in a table, as I am going to demonstrate. We'll do this a lot. When we see a \( \sum \) in a formula, that will mean that we're going to add up a column in the table. OK. Our table starts off as follows.

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
8 & & \\
8 & & \\
12 & & \\
11 & & \\
11 & &
\end{array}
\tag{9}
\]

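First we compute the mean. This entails computing \( \sum x \), so we'll sum over the first column. After that, we divide by \( n \) to get \( \bar{x} \).

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
8 & & \\
8 & & \\
12 & & \\
11 & & \\
11 & & \\ \hline
50 & &
\end{array}
\qquad \bar{x} = \frac{50}{5} = 10
\tag{10}
\]

Once we know the mean, we subtract it from all the \( x \)'s as formula (8) tells us.

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
8 & -2 & \\
8 & -2 & \\
12 & 2 & \\
11 & 1 & \\
11 & 1 & \\ \hline
50 & &
\end{array}
\qquad \bar{x} = \frac{50}{5} = 10
\tag{11}
\]

Next, we square all the deviations from the mean to make them positive.

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
8 & -2 & 4 \\
8 & -2 & 4 \\
12 & 2 & 4 \\
11 & 1 & 1 \\
11 & 1 & 1 \\ \hline
50 & &
\end{array}
\qquad \bar{x} = \frac{50}{5} = 10
\tag{12}
\]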

Finally, we sum over the squared deviations from the mean and divide by \( n - 1 \), which is \( 5 - 1 = 4 \) in this case.

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
8 & -2 & 4 \\
8 & -2 & 4 \\
12 & 2 & 4 \\
11 & 1 & 1 \\
11 & 1 & 1 \\ \hline
50 & & 14
\end{array}
\qquad \bar{x} = \frac{50}{5} = 10, \qquad s^2 = \frac{14}{4} = 3.5
\tag{13}
\]

The variance is \( s^2 = 3.5 \).

5. The standard deviation

The variance is a measure of spread. If you get a larger number for \( s^2 \), then this says that the data set is more spread out. It's hard to tell exactly what the number means, however. We'll end up using a different number a little more often, but even this other measure of spread needs a lot of mathematical analysis to understand it well.

Since we've squared the deviations to compute the variance, the variance is not quite in line with the sizes of the individual deviations. To compensate, we'll mostly work with something called the standard deviation. The standard deviation, \( s \), is simply the square root of the variance.

\[ s = \sqrt{s^2}. \tag{14} \]

This formula looks a bit odd, but we compute the variance first, and then take the square root to get the standard deviation. For the set \( \{8, 8, 12, 11, 11\} \), we got a variance of \( s^2 = 3.5 \). The standard deviation is the square root of this, so

\[ s = \sqrt{s^2} = \sqrt{3.5} = 1.870828693\ldots \approx 1.87. \tag{15} \]

6. Quiz 05, Part I of I

Find the standard deviation for the sample \( \{3, 5, 6, 6\} \).
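The table procedure from Sections 4 and 5 can also be sketched in Python, one column at a time. This is a minimal sketch, not part of the lecture: the variable names are invented for the example, the data is the lecture's worked sample {8, 8, 12, 11, 11} (not the quiz sample), and the last two lines cross-check against Python's standard `statistics` module, whose `variance` and `stdev` functions also divide by n - 1.

```python
import math
import statistics

sample = [8, 8, 12, 11, 11]

n = len(sample)
mean = sum(sample) / n                      # x-bar: sum of first column over n
deviations = [x - mean for x in sample]     # second column: x - x-bar
squared = [d ** 2 for d in deviations]      # third column: (x - x-bar)^2
variance = sum(squared) / (n - 1)           # s^2: divide by n - 1, not n
std_dev = math.sqrt(variance)               # s: square root of the variance

print("mean     =", mean)
print("variance =", variance)
print("std dev  =", round(std_dev, 2))

# Cross-check against the standard library's sample formulas.
assert math.isclose(variance, statistics.variance(sample))
assert math.isclose(std_dev, statistics.stdev(sample))
```

Keeping the intermediate lists mirrors the columns of the table, which makes it easy to compare each step with a hand calculation.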

7. Population variance and standard deviation

In our discussion about the variance and standard deviation, we've only talked in terms of a sample. The variance and standard deviation for a population are computed in pretty much the same way. These are parameters, of course, and we will use the lowercase Greek letter sigma, \( \sigma \), instead of the \( s \). The population variance is

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N}. \tag{16} \]

The population mean \( \mu \) is used here, but the variance \( \sigma^2 \) is still the average of the deviations from the mean squared. We don't have \( N - 1 \) in the denominator, just \( N \). In practice, a population size \( N \) is going to be a really big number, so subtracting 1 doesn't matter much. The population standard deviation, \( \sigma \), is again just the square root of the variance.

\[ \sigma = \sqrt{\sigma^2}. \tag{17} \]

8. Homework 05

For problems 1-3, suppose the numbers came out a little differently in our garage survey, and we got

\[ \$12{,}000 \quad \$25{,}000 \quad \$24{,}000 \quad \$25{,}000 \quad \$24{,}000 \quad \$25{,}000 \tag{18} \]

1. What is the midrange?
2. What is the range?
3. Do the midrange and range describe the data set in problem 1 very well? (That is, are most of the numbers about the same as the midrange, and are most of the numbers as spread out as the range indicates?)

For problems 4 and 5, the questions are general ones about the deviation from the mean.

4. If the deviation from the mean for a number \( x \) is positive, is \( x \) larger than the mean or smaller?
5. If the deviation from the mean for \( x \) is negative, is \( x \) larger than the mean or smaller?

For problems 6-14, work with the sample \( \{2, 8, 6, 5, 7, 4, 3\} \). You should put your numbers into a table that looks like

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
2 & & \\
8 & & \\
6 & & \\
5 & & \\
7 & & \\
4 & & \\
3 & &
\end{array}
\tag{19}
\]

6. Count the number of elements in this set. So \( n = \,? \)
7. Find \( \bar{x} \).
8. What is the deviation from the mean for \( x = 2 \)?
9. What is the deviation from the mean for \( x = 8 \)?
10. What is the deviation from the mean squared (i.e., what is \( (x - \bar{x})^2 \)) for \( x = 2 \)?
11. What is the deviation from the mean squared for \( x = 8 \)?
12. What did you get for \( \sum (x - \bar{x})^2 \)?
13. What is \( s^2 \)? Round your answer correctly to two decimal places.
14. What is the standard deviation? Round correctly to two decimal places.

For problems 15-17, work with the sample \( \{2, 5, 4, 5\} \).

15. Find \( \bar{x} \).
16. Find \( s^2 \).
17. Find \( s \). Round your answer correctly to two decimal places.

18. In general, what is \( \sigma^2 \) a symbol for? The population ______.
19. Is \( \sigma \) a statistic or a parameter?
20. Suppose a population variance is 3.05. What is \( \sigma \)? Round your answer correctly to two decimal places.

Answers follow.

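Quiz Answers

\[
\begin{array}{c|c|c}
x & x - \bar{x} & (x - \bar{x})^2 \\ \hline
3 & -2 & 4 \\
5 & 0 & 0 \\
6 & 1 & 1 \\
6 & 1 & 1 \\ \hline
20 & & 6
\end{array}
\qquad \bar{x} = \frac{20}{4} = 5, \qquad s^2 = \frac{6}{3} = 2, \qquad s = \sqrt{2} \approx 1.41
\tag{20}
\]

That is, \( \bar{x} = 5 \), \( s^2 = 2 \), and \( s = 1.41 \).

HW Answers

1) \( (\$12{,}000 + \$25{,}000)/2 = \$18{,}500 \).
2) \( \$25{,}000 - \$12{,}000 = \$13{,}000 \).
3) The $12,000 distorts the range and midrange. The prices are mostly $24,000 or $25,000, and they don't vary very much.
4) Positive deviations from the mean go with \( x \)'s that are larger than \( \bar{x} \).
5) Negative deviations go with \( x \)'s that are smaller than \( \bar{x} \).
6) \( n = 7 \).
7) \( \bar{x} = 5 \).
8) \( -3 \) (corrected /3/14 at 1:30).
9) \( 3 \).
10) \( 9 \).
11) \( 9 \).
12) \( \sum (x - \bar{x})^2 = 28 \).
13) \( s^2 = 4.67 \). Note: You divide by 6!
14) \( s = 2.16 \).
15) \( \bar{x} = 4 \).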

16) \( s^2 = 2 \). (Divide by 3.)
17) \( s = 1.41 \).
18) \( \sigma^2 \) is the population variance.
19) \( \sigma \), the population standard deviation, is a parameter.
20) \( \sigma = \sqrt{3.05} \approx 1.75 \).
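As a supplement to Section 7, the difference between dividing by n - 1 and dividing by N can be seen with Python's standard `statistics` module, where `variance` uses the sample formula (8) and `pvariance` uses the population formula (16). This is only a sketch: treating the five numbers {8, 8, 12, 11, 11} as an entire population is an assumption made purely for illustration, and the variable names are invented.

```python
import statistics

# The lecture's five numbers, treated first as a sample and then,
# purely for illustration, as if they were a whole population.
data = [8, 8, 12, 11, 11]

s2 = statistics.variance(data)       # sample variance: divides by n - 1
sigma2 = statistics.pvariance(data)  # population variance: divides by N

print("sample variance     =", s2)
print("population variance =", sigma2)

# The two formulas differ only by the factor (N - 1) / N,
# which gets closer and closer to 1 as N grows.
for size in (5, 100, 10_000):
    print(size, (size - 1) / size)
```

For five numbers the two answers are noticeably different, but the loop shows why subtracting 1 hardly matters once N is large.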