NOTES: Chapter 4 Describing Data

Intro to Statistics, COLYER, Spring 2017

Student Name: ________

Section 4.1 ~ What is Average?

Objective: In this section you will understand the difference between the three most common measures of central tendency: mean, median, and mode. You will learn how each is affected by outliers and when it is appropriate to use a weighted mean.

Essential Questions:
1. In your own words, explain what the mean, median, and mode represent.
2. How is the mean calculated?
3. How do you find the median of a data set with an odd number of values? With an even number of values?
4. How do you find the mode of a data set? Will there always be a mode?
5. Which measure of central tendency is affected by outliers?
6. What is the major advantage of using the median as opposed to the mean in some circumstances?
7. What is the rounding rule?
8. When is a weighted mean used? How is it calculated?

Mean: The mean is most commonly referred to as the ________ and is found by the following formula: ________

Ex. ~ The following values represent the weights of five wrestlers:
188 162 190 150 176
Find the mean weight of these five wrestlers.

Median: The median is the ________ number in a data set. To find the median, you must first arrange the values in ascending (or descending) order.

A data set that has an odd number of values will have exactly ________ value in the middle.
Ex. ~ Find the median of the data set: 4 3 6 3 6

A data set that has an even number of values will have ________ values in the middle, which means that the median will be in the middle of those two values (or the ________ of those two values).
Ex. ~ Find the median of the data set: 4 3 6 3 6 5
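The mean and median procedures described in this section can be sketched in a few lines of Python. This is only an illustration, not part of the handout: the function names `mean` and `median` are chosen here for the example, and the data are the wrestler weights and the two small data sets from the examples above.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data.

    With an even number of values, this is the mean of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


wrestler_weights = [188, 162, 190, 150, 176]
mean_weight = mean(wrestler_weights)       # 866 / 5 = 173.2
odd_median = median([4, 3, 6, 3, 6])       # sorted: 3 3 4 6 6, middle value is 4
even_median = median([4, 3, 6, 3, 6, 5])   # sorted: 3 3 4 5 6 6, (4 + 5) / 2 = 4.5

print(mean_weight, odd_median, even_median)
```

Sorting first is the key step for the median; the mean does not depend on the order of the values.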

Mode: The mode is the ________ value (or group of values) in a data set. A data set may have one mode, more than one mode, or no mode.

Ex. ~ Find the mode in the data set: 4 3 6 10 6 3 6
Ex. ~ Find the mode in the data set: 4 3 6 3 10 6
Ex. ~ Find the mode in the data set: 4 3 6 2 10

The mode is most commonly used for qualitative data at the ________ level, since neither the mean nor the median can be found for qualitative data.

Ex. ~ The brand of shoes each student is wearing in this class
Can't determine the mean because ________
Can't determine the median because ________
Can determine the mode because ________

Rounding Rule: ________

Example 1: Eight grocery stores sell the PR energy bar for the following prices:
$1.09 $1.29 $1.29 $1.35 $1.39 $1.49 $1.59 $1.79
Find the mean, median, and mode for these prices.
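As a rough sketch of how the mode can be found by counting, the following Python snippet tallies how often each value occurs. The function name `modes` is hypothetical, and one convention is assumed here: a data set in which no value repeats is reported as having no mode (an empty list).

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumption for this sketch: nothing repeats, so report "no mode".
        return []
    return sorted(v for v, c in counts.items() if c == top)


one_mode = modes([4, 3, 6, 10, 6, 3, 6])   # 6 occurs three times
two_modes = modes([4, 3, 6, 3, 10, 6])     # 3 and 6 each occur twice
no_mode = modes([4, 3, 6, 2, 10])          # every value occurs once
price_mode = modes([1.09, 1.29, 1.29, 1.35, 1.39, 1.49, 1.59, 1.79])

print(one_mode, two_modes, no_mode, price_mode)
```

Because the mode only needs counting, the same function would also work on qualitative data such as a list of shoe brands.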

Outlier: ________

An outlier does not affect the median or mode because ________
An outlier does affect the mean because ________

Ex. ~ The following values represent the contract offers received by five graduating college seniors from the NBA (zero means that they didn't receive an offer):
$0 $0 $0 $0 $3,500,000

Comparison of mean, median, and mode

MEAN
  Definition: (sum of all values) / (total number of values)
  Takes every value into account? Yes
  Affected by outliers? Yes
  Advantages: Commonly understood as the average; works well with many statistical methods

MEDIAN
  Definition: middle value
  Takes every value into account? No (except for the ordering of every number)
  Affected by outliers? No (since an outlier won't fall in the middle of a data set)
  Advantages: When there are outliers, the median may be more representative of an average than the mean

MODE
  Definition: most frequent value
  Takes every value into account? No
  Affected by outliers? No (since an outlier won't occur most frequently)
  Advantages: Most appropriate for qualitative data at the nominal level

Example 2: A track coach wants to determine an appropriate heart rate for her athletes during their workouts. She chooses five of her best runners and asks them to wear heart monitors during a workout. In the middle of the workout, she reads the following heart rates for the five athletes: 130, 135, 140, 145, and 325. Which is a better measure of the average in this case, the mean or the median? Why?
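To see numerically how a single outlier pulls the mean but not the median, here is a small Python sketch using the standard library `statistics` module on the two data sets above. The variable names are chosen for this illustration only.

```python
import statistics

# NBA contract offers: four zeros and one very large value (the outlier).
offers = [0, 0, 0, 0, 3_500_000]
offer_mean = statistics.mean(offers)       # 3,500,000 / 5 = 700,000
offer_median = statistics.median(offers)   # middle value of the sorted data is 0

# Heart rates: 325 is far from the other four readings.
heart_rates = [130, 135, 140, 145, 325]
rate_mean = statistics.mean(heart_rates)       # 875 / 5 = 175
rate_median = statistics.median(heart_rates)   # 140

# Dropping the outlier changes the mean a lot but barely moves the median.
without_outlier = [130, 135, 140, 145]
trimmed_mean = statistics.mean(without_outlier)       # 550 / 4 = 137.5
trimmed_median = statistics.median(without_outlier)   # (135 + 140) / 2 = 137.5

print(offer_mean, offer_median, rate_mean, rate_median)
```

In both data sets the median stays close to the typical values, while the mean is dragged toward the outlier.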

Weighted mean: ________

Formula for weighted mean: ________

Ex. ~ Suppose your course grade is based on four tests and one final exam. Each test counts as 15% of your final grade and the final exam counts as 40% of your final grade. Your test scores are 75, 80, 84, and 88 and your final exam score is a 96. What is your final grade?

Example 3: Each quarter grade is calculated with an 80% weight on tests and quizzes and a 20% weight on class work, homework, and class participation. Furthermore, your course grade is calculated with a 40% weight on quarter 1, a 40% weight on quarter 2, and a 20% weight on the final exam. Suppose that Joe's quiz/test average in quarter 1 was an 86% and his homework/class work/class participation grade was a 95%. In quarter 2 his quiz/test average was a 92% and his homework/class work/class participation grade was a 90%, and he scored an 80% on his final exam. What was Joe's final grade?

Example 4: Each quarter grade is calculated with an 80% weight on tests and quizzes and a 20% weight on class work, homework, and class participation. Furthermore, your course grade is calculated with a 40% weight on quarter 3, a 40% weight on quarter 4, and a 20% weight on the final exam. If Amy got an 88% for quarter 3 and a 93% for quarter 4, what would she need to get on the final exam in order to receive at least a 90% for her final grade?
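The three grade problems above can be checked with a short Python sketch. It assumes the usual definition of a weighted mean, namely the sum of each value times its weight, divided by the sum of the weights. The function name `weighted_mean` is hypothetical, and weights are written as whole-number percentages to keep the arithmetic easy to follow.

```python
def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Ex.: four tests at 15% each and a final exam at 40%.
course_grade = weighted_mean([75, 80, 84, 88, 96], [15, 15, 15, 15, 40])  # 87.45

# Example 3: quarter grades first (80% tests/quizzes, 20% class work),
# then the course grade (40% / 40% / 20%).
joe_q1 = weighted_mean([86, 95], [80, 20])                 # 87.8
joe_q2 = weighted_mean([92, 90], [80, 20])                 # 91.6
joe_final = weighted_mean([joe_q1, joe_q2, 80], [40, 40, 20])  # about 87.76

# Example 4: solve 0.40 * 88 + 0.40 * 93 + 0.20 * x = 90 for the exam score x.
points_so_far = 88 * 40 + 93 * 40                 # 7240 out of the 9000 needed
amy_needed = (90 * 100 - points_so_far) / 20      # 1760 / 20 = 88.0

print(course_grade, round(joe_final, 2), amy_needed)
```

Example 4 runs the same formula backwards: the known weighted points are subtracted from the target, and the remainder is divided by the final exam's weight.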

Section 4.2 ~ Shapes of Distributions

Objective: In this section, you will learn how to describe the general shape of a distribution in terms of its number of modes, skewness, and variation.

Overview

In chapter 3 you learned how to graph distributions using a variety of methods, including bar graphs, histograms, and line charts, as well as many other types of graphs and charts. Although these visuals give a very detailed picture of the data set as a whole, sometimes it's more useful to get a general idea of how the data are distributed by simply describing the shape that the distribution forms. This general shape is represented by a smooth curve that is ultimately an outline of the detailed visual display (bar graph, histogram, line chart, etc.).

Here are 3 examples of how data can be generalized by using smooth curves:

Number of Modes and the Shape of a Distribution

Recall that the mode or modes of a data set are the values that occur most frequently. One simple way to describe the shape of a distribution is: ________

When all the data values in a data set have the same frequency, e.g. {3, 6, 4, 9, 5}, there is no mode and therefore no peak in the distribution. This type of distribution is known as: ________
  o Here is an example of a uniform distribution:

When there is one mode in a data set, there is ________ peak and the distribution is therefore called a ________ or ________ distribution.
  o Here is an example of a single-peaked or unimodal distribution:

Any peak in a distribution is considered a mode, even though the peaks may not be the same height and therefore represent different frequencies.

When there are two modes in a data set, there are ________ peaks and the distribution is therefore called a ________ distribution.
  o Here is an example of a bimodal distribution:

Notice that the peaks are different heights; this means that the data values had different frequencies, but they are nonetheless both considered modes.

When there are three modes in a data set, there are ________ peaks and the distribution is therefore called a ________ distribution.
  o Here is an example of a trimodal distribution:

Example 1 ~ How many modes would you expect for each of the following distributions? Why? Make a rough sketch for each distribution, with clearly labeled axes.
a. Heights of 1,000 randomly selected adult women.
b. Hours spent watching football on TV in January for 1,000 randomly selected adult Americans.
c. Weekly sales throughout the year at a retail clothing store for children.
d. The number of people with particular last digits (0 through 9) in their Social Security numbers.

Symmetry or Skewness

In addition to describing a distribution by the number of its modes, you can describe it based on its symmetry, or lack thereof.

A distribution is symmetric if: ________
  o Here are 3 examples of symmetric distributions:

This symmetric distribution that is bell shaped with a single peak is known as a ________. It is so important that chapter 5 will be devoted to it.

If a distribution is not symmetric, then it is ________.

A distribution is considered to be left-skewed (or negatively skewed) when: ________
  o Here is an example of a distribution that is left-skewed:

Since the values are more spread out on the left side, it's left-skewed. The mean and median of a distribution that is left-skewed will be less than the mode.

A distribution is considered to be right-skewed (or positively skewed) when: ________
  o Here is an example of a distribution that is right-skewed:

Since the values are more spread out on the right side, it's right-skewed. The mean and median of a distribution that is right-skewed will be greater than the mode (as illustrated in the diagram on the previous page).

In a distribution that is symmetric, the mean, median, and mode will be equal:

Example 2 ~ For each of the following situations, state whether you expect the distribution to be symmetric, left-skewed, or right-skewed. Explain.
a. Heights of a sample of 100 women.
b. Family income in the United States.
c. Speeds of cars on a road where a visible patrol car is using radar to detect speeders.
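The relationship between skewness and the three measures of center can be illustrated with a small Python sketch. The three data sets below are made up for this illustration (they are not from the handout) and were picked so that each has a single mode; real data will not always follow the pattern this cleanly.

```python
import statistics


def centers(values):
    """Return (mean, median, mode) for a data set with a single mode."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))


# Hypothetical right-skewed data: one large value stretches the right tail.
right_skewed = [20, 25, 25, 30, 35, 40, 200]
r_mean, r_median, r_mode = centers(right_skewed)   # about 53.6, 30, 25

# Hypothetical left-skewed data: one small value stretches the left tail.
left_skewed = [0, 160, 165, 170, 175, 175, 180]
l_mean, l_median, l_mode = centers(left_skewed)    # about 146.4, 170, 175

# Hypothetical symmetric data: the three measures coincide.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
s_mean, s_median, s_mode = centers(symmetric)      # 3, 3, 3

print(r_mean > r_mode and r_median > r_mode)   # mean and median above the mode
print(l_mean < l_mode and l_median < l_mode)   # mean and median below the mode
print(s_mean == s_median == s_mode)
```

In each skewed case the mean is pulled furthest toward the long tail, with the median between the mean and the mode.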

Variation

The last way to describe a distribution in general is by its variation. The variation of a distribution is: ________

There are 3 types of variation: ________, ________, and ________.

Here are examples of each type of variation:

Example 3 ~ How would you expect the variation to differ between times in the Olympic marathon and times in the New York City Marathon? Explain. Sketch a graph for each situation. Tell what type of variation each situation has.

Section 4.3 ~ Measures of Variation

Objective: In this section you will be able to understand and interpret the following common measures of variation: range, five-number summary, and standard deviation.

Essential Questions:
1. What does a five-number summary consist of? Briefly explain how each element is found.
2. What type of visual display can be used to represent a five-number summary? How is it constructed?
3. What is a percentile?

Recall that the variation of a data set describes how widely the data are spread out about the center of the data set (low, moderate, or high).

Range: ________

Ex. ~ The following data represent the wait times (in minutes) for 11 customers at two different banks:
Big Bank:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Best Bank: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8

The range for Big Bank is: ________
The range for Best Bank is: ________

Example 1: Consider the following data sets, which represent the quiz scores for nine students. Which set has the greater range? Would you also say that this set has the greater variation?
Quiz 1: 1 10 10 10 10 10 10 10 10
Quiz 2: 2 3 4 5 6 7 8 9 10
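A minimal Python sketch of the range, assuming the standard definition (highest value minus lowest value). The function name `data_range` is chosen for this example; the data are the bank waiting times and quiz scores above.

```python
def data_range(values):
    """Range: the highest value minus the lowest value."""
    return max(values) - min(values)


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

big_range = data_range(big_bank)     # 11.0 - 4.1, about 6.9 minutes
best_range = data_range(best_bank)   # 7.8 - 6.6, about 1.2 minutes

quiz_1 = [1, 10, 10, 10, 10, 10, 10, 10, 10]
quiz_2 = [2, 3, 4, 5, 6, 7, 8, 9, 10]

quiz_1_range = data_range(quiz_1)    # 10 - 1 = 9
quiz_2_range = data_range(quiz_2)    # 10 - 2 = 8

print(round(big_range, 1), round(best_range, 1), quiz_1_range, quiz_2_range)
```

The quiz example shows the range's weakness: Quiz 1 has the larger range only because of one low score, even though eight of its nine scores are identical.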

Quartiles: ________

Lower quartile (Q1): ________
Odd data set:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Even data set: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8

Middle quartile (Q2): ________
Odd data set:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Even data set: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8

Upper quartile (Q3): ________
Odd data set:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Even data set: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8

Five-number summary: ________

Ex. ~ Write the five-number summaries for the waiting times for Big Bank and Best Bank.

Boxplot: ________
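Here is one way the five-number summary might be computed in Python. Quartile conventions differ between textbooks and software, so this sketch makes an explicit assumption: Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half, with the middle value left out of both halves when the number of values is odd. The function names are hypothetical.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (low, Q1, median, Q3, high).

    Assumed convention: quartiles are medians of the lower and upper halves,
    excluding the middle value when the number of values is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

big_summary = five_number_summary(big_bank)     # (4.1, 5.6, 7.2, 8.5, 11.0)
best_summary = five_number_summary(best_bank)   # (6.6, 6.7, 7.2, 7.7, 7.8)

print(big_summary)
print(best_summary)
```

Both banks have the same median wait of 7.2 minutes, but the distance from Q1 to Q3 is much larger for Big Bank, which is exactly what a boxplot of these two summaries would show.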

Steps to drawing a boxplot:
1. ________
2. ________
3. ________
4. ________

Example 2: A bakery collected the following data about the number of loaves of fresh bread sold on each of 10 business days. Write a five-number summary and then make a boxplot to represent this data. State any skewness.
43 39 17 38 50 42 34 8 39 43

Percentile: ________

nth percentile: ________

Ex. ~ The 35th percentile of a data set is the value that separates the bottom 35% of data values from the top 65%.

If exam results stated that your exam score is in the 35th percentile, that means that you scored ________ than 35% of the people that took the exam and ________ than 65% of the people that took the exam.

If a data value lies between two percentiles, it is often said to lie in the lower of the two percentiles.

Ex. ~ If you score higher than 84.7% of all people taking a college entrance examination, it is said that you scored in the 84th percentile.

Percentile Formula: ________

Ex. ~ What percentile is the lower value, Q1, Q2, Q3, and the upper value for Big Bank?

Example 3: Refer to table 4.4 on p.168 to answer the following questions:
a. What is the percentile for the data value of 592.79 ng/ml for smokers?
b. What is the percentile for the data value of 61.33 ng/ml for nonsmokers?
c. What values mark the 36th percentile in the data set?
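The percentile idea can be sketched in Python as follows. The handout's own percentile formula is left blank, so this example assumes one common textbook definition: the percentile of a data value is the number of values below it, divided by the total number of values, times 100. Rounding down follows the "lower of the two percentiles" convention stated above. The function names and the five Big Bank values used (low, Q1 = 5.6, median = 7.2, Q3 = 8.5, high) are assumptions of this sketch.

```python
import math


def percent_below(values, x):
    """Percentage of data values that are strictly less than x (assumed formula)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


def percentile_rank(values, x):
    """Whole-number percentile, rounded down to the lower percentile."""
    return math.floor(percent_below(values, x))


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]

# Low value, Q1, median, Q3, high value (quartiles assumed to be 5.6, 7.2, 8.5).
ranks = [percentile_rank(big_bank, x) for x in (4.1, 5.6, 7.2, 8.5, 11.0)]
# 0/11, 2/11, 5/11, 8/11, 10/11 of the values lie below each one.

entrance_exam = math.floor(84.7)   # scoring above 84.7% puts you in the 84th percentile

print(ranks, entrance_exam)
```

With only 11 values the quartiles do not land exactly on the 25th, 50th, and 75th percentiles under this formula; in small data sets the two ideas agree only approximately.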

Standard deviation: ________

Steps to calculate the standard deviation:
1. ________
2. ________
3. ________
4. ________
5. ________
6. ________

Standard Deviation Formula:

$$\text{standard deviation} = \sqrt{\frac{\text{sum of (deviations from the mean)}^2}{\text{total number of data values} - 1}}$$

OR ________

Example 4: Find the standard deviation of the following data set.
1 5 6 8 11

Range rule of thumb: ________

Ex. ~ Estimate the standard deviation of the data set used in the last example.
1 5 6 8 11

If you know the standard deviation of a data set, you can use the range rule of thumb to estimate the high and low values of a data set: ________
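The standard deviation formula above (square root of the sum of squared deviations from the mean, divided by the number of values minus 1) can be followed step by step in Python. This is a sketch for the Example 4 data, not the handout's own list of steps; the function name `sample_std_dev` is hypothetical. The last line also applies the range rule of thumb in its usual form, standard deviation is roughly range / 4, which is stated here as an assumption because the handout leaves that definition blank.

```python
import math


def sample_std_dev(values):
    """Standard deviation with (n - 1) in the denominator, as in the formula above."""
    n = len(values)
    mean = sum(values) / n                           # find the mean
    deviations = [x - mean for x in values]          # deviation of each value from the mean
    squared = [d ** 2 for d in deviations]           # square each deviation
    variance = sum(squared) / (n - 1)                # add them up, divide by n - 1
    return math.sqrt(variance)                       # take the square root


data = [1, 5, 6, 8, 11]

# mean = 31 / 5 = 6.2
# squared deviations: 27.04, 1.44, 0.04, 3.24, 23.04 (sum 54.8)
# 54.8 / 4 = 13.7, and the square root of 13.7 is about 3.70
std_dev = sample_std_dev(data)

# Range rule of thumb (assumed form): standard deviation is about range / 4.
estimate = (max(data) - min(data)) / 4               # (11 - 1) / 4 = 2.5

print(round(std_dev, 2), estimate)
```

For such a small data set the rule-of-thumb estimate (2.5) is only a rough guide to the computed value (about 3.70).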

Example 5: Use the range rule of thumb to estimate the standard deviations for the waiting times at Big Bank and Best Bank. Compare the estimates to the actual values in example 4.

Example 6: Studies of the gas mileage of a BMW under varying driving conditions show that it gets a mean of 22 miles per gallon with a standard deviation of 3 miles per gallon. Estimate the minimum and maximum typical gas mileage amounts that you can expect under ordinary driving conditions.

Notes on Notation:
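Examples 5 and 6 can be worked through with a short Python sketch. It assumes the range rule of thumb in its usual form: the standard deviation is roughly range / 4, and typical low and high values are roughly the mean minus and plus 2 standard deviations. The function names are hypothetical, and `statistics.stdev` from the standard library is used only to compare the estimates against computed sample standard deviations.

```python
import statistics


def estimate_std_dev(values):
    """Range rule of thumb (assumed form): standard deviation is about range / 4."""
    return (max(values) - min(values)) / 4


def estimate_low_high(mean, std_dev):
    """Range rule of thumb in reverse: typical values lie within 2 standard deviations."""
    return (mean - 2 * std_dev, mean + 2 * std_dev)


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

big_estimate = estimate_std_dev(big_bank)     # 6.9 / 4, about 1.7 minutes
best_estimate = estimate_std_dev(best_bank)   # 1.2 / 4, about 0.3 minutes

big_actual = statistics.stdev(big_bank)       # about 1.96 minutes
best_actual = statistics.stdev(best_bank)     # about 0.44 minutes

# Example 6: mean 22 mpg, standard deviation 3 mpg.
bmw_low, bmw_high = estimate_low_high(22, 3)  # 22 - 6 = 16 and 22 + 6 = 28

print(round(big_estimate, 2), round(big_actual, 2))
print(round(best_estimate, 2), round(best_actual, 2))
print(bmw_low, bmw_high)
```

The estimates are in the right neighborhood but slightly low for both banks, which is typical of a rule of thumb applied to a small data set.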