Mini-Lecture 3.1 Measures of Central Tendency


Objectives
1. Determine the arithmetic mean of a variable from raw data
2. Determine the median of a variable from raw data
3. Explain what it means for a statistic to be resistant
4. Determine the mode of a variable from raw data

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Compute the mean, median, and mode of this sample of interest rates. (12.62%, 13.6%, 14.4%)
   b. Create a histogram of these data using a class width of 1%. (See Section 2.2.)
   c. What proportion of the interest rates is less than the mean? (0.3)
   d. What proportion of the interest rates is less than the median? (0.5)
   e. Which statistic is least influenced by extreme observations, the mean or the median? (Median)
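The answers to parts a, c, and d of Example 1 can be checked with a few lines of Python. This is only a sketch using the standard-library `statistics` module; the variable names are illustrative and are not part of the handout.

```python
from statistics import mean, median, multimode

# Interest rates (in percent) for the 10 issuers in Example 1.
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]

rate_mean = mean(rates)
rate_median = median(rates)
rate_modes = multimode(rates)  # list of the most frequent value(s)

# Proportion of observations strictly below each measure of center.
below_mean = sum(r < rate_mean for r in rates) / len(rates)
below_median = sum(r < rate_median for r in rates) / len(rates)

print(f"mean = {rate_mean:.2f}%, median = {rate_median:.1f}%, mode(s) = {rate_modes}")
print(f"below mean: {below_mean}, below median: {below_median}")
```

Only three of the ten rates fall below the mean because the single low rate of 6.5% pulls the mean down, while the median still splits the sample in half. That is the sense in which the median is resistant.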

2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted (construction) speed limit was 25 mph. The recorded speeds for 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Compute the mean, median, and mode of the observed speeds. (32.1; 32.5; 40)
   b. Make a histogram of these data using a class width of 5 mph.
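As a rough check on Example 2, the sketch below recomputes the three measures of center and tallies the speeds into classes of width 5 mph. The first class is assumed to start at 20 mph (classes 20-24, 25-29, and so on); the handout does not state the class limits, so treat that choice as an assumption.

```python
from collections import Counter
from statistics import mean, median, multimode

speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]

speed_mean = mean(speeds)
speed_median = median(speeds)
speed_modes = multimode(speeds)
print(f"mean = {speed_mean:.1f}, median = {speed_median}, mode(s) = {speed_modes}")

# Tally each speed into its class; START and WIDTH are assumed class settings.
START, WIDTH = 20, 5
counts = Counter(START + WIDTH * ((s - START) // WIDTH) for s in speeds)

# Text histogram: one asterisk per vehicle in the class.
for lower in sorted(counts):
    print(f"{lower}-{lower + WIDTH - 1} mph: {'*' * counts[lower]}")
```

With these class limits the counts rise to a peak in the 30-34 mph class and fall off evenly on both sides, which is the roughly symmetric shape the later mini-lectures rely on.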

Mini-Lecture 3.2 Measures of Dispersion

Objectives
1. Compute the range of a variable from raw data
2. Compute the variance of a variable from raw data
3. Compute the standard deviation of a variable from raw data
4. Use the Empirical Rule to describe data that are bell shaped
5. Use Chebyshev's inequality to describe any set of data

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Compute the range, sample variance, and sample standard deviation for the interest rates. Express your answers rounded to the nearest tenth of a percent. (Range = 8.0%, s^2 = 6.7, s = 2.6%)
   b. On the basis of the histogram drawn in Section 3.1, comment on the appropriateness of using the Empirical Rule to make any general statements about interest rates. (The data do not appear to follow a symmetric or bell-shaped distribution. They appear to be skewed left, so it is not appropriate to use the Empirical Rule to make any general statements about interest rates in the U.S.)
   c. Does the interest rate for Pulaski Bank and Trust Company appear to be unusual? (Yes)
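The range, sample variance, and sample standard deviation in part a can be reproduced with the sketch below. For part c, the code measures how many sample standard deviations the Pulaski rate lies from the sample mean; flagging values more than two standard deviations away is one common rule of thumb and is an assumption here, since the handout does not say which criterion it uses.

```python
from statistics import mean, stdev, variance

# Interest rates (in percent) for the 10 issuers in Example 1.
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]

data_range = max(rates) - min(rates)
s2 = variance(rates)  # sample variance: divides by n - 1
s = stdev(rates)      # sample standard deviation: square root of s2

# Distance of the lowest rate (Pulaski, 6.5%) from the mean, in units of s.
z_lowest = (min(rates) - mean(rates)) / s

print(f"range = {data_range:.1f}%, s^2 = {s2:.1f}, s = {s:.1f}%")
print(f"Pulaski lies {abs(z_lowest):.2f} standard deviations below the mean")
```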

2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted (construction) speed limit was 25 mph. The recorded speeds for 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Determine the sample standard deviation of the speeds. Express your answer rounded to the nearest mile per hour. (s = 6 mph)
   b. On the basis of the histogram drawn in Section 3.1, comment on the appropriateness of using the Empirical Rule to make any general statements about drivers' speeds. (The data appear to roughly follow a symmetric, bell-shaped distribution. It is appropriate to apply the Empirical Rule.)
   c. Use the Empirical Rule to estimate the percentage of speeds that are between 26 and 38 mph. (Hint: the sample mean is approximately 32 mph.) (68%)
   d. Determine the actual percentage of drivers whose speeds were between 26 and 38 mph. (64%)
   e. Use the Empirical Rule to determine the percentage of drivers whose speeds are 26 mph or higher, so they exceed the posted speed limit. (84%)
   f. Determine the actual percentage of drivers whose speeds are 26 mph or higher, so they exceed the posted speed limit. (86%)
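Parts c through f compare Empirical Rule estimates with actual sample percentages. The sketch below makes that comparison explicit. It uses the rounded mean (32 mph) and standard deviation (6 mph) from the example, so the interval 26 to 38 mph is one standard deviation on either side of the mean. Endpoints are counted as inside the interval, which is an assumption chosen to match the answer key.

```python
from statistics import mean, stdev

speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]

xbar = mean(speeds)  # about 32.1 mph
s = stdev(speeds)    # sample standard deviation, about 6.2 mph

# One standard deviation either side of the mean, using rounded values.
low, high = round(xbar) - round(s), round(xbar) + round(s)  # 26 and 38 mph

# Actual sample proportions (interval endpoints included).
actual_within = sum(low <= v <= high for v in speeds) / len(speeds)
actual_at_least = sum(v >= low for v in speeds) / len(speeds)

# Empirical Rule: about 68% within one SD; half of the rest lies above.
empirical_within = 0.68
empirical_at_least = empirical_within + (1 - empirical_within) / 2

print(f"between {low} and {high} mph: rule {empirical_within:.0%}, actual {actual_within:.0%}")
print(f"{low} mph or higher: rule {empirical_at_least:.0%}, actual {actual_at_least:.0%}")
```

The 84% figure comes from adding the 68% within one standard deviation to the 16% that lies more than one standard deviation above the mean.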

Mini-Lecture 3.3 Measures of Central Tendency and Dispersion from Grouped Data

Objectives
1. Approximate the mean of a variable from grouped data
2. Compute the weighted mean
3. Approximate the variance and standard deviation of a variable from grouped data

Examples
1. The National Survey of Student Engagement is a survey that (among other things) asked first-year students at liberal arts colleges how much time they spend preparing for class each week. The results from the 2007 survey are summarized below. (Source: http://nsse.iub.edu/nsse_2007_annual_report/docs/withhold/NSSE_2007_Annual_Report.pdf)

   Hours     Percentage
   1-5       13%
   6-10      25%
   11-15     23%
   16-20     18%
   21-25     10%
   26-30     6%
   31-35     5%

   a. Approximate the mean and standard deviation for the number of hours students spend preparing for class each week. (14.75 hours; 8.1 hours)
   b. Draw a relative frequency histogram of the data. What is the shape of the distribution? (Right-skewed)

      [Figure: relative frequency histogram. Horizontal axis: Hours Spent Studying; vertical axis: Relative Frequency.]

   c. Can the Empirical Rule be used with these data? (No, the data are not symmetric)

2. A simple random sample of 89 two-year-old Toyota Prius cars that are listed for sale was collected from www.cars.com. The advertised prices of the cars are summarized in the table below.

   Price (in dollars)    Frequency
   15,000-17,499         1
   17,500-19,999         2
   20,000-22,499         4
   22,500-24,999         27
   25,000-27,499         38
   27,500-29,999         15
   30,000-32,499         0
   32,500-34,999         2

   a. Find the approximate mean and standard deviation for the advertised prices of the cars. ($25,575.84; $2,711.61)
   b. Draw a relative frequency histogram of the data. What is the shape of the distribution? (Left-skewed, although one could argue the data are symmetric)
   c. Can the Empirical Rule be used with these data? (No, if the data are deemed to be left-skewed; if deemed symmetric, it is appropriate to use the Empirical Rule)
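The grouped-data approximations in both examples can be sketched as follows. Two conventions are assumed here because they reproduce the answer key, not because the handout states them. First, each class midpoint is taken as the lower class limit plus half the class width (for example, 3.5 for the class 1-5). Second, the study-hours percentages are treated as relative frequencies with the population formula, while the Prius prices are a sample of 89 cars and use the n - 1 divisor. The function name `grouped_mean_sd` is illustrative.

```python
from math import sqrt

def grouped_mean_sd(lower_limits, width, weights, sample):
    """Approximate the mean and standard deviation from grouped data.

    weights may be frequencies or percentages. If sample is True the sum of
    squared deviations is divided by (total weight - 1); otherwise by the
    total weight.
    """
    midpoints = [low + width / 2 for low in lower_limits]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, midpoints)) / total
    ss = sum(w * (x - mean) ** 2 for w, x in zip(weights, midpoints))
    return mean, sqrt(ss / (total - 1 if sample else total))

# Example 1: hours spent preparing for class (weights are percentages).
hours_mean, hours_sd = grouped_mean_sd(
    [1, 6, 11, 16, 21, 26, 31], 5, [13, 25, 23, 18, 10, 6, 5], sample=False)

# Example 2: advertised Prius prices (weights are frequencies, n = 89).
price_mean, price_sd = grouped_mean_sd(
    [15000, 17500, 20000, 22500, 25000, 27500, 30000, 32500], 2500,
    [1, 2, 4, 27, 38, 15, 0, 2], sample=True)

print(f"hours: mean = {hours_mean:.2f}, sd = {hours_sd:.1f}")
print(f"price: mean = ${price_mean:,.2f}, sd = ${price_sd:,.2f}")
```

Using the other common midpoint convention (the average of a class's own lower and upper limits, such as 3 for the class 1-5) would shift the study-hours mean to 14.25, so it is worth confirming which convention a course uses.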

Mini-Lecture 3.4 Measures of Position

Objectives
1. Determine and interpret z-scores
2. Interpret percentiles
3. Determine and interpret quartiles
4. Determine and interpret the interquartile range
5. Check a set of data for outliers

Examples
1. The Graduate Record Examination (GRE) is a test required for admission to many U.S. graduate schools. The University of Pittsburgh Graduate School of Public Health requires a GRE score no less than the 70th percentile for admission into their Human Genetics MPH or MS program. (Source: http://www.publichealth.pitt.edu/interior.php?pageid=101.) Interpret this admissions requirement. (In order to be admitted to this program, an applicant must score as high as or higher than 70% of the people who take the GRE.)

2. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. The population mean credit card interest rate in January 2008 was 13.5%, and the population standard deviation was 3.2%. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Use the population mean and standard deviation to find the z-score corresponding to 6.5%, the rate offered by the Pulaski Bank and Trust Company. (z = -2.19)
   b. Use the population mean and standard deviation to find the z-score corresponding to 14.5%, the rate offered by the Bar Harbor Bank and Trust Company. (z = 0.31)
   c. Find the first quartile of the interest rates. (12.0%)
   d. Find the interquartile range of the interest rates. (2.4%)

3. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data in November 2005 on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. The recorded speeds of 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   The sample mean and standard deviation for these vehicle speeds are 32.1 and 6.2, respectively.

   a. Find the z-score for a car driving 20 mph in this construction zone. (-1.95)
   b. Find the z-score for a car driving 40 mph in this construction zone. (1.27)
   c. Determine the quartiles. (Q1 = 28 mph; Q2 = 32.5 mph; Q3 = 38 mph)
   d. Compute the interquartile range, IQR. (IQR = 10)
   e. Determine the lower and upper fences. Are there any outliers, according to this criterion? (Lower fence = 13 mph, Upper fence = 53 mph; there are no outliers according to this criterion)
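The z-scores, quartiles, and fences in Examples 2 and 3 can be sketched in Python as below. The quartile function uses the median-of-halves method (the middle value is left out of both halves when n is odd), which reproduces the quartiles given here. Library routines such as `statistics.quantiles` interpolate differently and may return other values. The helper names are illustrative.

```python
from statistics import median

def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered), median(ordered[-half:])

def fences(q1, q3):
    """Lower and upper fences: 1.5 * IQR below Q1 and above Q3."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]

# Example 2: population mean 13.5% and standard deviation 3.2%.
print(f"{z_score(6.5, 13.5, 3.2):.2f}", f"{z_score(14.5, 13.5, 3.2):.2f}")
rate_q1, _, rate_q3 = quartiles(rates)
print(f"Q1 = {rate_q1}%, IQR = {rate_q3 - rate_q1:.1f}%")

# Example 3: sample mean 32.1 mph and standard deviation 6.2 mph.
print(f"{z_score(20, 32.1, 6.2):.2f}", f"{z_score(40, 32.1, 6.2):.2f}")
q1, q2, q3 = quartiles(speeds)
lower, upper = fences(q1, q3)
outliers = [v for v in speeds if v < lower or v > upper]
print(q1, q2, q3, lower, upper, outliers)
```

A negative z-score means the observation is below the mean, so the 6.5% rate and the 20 mph speed both come out negative.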

Mini-Lecture 3.5 The Five-Number Summary and Boxplots

Objectives
1. Compute the five-number summary
2. Draw and interpret boxplots

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. Based on this survey, the mean credit card interest rate in January 2008 was 13.5%, and the standard deviation of the national interest rates was 3.3%. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Find the five-number summary for the observed credit card rates. (Min = 6.5%; Q1 = 12.0%; Median = 13.6%; Q3 = 14.4%; Max = 14.5%)
   b. Draw a boxplot for the observed credit card rates.
   c. Determine the shape of the distribution from the boxplot. Refer to the histogram drawn in Section 2.2 to test your answer. (Left-skewed)

2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data in November 2005 on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. The recorded speeds of 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Find the five-number summary for the observed speeds. (20, 28, 32.5, 38, 40)
   b. Construct a boxplot for the observed speeds.
   c. Determine the shape of the distribution from the boxplot. Refer to the histogram drawn in Section 2.2 to test your answer. (Symmetric or left-skewed)
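A five-number summary can be computed with the same median-of-halves quartile method used in Mini-Lecture 3.4, and the whisker lengths then give a rough reading of shape. The sketch below is illustrative only: the function name is not from the handout, and comparing whisker lengths is a simple heuristic, not a formal test of skewness.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return (ordered[0], median(ordered[:half]), median(ordered),
            median(ordered[-half:]), ordered[-1])

rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]

rate_summary = five_number_summary(rates)
speed_summary = five_number_summary(speeds)

for name, (lo, q1, med, q3, hi) in [("rates", rate_summary),
                                    ("speeds", speed_summary)]:
    left_whisker = q1 - lo    # distance from the minimum to the box
    right_whisker = hi - q3   # distance from the box to the maximum
    print(f"{name}: min={lo}, Q1={q1}, median={med:.1f}, Q3={q3}, max={hi}; "
          f"whiskers {left_whisker:.1f} (left) vs {right_whisker:.1f} (right)")
```

For the interest rates the left whisker is far longer than the right, which matches a left-skewed boxplot. For the speeds, the code gives the same quartiles as Mini-Lecture 3.4 (28, 32.5, and 38 mph), with a moderately longer left whisker and a nearly even split inside the box.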