Mini-Lecture 3.1 Measures of Central Tendency

Objectives
1. Determine the arithmetic mean of a variable from raw data
2. Determine the median of a variable from raw data
3. Explain what it means for a statistic to be resistant
4. Determine the mode of a variable from raw data

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Compute the mean, median, and mode of this sample of interest rates. (12.62%, 13.6%, 14.4%)
   b. Create a histogram of this data using a class width of 1%. (See Section 2.2.)
   c. What proportion of the interest rates is less than the mean? (0.3)
   d. What proportion of the interest rates is less than the median? (0.5)
   e. Which statistic is least influenced by extreme observations, the mean or the median? (Median)
2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted (construction) speed limit was 25 mph. The recorded speeds for 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Compute the mean, median, and mode of the observed speeds. (32.1; 32.5; 40)
   b. Make a histogram of this data using a class width of 5 mph.
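The answers to Examples 1 and 2 can be checked with a few lines of code. What follows is a minimal sketch in Python, assuming only the standard library statistics module; the helper names summarize and proportion_below are illustrative and are not part of the lecture notes.

```python
from statistics import mean, median, multimode

# Interest rates (%) from Example 1 and speeds (mph) from Example 2.
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]


def summarize(data):
    """Return (mean, median, list of modes) for raw data."""
    return mean(data), median(data), multimode(data)


def proportion_below(data, cutoff):
    """Proportion of observations strictly less than cutoff."""
    return sum(1 for x in data if x < cutoff) / len(data)


rate_mean, rate_median, rate_modes = summarize(rates)
speed_mean, speed_median, speed_modes = summarize(speeds)

print("rates: ", round(rate_mean, 2), round(rate_median, 1), rate_modes)
print("speeds:", round(speed_mean, 1), speed_median, speed_modes)
print("below mean:  ", proportion_below(rates, rate_mean))
print("below median:", proportion_below(rates, rate_median))
```

Because the single low rate of 6.5% pulls the mean down but leaves the median almost untouched, only 3 of the 10 rates fall below the mean while half fall below the median, which is the point of parts (c) through (e).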
Mini-Lecture 3.2 Measures of Dispersion

Objectives
1. Compute the range of a variable from raw data
2. Compute the variance of a variable from raw data
3. Compute the standard deviation of a variable from raw data
4. Use the Empirical Rule to describe data that are bell shaped
5. Use Chebyshev's inequality to describe any set of data

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Compute the range, sample variance, and sample standard deviation for the interest rates. Express your answers rounded to the nearest tenth of a percent. (Range = 8.0%, s² = 6.7, s = 2.6%)
   b. On the basis of the histogram drawn in Section 3.1, comment on the appropriateness of using the Empirical Rule to make any general statements about interest rates. (The data do not appear to follow a symmetric or bell-shaped distribution. They appear to be skewed left, so it is not appropriate to use the Empirical Rule to make any general statements about interest rates in the U.S.)
   c. Does the interest rate for Pulaski Bank and Trust Company appear to be unusual? (Yes)
2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted (construction) speed limit was 25 mph. The recorded speeds for 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Determine the sample standard deviation of the speeds. Express your answer rounded to the nearest mile per hour. (s = 6 mph)
   b. On the basis of the histogram drawn in Section 3.1, comment on the appropriateness of using the Empirical Rule to make any general statements about drivers' speeds. (The data appear to roughly follow a symmetric, bell-shaped distribution. It is appropriate to apply the Empirical Rule.)
   c. Use the Empirical Rule to estimate the percentage of speeds that are between 26 and 38 mph. (Hint: the sample mean is approximately 32 mph.) (68%)
   d. Determine the actual percentage of drivers whose speeds were between 26 and 38 mph. (64%)
   e. Use the Empirical Rule to determine the percentage of drivers whose speeds are 26 mph or higher, so they exceed the posted speed limit. (84%)
   f. Determine the actual percentage of drivers whose speeds are 26 mph or higher, so they exceed the posted speed limit. (86%)
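The dispersion answers above can be reproduced with the sketch below, written in Python with the standard library statistics module. Two choices here are assumptions rather than anything stated in the notes: the endpoints 26 and 38 are counted as inside the interval, and "unusual" is taken to mean more than two sample standard deviations from the sample mean. The helper names are illustrative.

```python
from statistics import mean, stdev, variance

rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]


def data_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)


def percent_where(data, condition):
    """Percentage of observations for which condition(x) is true."""
    return 100 * sum(1 for x in data if condition(x)) / len(data)


rate_range = data_range(rates)
rate_var = variance(rates)  # sample variance, divides by n - 1
rate_sd = stdev(rates)      # sample standard deviation
speed_sd = stdev(speeds)

# Empirical Rule estimates, taking mean = 32 mph and s = 6 mph as in the hint.
rule_within_one_sd = 68                         # 26 to 38 mph
rule_at_least_26 = 50 + rule_within_one_sd / 2  # upper half plus half of 68%

# Actual percentages in the sample (assumption: endpoints count as inside).
within_one_sd = percent_where(speeds, lambda x: 26 <= x <= 38)
at_least_26 = percent_where(speeds, lambda x: x >= 26)

# Assumed criterion: unusual = more than 2 standard deviations from the mean.
pulaski_unusual = abs(6.5 - mean(rates)) > 2 * rate_sd

print(rate_range, round(rate_var, 1), round(rate_sd, 1), round(speed_sd))
print(rule_within_one_sd, round(within_one_sd), rule_at_least_26, round(at_least_26))
print(pulaski_unusual)
```

The Empirical Rule estimates and the observed percentages differ by only a few points for the speed data, which is consistent with the roughly bell-shaped histogram noted in part (b).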
Mini-Lecture 3.3 Measures of Central Tendency and Dispersion from Grouped Data

Objectives
1. Approximate the mean of a variable from grouped data
2. Compute the weighted mean
3. Approximate the variance and standard deviation of a variable from grouped data

Examples
1. The National Survey of Student Engagement is a survey that (among other things) asked first-year students at liberal arts colleges how much time they spend preparing for class each week. The results from the 2007 survey are summarized below. (Source: http://nsse.iub.edu/nsse_2007_annual_report/docs/withhold/NSSE_2007_Annual_Report.pdf)

   Hours     Percentage
   1-5       13%
   6-10      25%
   11-15     23%
   16-20     18%
   21-25     10%
   26-30     6%
   31-35     5%

   a. Approximate the mean and standard deviation for the number of hours students spend preparing for class each week. (14.75 hours; 8.1 hours)
   b. Draw a relative frequency histogram of the data. What is the shape of the distribution? (Right-skewed)

   [Figure: relative frequency histogram. Horizontal axis: Hours Spent Studying (0 to 37). Vertical axis: Relative Frequency (0.00 to 0.25).]
   c. Can the Empirical Rule be used with this data? (No, the data are not symmetric)

2. A simple random sample of 89 two-year-old Toyota Prius cars that are listed for sale was collected from www.cars.com. The advertised prices of the cars are summarized in the table below.

   Price (in dollars)    Frequency
   15,000-17,499         1
   17,500-19,999         2
   20,000-22,499         4
   22,500-24,999         27
   25,000-27,499         38
   27,500-29,999         15
   30,000-32,499         0
   32,500-34,999         2

   a. Find the approximate mean and standard deviation for the advertised prices of the cars. ($25,575.84; $2,711.61)
   b. Draw a relative frequency histogram of the data. What is the shape of the distribution? (Left skewed, although one could argue the data are symmetric)
   c. Can the Empirical Rule be used with this data? (No, if the data are deemed to be left skewed; if deemed symmetric, it is appropriate to use the Empirical Rule)
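The grouped-data approximations can be sketched as follows in Python. Two conventions are assumed because they reproduce the stated answers, not because the notes spell them out: each class midpoint is taken as the lower class limit plus half the class width (3.5, 8.5, ... for the hours data), and the hours data, given only as percentages, use the population-style divisor while the Prius sample of 89 cars uses n - 1. The function name grouped_mean_sd is illustrative.

```python
from math import sqrt


def grouped_mean_sd(lower_limits, class_width, freqs, sample=True):
    """Approximate the mean and standard deviation from grouped data.

    Assumption: midpoint = lower class limit + class_width / 2.
    freqs may be counts or percentages; with sample=True the sum of
    squares is divided by n - 1, otherwise by n (the total weight).
    """
    mids = [lo + class_width / 2 for lo in lower_limits]
    n = sum(freqs)
    m = sum(f * x for f, x in zip(freqs, mids)) / n
    ss = sum(f * (x - m) ** 2 for f, x in zip(freqs, mids))
    divisor = n - 1 if sample else n
    return m, sqrt(ss / divisor)


# Example 1: hours preparing for class, weights are percentages.
hours_mean, hours_sd = grouped_mean_sd(
    [1, 6, 11, 16, 21, 26, 31], 5, [13, 25, 23, 18, 10, 6, 5], sample=False
)

# Example 2: advertised Prius prices, weights are counts (n = 89).
prius_mean, prius_sd = grouped_mean_sd(
    [15000, 17500, 20000, 22500, 25000, 27500, 30000, 32500],
    2500,
    [1, 2, 4, 27, 38, 15, 0, 2],
    sample=True,
)

print(round(hours_mean, 2), round(hours_sd, 1))
print(round(prius_mean, 2), round(prius_sd, 2))
```

A different midpoint convention, such as averaging the lower and upper limits of each class, would shift both means slightly, so the results are approximations tied to the convention chosen.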
Mini-Lecture 3.4 Measures of Position

Objectives
1. Determine and interpret z-scores
2. Interpret percentiles
3. Determine and interpret quartiles
4. Determine and interpret the interquartile range
5. Check a set of data for outliers

Examples
1. The Graduate Record Examination (GRE) is a test required for admission to many U.S. graduate schools. The University of Pittsburgh Graduate School of Public Health requires a GRE score no less than the 70th percentile for admission into their Human Genetics MPH or MS program. (Source: http://www.publichealth.pitt.edu/interior.php?pageid=101.) Interpret this admissions requirement. (In order to be admitted to this program, an applicant must score as high as or higher than 70% of the people who take the GRE.)

2. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. The population mean credit card interest rate in January 2008 was 13.5%, and the population standard deviation was 3.2%. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Use the population mean and standard deviation to find the z-score corresponding to 6.5%, the rate offered by the Pulaski Bank and Trust Company. (z = -2.19)
   b. Use the population mean and standard deviation to find the z-score corresponding to 14.5%, the rate offered by the Bar Harbor Bank and Trust Company. (z = 0.31)
   c. Find the first quartile of the interest rates. (12.0%)
   d. Find the interquartile range of the interest rates. (2.4%)

3. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data in November 2005 on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. The recorded speeds of 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   The sample mean and standard deviation for these vehicle speeds are 32.1 mph and 6.2 mph, respectively.

   a. Find the z-score for a car driving 20 mph in this construction zone. (-1.95)
   b. Find the z-score for a car driving 40 mph in this construction zone. (1.27)
   c. Determine the quartiles. (Q1 = 28 mph; Q2 = 32.5 mph; Q3 = 38 mph)
   d. Compute the interquartile range, IQR. (IQR = 10 mph)
   e. Determine the lower and upper fences. Are there any outliers, according to this criterion? (Lower fence = 13 mph, Upper fence = 53 mph; there are no outliers according to this criterion)
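The z-scores, quartiles, and fences in Examples 2 and 3 can be sketched in Python as shown below. The quartile convention is an assumption chosen because it reproduces the stated answers: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle observation left out of both halves when n is odd. Library routines such as statistics.quantiles use other interpolation rules by default and may give slightly different values. The function names are illustrative.

```python
from statistics import median

rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]


def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    xs = sorted(data)
    half = len(xs) // 2  # middle value is dropped from both halves if n is odd
    return median(xs[:half]), median(xs), median(xs[-half:])


def fences(q1, q3):
    """Lower and upper fences: 1.5 * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Example 2: population mean 13.5% and standard deviation 3.2%.
z_pulaski = z_score(6.5, 13.5, 3.2)
z_bar_harbor = z_score(14.5, 13.5, 3.2)
rate_q1, rate_q2, rate_q3 = quartiles(rates)

# Example 3: sample mean 32.1 mph and standard deviation 6.2 mph.
z_slowest = z_score(20, 32.1, 6.2)
z_fastest = z_score(40, 32.1, 6.2)
q1, q2, q3 = quartiles(speeds)
lower_fence, upper_fence = fences(q1, q3)
outliers = [x for x in speeds if x < lower_fence or x > upper_fence]

print(round(z_pulaski, 2), round(z_bar_harbor, 2), rate_q1, round(rate_q3 - rate_q1, 1))
print(round(z_slowest, 2), round(z_fastest, 2), (q1, q2, q3), (lower_fence, upper_fence), outliers)
```

A negative z-score marks an observation below the mean, so the 6.5% rate and the 20 mph speed both come out negative.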
Mini-Lecture 3.5 The Five-Number Summary and Boxplots

Objectives
1. Compute the five-number summary
2. Draw and interpret boxplots

Examples
1. Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the January 2008 survey. Based on this survey, the mean credit card interest rate in January 2008 was 13.5%, and the standard deviation of the national interest rates was 3.3%. (Source: http://www.federalreserve.gov/pubs/shop/survey.htm)

   Institution                              Rate
   Pulaski Bank and Trust Company           6.5%
   Rainier Pacific Savings Bank             12.0%
   Wells Fargo Bank NA                      14.4%
   Firstbank of Colorado                    14.4%
   Lafayette Ambassador Bank                14.3%
   Infibank                                 13.0%
   United Bank, Inc.                        13.3%
   First National Bank of The Mid-Cities    13.9%
   Bank of Louisiana                        9.9%
   Bar Harbor Bank and Trust Company        14.5%

   a. Find the five-number summary for the observed credit card rates. (Min = 6.5%; Q1 = 12.0%; Median = 13.6%; Q3 = 14.4%; Max = 14.5%)
   b. Draw a boxplot for the observed credit card rates.
   c. Determine the shape of the distribution from the boxplot. Refer to the histogram drawn in Section 2.2 to test your answer. (Left-skewed)

2. A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data in November 2005 on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. The recorded speeds of 14 randomly selected vehicles are given below:

   20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

   a. Find the five-number summary for the observed speeds. (20, 28, 32.5, 38, 40)
   b. Construct a boxplot for the observed speeds.
   c. Determine the shape of the distribution from the boxplot. Refer to the histogram drawn in Section 2.2 to test your answer. (Symmetric or left-skewed)
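A five-number summary can be computed with the short Python sketch below. It assumes the median-of-halves quartile convention, which matches the quartiles reported for these data in Mini-Lecture 3.4; other conventions can give slightly different Q1 and Q3. The helper segment_lengths is a hypothetical aid for reading shape from a boxplot: it returns the lengths of the left whisker, the two halves of the box, and the right whisker.

```python
from statistics import median

rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the middle value when n is odd.
    """
    xs = sorted(data)
    half = len(xs) // 2
    return xs[0], median(xs[:half]), median(xs), median(xs[-half:]), xs[-1]


def segment_lengths(summary):
    """Lengths of (left whisker, left box half, right box half, right whisker)."""
    lo, q1, med, q3, hi = summary
    return q1 - lo, med - q1, q3 - med, hi - q3


rate_summary = five_number_summary(rates)
speed_summary = five_number_summary(speeds)

print(rate_summary, segment_lengths(rate_summary))
print(speed_summary, segment_lengths(speed_summary))
```

A left whisker and left box half that are clearly longer than their right-hand counterparts suggest left skew, as with the interest rates. For the speeds the whiskers are unequal but the two halves of the box are close in length, so the shape is less clear-cut.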