Stat 101 Exam 1 - Embers Important Formulas and Concepts 1
- Gary Jordan
1 Chapter 1
1.1 Definitions
1. Data: Any collection of numbers, characters, images, or other items that provide information about something.
2. Categorical/Qualitative Variables: Name categories for grouping.
3. Quantitative Variables: Contain measured numerical values with measurement units.
4. Identifier Variable: Each record has a unique value, such as a Student ID or SSN.

2 Chapter 2
2.1 Definitions
1. Frequency Table: Records the totals and uses the category names to label each row.
2. Relative Frequency Table: Displays percentages of the values in each category.
3. Bar Chart: Displays the distribution of a categorical variable, showing counts for each category next to each other for easy comparison.
4. Relative Frequency Bar Chart: Same as a bar chart, but displays the percentage of people in each category rather than the counts.
5. Pie Chart: Shows a whole group of cases as a circle. The circle is sliced into pieces whose size is proportional to the fraction of the whole in each category.

Footnote 1: This version: February 7, 2015, by Jennifer Pajda-De La O. May not include everything that could be tested. To be used as an additional reference alongside studying all of Chapters 1-7. All definitions, formulas, and selected problems come from Intro Stats by De Veaux, Velleman and Bock, 4th edition, published by Pearson.
6. Contingency Table: Shows how the individuals are distributed along each variable.
7. Marginal Distribution: Row totals or column totals in a contingency table.
8. Conditional Distribution: Shows the distribution of one variable for just those cases that satisfy a condition on another variable. Example: Event B given that Event A occurs first.

3 Chapter 3
3.1 Definitions
1. Distribution: Slices up all the possible values of the variable into equal-width bins and gives the number of values (or counts) falling into each bin.
2. Histogram: Uses adjacent bars to show the distribution of a quantitative variable. Each bar shows the frequency of values falling into each bin.
3. Unimodal: Histogram with one peak.
4. Bimodal: Histogram with two peaks.
5. Uniform: Histogram that doesn't appear to have any mode. Bars are approximately the same height for each bin.
6. Symmetric: Histogram in which the two halves on either side of the center look approximately like mirror images.
7. Skew: Histogram that is not symmetric.
8. Skew Left: Histogram with a long tail on the left.
9. Skew Right: Histogram with a long tail on the right.
10. Boxplot: Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.
3.2 Formulas
1. Median = Once the data are ordered from smallest to largest, the middle value in the data. Divides the histogram into 2 equal pieces.
2. Mean = Average of all of the values = $\bar{x}$
3. Range = Max − Min
4. Q1 = Median of the lower half of the data
5. Q3 = Median of the upper half of the data
6. IQR = Q3 − Q1
7. Upper Fence for Boxplot = Q3 + 1.5 IQR
8. Lower Fence for Boxplot = Q1 − 1.5 IQR
9. Variance: $s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$
10. Standard Deviation: $s = \sqrt{s^2} = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$

4 Chapter 4
4.1 Definitions
1. z-score: Tells how many standard deviations a value is from the mean. Regardless of direction, the farther a data value is from the mean, the more unusual it is.
2. Standard Normal Model: A Normal model N(µ, σ) with mean 0 and standard deviation 1.
3. 68-95-99.7 Rule: In a Normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean.
4. Normal Percentile: The Normal percentile corresponding to a z-score gives the percentage of values in a standard Normal distribution found at that z-score or below. It is the area under the curve to the left of that z-score. See Table Z in Appendix D of the textbook.
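The 68-95-99.7 Rule and the idea of a Normal percentile can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is an illustrative choice, not something from the textbook.

```python
from statistics import NormalDist

# Standard Normal model N(0, 1): mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k: float) -> float:
    """Area under the standard Normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Areas within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")

# Normal percentile: area at or below a z-score (what Table Z reports).
percentile_at_1 = standard_normal.cdf(1.0)
print(f"area to the left of z = 1: {percentile_at_1:.4f}")
```

The three areas come out near 0.68, 0.95, and 0.997, which is where the rule's rounded percentages come from.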
4.2 Formulas
1. z-score: $z = \frac{y - \bar{y}}{s}$ for data, or $z = \frac{y - \mu}{\sigma}$ for a Normal model

4.3 Properties about the area under a Normal Curve
1. The total area is 100%.
2. The mean is the center of a Normal curve.
3. A Standard Normal Curve has mean = 0 and standard deviation = 1.
4. When a Normal curve is split in half at the mean, each side contains 50% of the area.
5. The Normal curve is symmetric.
6. If a Normal curve is split with 30% of the area on one side, the other side of the curve has 70% of the area.
7. If a Normal curve has 60% of the area in the middle, the remaining portions total 40%. This 40% is allocated half to each side. So the far left has 20% of the area, the middle has 60% of the area, and the far right has 20% of the area.
Textbook Table Z (Appendix D). Note: These tables give the percentage to the left of the z value.

5 Chapter 5
5.1 Definitions
1. Scatterplot: Shows the relationship between 2 quantitative variables.
2. Direction of Scatterplot: A positive direction means that as one variable increases, so does the other. A decreasing direction means the association is negative.
3. Form of Scatterplot: Is it a straight line or some other form?
4. Strength of Scatterplot: The association is strong if there is little scatter around the underlying relationship.
5. Outlier: A point that does not fit the overall pattern seen in the scatterplot.
5.2 Formulas
1. Z-Scores for a Scatterplot: $z_x = \frac{x - \bar{x}}{s_x}$, $z_y = \frac{y - \bar{y}}{s_y}$
2. Correlation Coefficient: $r = \frac{\sum z_x z_y}{n - 1} = \frac{\sum [(x - \bar{x})(y - \bar{y})]}{(n - 1) s_x s_y}$

5.3 Properties of Correlation Coefficient
1. Always between -1 and 1.
2. Does not matter which variable you consider as x and which as y.
3. Treats x and y symmetrically.
4. No units.
5. Not affected by changes in scale or by standardizing the variables (changing to z-scores).
6. Depends only on z-scores.
7. Does NOT prove causation. It only describes the association between 2 variables.

6 Chapter 6
6.1 Definitions
1. Linear Model: Equation of the form $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ denotes the estimated values for y.
2. Predicted Values: Value of $\hat{y}$ found for a given x-value in the data.
3. Fitted Linear Model: $\hat{y} = b_0 + b_1 x$
4. Residual: Difference between a data value and the corresponding value predicted by the model (observed − expected).
5. $R^2$: Gives the fraction of variability of y accounted for by the least squares linear regression on x. It is an overall measure of how successful the regression is in linearly relating y to x.
6. Least Squares Criterion: Specifies the unique line that minimizes the variance of the residuals, or equivalently the sum of squared residuals.

6.2 Formulas
1. Residual = Observed value − Predicted value = $y - \hat{y}$
2. $b_1 = r \frac{s_y}{s_x}$
3. $b_0 = \bar{y} - b_1 \bar{x}$

7 Extra Information
Review any and all notes and supplementary materials. Something may have been accidentally omitted from this study guide. Also review any problems that were discussed in class, since not all example problems may have been provided here.
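The Chapter 5 and Chapter 6 formulas (correlation from z-scores, slope, intercept, residual) fit together in a few lines of Python. This is a sketch only: the five (x, y) pairs are made-up illustration data, and the function name `predict` is hypothetical.

```python
from statistics import mean, stdev

# Hypothetical (x, y) pairs, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)

# Correlation: r = sum(z_x * z_y) / (n - 1)
r = sum(
    ((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)
) / (n - 1)

# Least squares slope and intercept: b1 = r * s_y / s_x, b0 = y_bar - b1 * x_bar
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar


def predict(x_value: float) -> float:
    """Predicted value y-hat = b0 + b1 * x."""
    return b0 + b1 * x_value


# Residual = observed - predicted
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

print(f"r = {r:.4f}, R^2 = {r * r:.4f}")
print(f"y-hat = {b0:.3f} + {b1:.3f} x")
```

Two quick sanity checks follow from the formulas: the fitted line passes through $(\bar{x}, \bar{y})$, and the residuals sum to zero up to floating-point rounding.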
8 Example Problems
1. Use the table below to answer the following questions.

            Eye Color
   Gender   Blue   Green   Brown   Total
   Male        5       7      15      27
   Female      6       2      10      18
   Total      11       9      25      45

(a) Construct a frequency table for Eye Color based on the data above.
(b) Find the marginal distribution of gender.
(c) What percentage of females have blue eyes?
(d) What percentage of green-eyed people are male?
(e) What percentage of people are female and have green eyes?
(f) What percentage of blue-eyed people are female?
(g) What percentage of males have brown eyes?
(h) What percentage of people have brown eyes?
(i) What percentage of people are male and have blue eyes?

Q40 pg 40 We are investigating whether people taking antidepressants (SSRIs) might be at greater risk of bone fractures. We are given the contingency table below.

                  SSRI   no SSRI   Total
   Fractures        14       244     258
   No Fractures    123      4627    4750
   Total           137      4871    5008

(a) What percent of people taking SSRIs have fractures?
(b) What percent of people not taking SSRIs have fractures?
(c) Is the risk of bone fractures the same among people who were taking SSRIs versus those who were not?

2. In a histogram that is skewed to the right, which is larger, the mean or the median?
3. In a histogram that is skewed or has outliers, which should be reported, the mean or the median?
4. In a histogram that is skewed or has outliers, which should be reported, the IQR or the standard deviation?
5. In a histogram that is symmetric with no outliers, which pair should be reported: mean with the standard deviation, mean with the IQR, median with the standard deviation, or median with the IQR?
Q23 pg 75 Here are the costs of nine compact refrigerators rated very good or excellent by Consumer Reports on their website.
150, 150, 160, 180, 150, 140, 120, 130, 120
Find
(a) Mean
(b) Median and Quartiles
(c) Range and IQR
(d) What are the values of the upper fence and the lower fence?
(e) Are there any outliers in this data? Why?
(f) Find the variance and standard deviation.

Q22 pg 103 The histogram and summary statistics (not reproduced here) describe the number of camp sites at public parks in Vermont.
(a) Which statistics would you use to identify the center and spread of this distribution? Why?
(b) How many parks would you classify as outliers? Explain.
(c) Create a boxplot for this data.
6. Given mean 16 and standard deviation 3.
(a) Standardize y = 9
(b) Standardize y = 21
(c) Which of the two above is most unusual?

7. Use the model N(8, 2).
(a) What is the mean?
(b) What is the standard deviation?
(c) What is the variance?
(d) Standardize y = [...]

8. Use a Normal model with a mean of 50 and standard deviation of 5.
(a) Using the model described above, draw the model showing what the 68-95-99.7 Rule predicts.
(b) In what interval would you expect the central 99.7% of values to be found?
(c) What percent of values are above 50?
(d) What percent of values are between 40 and 60?
(e) What percent of values are between 40 and 50?
(f) What percent of values are between 50 and 60?
(g) What percent of values are between 45 and 50?
(h) What percent of values are between 50 and 65?
(i) What percent of values are above 60?
(j) What percent of values are below 45?
(k) What percent of values are between 40 and 65?
(l) What percent of values are between 45 and 60?

9. Based on the Normal Model N(50, 5), answer the following questions.
(a) What percent of values are above 50?
(b) What percent of values are above 62?
(c) What percent of values are below 39?
(d) What percent of values are above 43?
(e) What percent of values are below 58?
(f) What percent of values are between 37 and 52?
(g) What percent of values are between 57.25 and 66?
10. Based on the Normal Model N(50, 5), answer the following questions.
(a) What cutoff value bounds the highest 5% of values?
(b) What cutoff value bounds the lowest 25% of values?
(c) What cutoff values bound the middle 70% of values?

11. Use the given data to answer the following questions.

              Energy Used (KWH)   Price ($)
   mean       9.125               [...]
   std dev    6.22                [...]
   r = 0.9687

(a) Draw a scatterplot of the above data to study the association between the energy used and the price it cost.
(b) What is the direction of the association?
(c) What is the form of the relationship?
(d) What is the strength of the relationship?
(e) Are there any outliers?
(f) What is the correlation coefficient?

Now consider the regression line for the above data.
(a) What is the slope of the regression line?
(b) What is the intercept of the regression line?
(c) What percent of variation is explained by this model?
(d) What does the slope mean in this context?
(e) What does the intercept mean in this context?
(f) Write down the overall model.
(g) What would you predict for the price when 18 KWH of energy are used? Is this prediction reasonable?
(h) The energy company actually charges you $27.50 for 18 KWH of energy used. Is this good? How much would you save or lose compared to what you expected to pay?
(i) What would you predict for the price when 35 KWH of energy are used? Is this prediction reasonable?
(j) If the energy used is 3 standard deviations above the mean energy used, what is the predicted standardized value for price?

12. For the residual plots below, decide if a linear model is appropriate. If the linear model is not appropriate, decide which condition is violated (linearity, outlier, or equal spread).
Figure 1: Residual Plots (a) Plot 1 (b) Plot 2 (c) Plot 3 (d) Plot 4 (e) Plot 6 (f) Plot 7
9 Example Solutions
1. Use the table to answer the questions.
(a) Frequency Table:

   Eye Color   Count
   Blue           11
   Green           9
   Brown          25

(b) Male: 27/45 = 0.6 = 60%. Female: 18/45 = 0.4 = 40%.
(c) 6/18 = 33.3%.
(d) 7/9 = 77.8%.
(e) 2/45 = 4.4%.
(f) 6/11 = 54.5%.
(g) 15/27 = 55.6%.
(h) 25/45 = 55.6%.
(i) 5/45 = 11.1%.

Q40 pg 40
(a) 14/137 = 10.2%
(b) 244/4871 = 5.01%
(c) No. The risk of bone fractures is about twice as high among the people who were taking SSRIs as among those who were not.

2. The mean is larger.
3. The median.
4. IQR.
5. Mean with the standard deviation.

Q23 pg 75
(a) Mean = $\bar{y}$ = 1300/9 ≈ 144.44
(b) First, order the numbers from lowest to highest: 120, 120, 130, 140, 150, 150, 150, 160, 180. Median = middle value = 150, Q3 = 150, Q1 = 130.
(c) IQR = 150 − 130 = 20. Range = Max − Min = 180 − 120 = 60.
(d) Upper Fence = Q3 + 1.5 IQR = 150 + 1.5(20) = 180. Lower Fence = Q1 − 1.5 IQR = 130 − 1.5(20) = 100.
(e) There are no outliers because Max ≤ Upper Fence and Min ≥ Lower Fence.
(f) Variance = $\frac{\sum (y - \bar{y})^2}{n - 1} = \frac{(150 - 144.44)^2 + (150 - 144.44)^2 + \cdots + (120 - 144.44)^2}{8} = \frac{3022.22}{8} \approx 377.78$, Std Dev = $\sqrt{\text{Variance}} = \sqrt{377.78} \approx 19.44$

Q22 pg 103
(a) Median and IQR, because the histogram is skewed right.
(b) IQR = 78 − 28 = 50. Upper Fence = Q3 + 1.5 IQR = 78 + 1.5(50) = 78 + 75 = 153. Lower Fence = Q1 − 1.5 IQR = 28 − 1.5(50) = 28 − 75 = −47. The number of camp sites cannot be negative, so the lower fence should be placed at 0. Since max > upper fence, there are outliers. There are probably 4-5 based on the histogram.
(c) Note that the plot (not reproduced here) is rounded, so if you use exact numbers, yours may look slightly different.
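The Q23 calculations can be checked with a short Python sketch. It assumes the quartile convention that reproduces the solution's Q1 = 130 and Q3 = 150: when the number of values is odd, the median is included in both halves. Other software may use a different quartile rule and give slightly different quartiles. The helper name `quartiles` is illustrative.

```python
from statistics import mean, median, stdev, variance

# Costs of the nine compact refrigerators from Q23.
costs = [150, 150, 160, 180, 150, 140, 120, 130, 120]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of values, the overall median is
    included in both halves.
    """
    data = sorted(values)
    half = (len(data) + 1) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    return median(lower_half), median(upper_half)


q1, q3 = quartiles(costs)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A value is an outlier only if it lies strictly beyond a fence.
outliers = [c for c in costs if c < lower_fence or c > upper_fence]

print(f"mean = {mean(costs):.2f}, median = {median(costs)}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, range = {max(costs) - min(costs)}")
print(f"fences = ({lower_fence}, {upper_fence}), outliers = {outliers}")
# variance() and stdev() use the sample formulas (divide by n - 1).
print(f"variance = {variance(costs):.2f}, std dev = {stdev(costs):.2f}")
```

The maximum, 180, sits exactly on the upper fence, so it is not flagged; that is why the comparison uses strict inequalities.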
6. Given mean 16 and standard deviation 3.
(a) $z = \frac{9 - 16}{3} = \frac{-7}{3} \approx -2.33$
(b) $z = \frac{21 - 16}{3} = \frac{5}{3} \approx 1.67$
(c) y = 9 is more unusual.

7. Use the model N(8, 2).
(a) Mean = 8.
(b) Standard deviation = 2.
(c) Variance = $2^2$ = 4.
(d) $z = \frac{y - 8}{2} = \frac{2}{2} = 1$.
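The standardizations in problem 6 follow directly from the z-score formula. A minimal Python sketch, with the function name `z_score` chosen here for illustration:

```python
def z_score(y: float, mean: float, sd: float) -> float:
    """Number of standard deviations that y lies from the mean."""
    return (y - mean) / sd


# Problem 6: mean 16, standard deviation 3.
z_for_9 = z_score(9, 16, 3)
z_for_21 = z_score(21, 16, 3)

# The value with the larger |z| is the more unusual one.
more_unusual = 9 if abs(z_for_9) > abs(z_for_21) else 21

print(f"z(9) = {z_for_9:.2f}, z(21) = {z_for_21:.2f}")
print(f"more unusual value: {more_unusual}")
```

Comparing absolute values matters here: y = 9 is below the mean, so its z-score is negative, but it is farther from the mean than y = 21.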
8. Use a Normal model with a mean of 50 and standard deviation of 5.
(a) For the 68-95-99.7 Rule, we will have the following points on the graph (not shown here).
   µ − 3σ = 50 − 3(5) = 35
   µ − 2σ = 50 − 2(5) = 40
   µ − σ = 50 − 5 = 45
   µ = 50
   µ + σ = 50 + 5 = 55
   µ + 2σ = 50 + 2(5) = 60
   µ + 3σ = 50 + 3(5) = 65
(b) Between 35 and 65.
(c) 50%.
(d) 95%.
(e) 47.5%.
(f) 47.5%.
(g) 34%.
(h) 49.85%.
(i) 2.5%.
(j) 16%.
(k) Between 40 and 50 is 47.5%. Between 50 and 65 is 49.85%. Between 40 and 65 is 97.35%.
(l) Between 45 and 50 is 34%. Between 50 and 60 is 47.5%. Between 45 and 60 is 81.5%.

9. Based on the Normal Model N(50, 5), answer the following questions. Draw pictures to help you see what is going on.
(a) 50%.
(b) Step 1: Standardize. $z = \frac{62 - 50}{5} = \frac{12}{5} = 2.4$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(2.4, 999) = 0.0082. Solution = 0.82%. From table: We want area(z > 2.4). The table gives area(z < 2.4) = 0.9918. Our answer is 1 − 0.9918 = 0.0082, which is 0.82%.
(c) Step 1: Standardize. $z = \frac{39 - 50}{5} = \frac{-11}{5} = -2.2$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(−999, −2.2) = 0.0139. Solution = 1.39%. From table: We want area(z < −2.2). The table gives us this directly and the value is 0.0139. Solution = 1.39%.
(d) Step 1: Standardize. $z = \frac{43 - 50}{5} = \frac{-7}{5} = -1.4$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(−1.4, 999) = 0.9192. Solution = 91.92%. From table: We want area(z > −1.4). The table gives area(z < −1.4) = 0.0808. Our answer is 1 − 0.0808 = 0.9192, which is 91.92%.
(e) Step 1: Standardize. $z = \frac{58 - 50}{5} = \frac{8}{5} = 1.6$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(−999, 1.6) = 0.9452. Solution = 94.52%. From table: We want area(z < 1.6). The table gives us this directly and the value is 0.9452. Solution = 94.52%.
(f) Step 1: Standardize both values. $z_1 = \frac{37 - 50}{5} = -2.6$, $z_2 = \frac{52 - 50}{5} = 0.4$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(−2.6, 0.4) = 0.6508. Solution = 65.08%. From table: We want area(−2.6 < z < 0.4). The table gives area(z < −2.6) = 0.0047 and area(z < 0.4) = 0.6554. Our answer is 0.6554 − 0.0047 = 0.6507, which is 65.07%. The difference between the calculator answer and the table answer is due to rounding. Show your work!
(g) Step 1: Standardize both values. $z_1 = \frac{57.25 - 50}{5} = 1.45$, $z_2 = \frac{66 - 50}{5} = 3.2$. Step 2: Calculate the value from a calculator or a table. From calculator: normalcdf(1.45, 3.2) = 0.0728. Solution = 7.28%. From table: We want area(1.45 < z < 3.2). The table gives area(z < 1.45) = 0.9265 and area(z < 3.2) = 0.9993. Our answer is 0.9993 − 0.9265 = 0.0728, which is 7.28%.

10. Based on the Normal Model N(50, 5), answer the following questions. Draw pictures to help you see what is going on.
(a) The highest 5% of values corresponds to a z-value of about 1.645. Find this using invnorm(0.95) on your calculator, or by looking for the value in a table. Use the z-score formula to solve for the value you are looking for: $z = 1.645 = \frac{x - \mu}{\sigma} = \frac{x - 50}{5} \Rightarrow x \approx 58.2$.
(b) The lowest 25% of values corresponds to a z-value of −0.67. Find this using invnorm(0.25) on your calculator, or by looking for the value in a table. Use the z-score formula to solve for the value you are looking for: $z = -0.67 = \frac{x - \mu}{\sigma} = \frac{x - 50}{5} \Rightarrow x = 46.65$.
(c) We need 2 values of z in this case. Let the lower value be $z_L$ and the upper value be $z_R$. Find these values by typing invnorm(0.15) and invnorm(0.85) on your calculator. We have $z_L = -1.04$ and $z_R = 1.04$ respectively. Solve for the two values of x: $z_L = -1.04 = \frac{x_L - 50}{5} \Rightarrow x_L = 44.8$, and $z_R = 1.04 = \frac{x_R - 50}{5} \Rightarrow x_R = 55.2$.

11. Use the data to answer the following.
(a) The scatterplot would be as below.
[Figure: scatterplot of Price ($) vs Energy Used (KWH)]
(b) Direction is positive.
(c) The form of the relationship is linear.
(d) The strength of the relationship is strong.
(e) There are no outliers.
(f) The correlation coefficient is 0.9687.

Now consider the regression line for the above data.
(a) The slope of the regression line is $b_1 = r \frac{s_y}{s_x}$ = [...]
(b) The intercept of the regression line is $b_0 = \bar{y} - b_1 \bar{x}$ = [...] − [...](9.125) = [...]
(c) The percent of variation explained by this model is $R^2 = (0.9687)^2 = 0.9384$ = 93.84%.
(d) The slope means that as the energy used increases by 1 KWH, the price increases by $[...]
(e) The intercept means that the base price for services, with no energy used, is $[...]
(f) The overall model is: $\hat{y}$ = [...] + [...] x.
(g) You would predict the price to be [...] + [...](18) = $30.13. This prediction is reasonable because we are not extrapolating from the data.
(h) The energy company actually charges you $27.50 for 18 KWH of energy used. This is good because we would expect to pay more than this for the energy used. You would save $2.63. Note that when you compute 27.50 − 30.13 = −2.63, you are calculating the residual.
(i) You would predict the price to be [...] + [...](35) = $[...] This prediction is not reasonable and cannot be trusted because we are extrapolating.
(j) The predicted standardized value for price is $\hat{z}_y = r z_x = 0.9687(3) = 2.9061$.

12. Classify each of the residual plots.
(a) Linear model is not appropriate. The linearity condition is violated.
(b) Linear model is appropriate.
(c) Linear model is not appropriate. The outlier condition is violated.
(d) Linear model is not appropriate. The equal spread condition is violated.
(e) Linear model is not appropriate. The equal spread condition is violated.
(f) Linear model is not appropriate. The linearity condition is violated.
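The cutoff questions in problem 10 and the between-area question in problem 9(f) can also be checked without a calculator or Table Z. The sketch below uses Python's standard-library `statistics.NormalDist`, where `inv_cdf` plays the role of invnorm and a difference of two `cdf` values plays the role of normalcdf. The variable names are illustrative.

```python
from statistics import NormalDist

# The Normal model N(50, 5) used in problems 9 and 10.
model = NormalDist(mu=50, sigma=5)

# Problem 10(a): cutoff for the highest 5% is the 95th percentile.
top_5_cutoff = model.inv_cdf(0.95)

# Problem 10(b): cutoff for the lowest 25% is the 25th percentile.
low_25_cutoff = model.inv_cdf(0.25)

# Problem 10(c): the middle 70% leaves 15% in each tail.
middle_low = model.inv_cdf(0.15)
middle_high = model.inv_cdf(0.85)

# Problem 9(f): area between 37 and 52.
area_37_to_52 = model.cdf(52) - model.cdf(37)

print(f"highest 5% starts at {top_5_cutoff:.2f}")
print(f"lowest 25% ends at {low_25_cutoff:.2f}")
print(f"middle 70% lies between {middle_low:.2f} and {middle_high:.2f}")
print(f"area between 37 and 52 = {area_37_to_52:.4f}")
```

These results use unrounded z-values, so they can differ in the second decimal place from answers worked with z rounded to two decimals, such as the 44.8 obtained from z = −1.04.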
3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent
More informationSection3-2: Measures of Center
Chapter 3 Section3-: Measures of Center Notation Suppose we are making a series of observations, n of them, to be exact. Then we write x 1, x, x 3,K, x n as the values we observe. Thus n is the total number
More informationPercentiles, STATA, Box Plots, Standardizing, and Other Transformations
Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Lecture 3 Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go
More informationKING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA. Name: ID# Section
KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 11: BUSINESS STATISTICS I Semester 04 Major Exam #1 Sunday March 7, 005 Please circle your instructor
More informationMath 243 Lecture Notes
Assume the average annual rainfall for in Portland is 36 inches per year with a standard deviation of 9 inches. Also assume that the average wind speed in Chicago is 10 mph with a standard deviation of
More informationTi 83/84. Descriptive Statistics for a List of Numbers
Ti 83/84 Descriptive Statistics for a List of Numbers Quiz scores in a (fictitious) class were 10.5, 13.5, 8, 12, 11.3, 9, 9.5, 5, 15, 2.5, 10.5, 7, 11.5, 10, and 10.5. It s hard to get much of a sense
More informationSome estimates of the height of the podium
Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40
More informationDescriptive Analysis
Descriptive Analysis HERTANTO WAHYU SUBAGIO Univariate Analysis Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Exam Name The bar graph shows the number of tickets sold each week by the garden club for their annual flower show. ) During which week was the most number of tickets sold? ) A) Week B) Week C) Week 5
More informationCenter and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers.
Chapter 3 Section3-: Measures of Center Section 3-3: Measurers of Variation Section 3-4: Measures of Relative Standing Section 3-5: Exploratory Data Analysis Describing Distributions with Numbers The overall
More informationMeasures of Central Tendency Lecture 5 22 February 2006 R. Ryznar
Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar Today s Content Wrap-up from yesterday Frequency Distributions The Mean, Median and Mode Levels of Measurement and Measures of Central
More informationStandardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis
Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem
More informationChapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data.
-3: Measure of Central Tendency Chapter : Descriptive Statistics The value at the center or middle of a data set. It is a tool for analyzing data. Part 1: Basic concepts of Measures of Center Ex. Data
More informationBusiness Statistics 41000: Probability 4
Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:
More informationSTAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model
STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good
More informationName: Algebra & 9.4 Midterm Review Sheet January 2019
Name: Algebra 1 9.3 & 9.4 Midterm Review Sheet January 2019 The Midterm format will include 35 Part I multiple choice questions that will be worth 1 point each, 10 Part II short answer questions that will
More informationIntroduction to Computational Finance and Financial Econometrics Descriptive Statistics
You can t see this text! Introduction to Computational Finance and Financial Econometrics Descriptive Statistics Eric Zivot Summer 2015 Eric Zivot (Copyright 2015) Descriptive Statistics 1 / 28 Outline
More informationSTAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model
STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good
More information22.2 Shape, Center, and Spread
Name Class Date 22.2 Shape, Center, and Spread Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution? Eplore
More informationThe Normal Probability Distribution
1 The Normal Probability Distribution Key Definitions Probability Density Function: An equation used to compute probabilities for continuous random variables where the output value is greater than zero
More informationStat3011: Solution of Midterm Exam One
1 Stat3011: Solution of Midterm Exam One Fall/2003, Tiefeng Jiang Name: Problem 1 (30 points). Choose one appropriate answer in each of the following questions. 1. (B ) The mean age of five people in a
More informationData screening, transformations: MRC05
Dale Berger Data screening, transformations: MRC05 This is a demonstration of data screening and transformations for a regression analysis. Our interest is in predicting current salary from education level
More informationChapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1
Chapter 3 Descriptive Measures Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Mean, Median and Mode Copyright 2016, 2012, 2008 Pearson Education, Inc.
More informationDescriptive Statistics
Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs
More informationHow Wealthy Are Europeans?
How Wealthy Are Europeans? Grades: 7, 8, 11, 12 (course specific) Description: Organization of data of to examine measures of spread and measures of central tendency in examination of Gross Domestic Product
More information9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives
Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical
More informationChapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)
Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You re all set to drop
More information3) Marital status of each member of a randomly selected group of adults is an example of what type of variable?
MATH112 STATISTICS; REVIEW1 CH1,2,&3 Name CH1 Vocabulary 1) A statistics student wants to find some information about all college students who ride a bike. She collected data from other students in her
More informationThe instructions on this page also work for the TI-83 Plus and the TI-83 Plus Silver Edition.
The instructions on this page also work for the TI-83 Plus and the TI-83 Plus Silver Edition. The position of the graphically represented keys can be found by moving your mouse on top of the graphic. Turn
More information1. In a statistics class with 136 students, the professor records how much money each
so shows the data collected. student has in his or her possession during the first class of the semester. The histogram 1. In a statistics class with 136 students, the professor records how much money
More informationIOP 201-Q (Industrial Psychological Research) Tutorial 5
IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,
More informationStatistics (This summary is for chapters 17, 28, 29 and section G of chapter 19)
Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x
More informationSimple Descriptive Statistics
Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency
More informationCopyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.
Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1
More informationSummary of Information from Recapitulation Report Submittals (DR-489 series, DR-493, Central Assessment, Agricultural Schedule):
County: Martin Study Type: 2014 - In-Depth The department approved your preliminary assessment roll for 2014. Roll approval statistical summary reports and graphics for 2014 are attached for additional
More informationGetting to know a data-set (how to approach data) Overview: Descriptives & Graphing
Overview: Descriptives & Graphing 1. Getting to know a data set 2. LOM & types of statistics 3. Descriptive statistics 4. Normal distribution 5. Non-normal distributions 6. Effect of skew on central tendency
More informationDiploma in Financial Management with Public Finance
Diploma in Financial Management with Public Finance Cohort: DFM/09/FT Jan Intake Examinations for 2009 Semester II MODULE: STATISTICS FOR FINANCE MODULE CODE: QUAN 1103 Duration: 2 Hours Reading time:
More informationThe Normal Distribution
Stat 6 Introduction to Business Statistics I Spring 009 Professor: Dr. Petrutza Caragea Section A Tuesdays and Thursdays 9:300:50 a.m. Chapter, Section.3 The Normal Distribution Density Curves So far we
More informationThe Standard Deviation as a Ruler and the Normal Model. Copyright 2009 Pearson Education, Inc.
The Standard Deviation as a Ruler and the Normal Mol Copyright 2009 Pearson Education, Inc. The trick in comparing very different-looking values is to use standard viations as our rulers. The standard
More informationDescriptive Statistics
Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations
More informationNCSS Statistical Software. Reference Intervals
Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and
More informationStat 328, Summer 2005
Stat 328, Summer 2005 Exam #2, 6/18/05 Name (print) UnivID I have neither given nor received any unauthorized aid in completing this exam. Signed Answer each question completely showing your work where
More informationStatistics for Business and Economics: Random Variables:Continuous
Statistics for Business and Economics: Random Variables:Continuous STT 315: Section 107 Acknowledgement: I d like to thank Dr. Ashoke Sinha for allowing me to use and edit the slides. Murray Bourne (interactive
More informationFundamentals of Statistics
CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct
More informationChapter 14. Descriptive Methods in Regression and Correlation. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 14, Slide 1
Chapter 14 Descriptive Methods in Regression and Correlation Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 14, Slide 1 Section 14.1 Linear Equations with One Independent Variable Copyright
More informationNOTES: Chapter 4 Describing Data
NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three
More informationStatistics (This summary is for chapters 18, 29 and section H of chapter 19)
Statistics (This summary is for chapters 18, 29 and section H of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x n =
More information