MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)
- Arthur Hodges
- 5 years ago
Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. A practical way to condense a data set while retaining most of its information is to summarize it with a small number of values. These values are known as synthetic indicators.

Synthetic indicators are used to evaluate:
1. Central tendency
2. Spread
3. Shape

Measures of central tendency

Central tendency refers to the idea that there is one number that best summarizes the entire set of measurements, a number that is in some way "central" to the set. The measures of central tendency (mean, median and mode) help you capture, with a single number, what is typical of a data set.
- The mean is the average value of all the data in the set.
- The median is the middle value of a data set arranged in numerical order, so that half of the data lie above the median and half below it.
- The mode is the value that occurs most frequently in the set.
In a normal distribution, the mean, median and mode are identical in value.

When not to use the mean
The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. Statistical outliers are data points that are numerically distant from the rest of the points, that is, values that are unusually small or unusually large compared with the rest of the data set.

When is the mean the best measure of central tendency?
The mean is usually the best measure of central tendency when the data distribution is continuous and symmetrical, for example when the data are normally distributed. If the histogram of a data series has the shape often referred to as a "bell curve", the series is approximately normally distributed. However, the choice always depends on what you are trying to show with your data.

When is the mode the best measure of central tendency?
The mode is the least used of the measures of central tendency, and it is the only one that can be used with nominal data. For this reason, the mode is the best measure of central tendency when dealing with nominal data. The mean and/or the median are usually preferred for all other types of data, although the mode is sometimes used with them as well.

When is the median the best measure of central tendency?
The median is usually preferred to the other measures of central tendency when the data set is skewed (i.e., forms a skewed distribution) or when you are dealing with ordinal data. The mode can also be appropriate in these situations, but it is not as commonly used as the median.

Measures of spread or dispersion

Dispersion refers to the idea that there is a second number which tells us how "spread out" the measurements are around the central number. Measures of spread include the range, quartiles, variance and standard deviation.
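The three measures of central tendency described above can be illustrated with a minimal Python sketch. The sample values are hypothetical (they are not the handout's data), and the extreme value 900 is added only to show how strongly an outlier moves the mean compared with the median.

```python
import statistics

# Hypothetical sample (not taken from the handout), e.g. values in mg/dl.
values = [170, 180, 180, 190, 200, 210, 220]

mean_value = statistics.mean(values)      # arithmetic average
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

# Add one extreme value to see how each measure reacts to an outlier.
with_outlier = values + [900]
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
print("with outlier -> mean:", mean_outlier, "median:", median_outlier)
```

In this sketch the outlier shifts the mean by almost 90 units, while the median moves by only 5, which is why the median is usually preferred for skewed data.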
Range
The range is very easy to calculate because it is simply the difference between the largest and the smallest observed values in a data set. Thus the range, including any outliers, is the actual spread of the data.
Range = difference between the highest and the lowest observed values

Quartiles
The median divides the ordered data into two equal sets. The lower quartile is the middle value of the first set: 25% of the values are smaller than it and 75% are larger. This first quartile is denoted Q1. The upper quartile is the middle value of the second set: 75% of the values are smaller than it and 25% are larger. This third quartile is denoted Q3.

Variance ($s^2$) = average squared deviation of the values from the mean
The variance of a discrete variable made up of n observations with arithmetic mean m is defined as:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - m)^2}{n}$

Standard deviation (s) = square root of the variance
The standard deviation is the "average" degree to which the values deviate from the mean. For a discrete variable made up of n observations it is the positive square root of the variance:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - m)^2}{n}}$

Measures of shape

The histogram gives a general idea of the shape of a distribution, but two numerical measures give a more precise evaluation:
- Skewness tells you the amount and direction of skew (departure from horizontal symmetry). Data can be skewed either to the right or to the left.
- Kurtosis tells you how tall and sharp the central peak is relative to a standard bell curve, that is, how high the distribution is around the mean.

Objectives:
1. Calculating the synthetic indicators using formulas
2. Interpreting the indicators of centrality, spread and shape
3. Calculating the synthetic indicators with the statistical functions in Excel
4. Calculating the synthetic indicators with the Descriptive Statistics option of Data Analysis
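The measures of spread and shape defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the handout's method: the data are hypothetical, the variance divides by n as in the definition above, skewness and kurtosis are the simple moment-based versions (Excel's SKEW and KURT apply sample-size corrections, so their results differ slightly), and `statistics.quantiles` with `method="inclusive"` is assumed here to follow the same interpolation rule as Excel's QUARTILE.

```python
import math
import statistics

# Hypothetical data set (not from the handout).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
m = sum(data) / n                                # arithmetic mean
data_range = max(data) - min(data)               # range (amplitude)
variance = sum((x - m) ** 2 for x in data) / n   # average squared deviation
std_dev = math.sqrt(variance)                    # positive square root

# Quartiles Q1, Q2 (median) and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Moment-based shape measures: third and fourth central moments.
m3 = sum((x - m) ** 3 for x in data) / n
m4 = sum((x - m) ** 4 for x in data) / n
skewness = m3 / variance ** 1.5          # > 0: right skew, < 0: left skew
excess_kurtosis = m4 / variance ** 2 - 3  # 0 for a normal distribution

print(data_range, variance, std_dev)
print(q1, q2, q3)
print(skewness, excess_kurtosis)
```

Subtracting 3 from the kurtosis makes a normal distribution score 0, which matches the convention used when interpreting Excel's output.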
Problem 1
The table below shows the concentrations of cholesterol in the blood, measured in mg/dl, for a sample of 50 patients.
- Calculate the mean, median and modal value with: a. formulas, b. Excel functions, c. the Descriptive Statistics option of Data Analysis
- Calculate the range, variance, standard deviation and coefficient of variation with: a. formulas, b. Excel functions, c. the Descriptive Statistics option of Data Analysis
- Calculate the skewness and kurtosis
- Determine the quartiles Q1, Q2 and Q3, which divide the data set into four sections with an equal number of values
- Plot the box plot diagram

Hint: To calculate the synthetic indicators, either from the formulas or with the functions implemented in Excel, first enter the observed values in one column of the spreadsheet.

a. Calculation based on formulas or definitions:
- To calculate the average, add up the 50 values with the SUM function or AutoSum, then enter the formula that divides the sum by the number of values (50). If the series of observations is given as a frequency distribution, the frequency of occurrence $n_i$ of each value $x_i$ is taken into account when calculating the average: $m = \frac{\sum x_i n_i}{\sum n_i}$
- The median value (Me) of a statistical series divides the ordered sequence of values into two parts, each containing the same number of values. To find it, proceed according to the definition:
  - write the values in one column (in Excel);
  - sort these values in ascending order;
  - if the number of values is even (50 in our case), the median is the average of the two middle values. Here the 25th and 26th values are both 200, so Me = 200;
  - if the number of values were odd, for example 51, the median would be the 26th value of the observations sorted in ascending order.
- To calculate the modal value, first obtain the frequency distribution; the modal value is the value of the variable with the highest frequency of occurrence.
- To calculate the amplitude (range), find the maximum and the minimum value. Sorting the series in ascending order makes both easy to read off.
- To calculate the variance, use the formula $s^2 = \frac{\sum (x_i - m)^2}{N}$, where N is the number of observations. First calculate m (the arithmetic mean), then use Fill Down to compute the squared deviations and AutoSum to add them up.
- The standard deviation is calculated as the square root of the variance.
- The coefficient of variation is calculated as: CV [%] = (standard deviation / arithmetic mean) × 100

b. To calculate the requirements with the functions implemented in Excel, proceed as follows:
- In one column, write the requirements of the problem as labels.
- Next to each label, select the neighbouring cell and call the function that calculates the requirement: AVERAGE, MEDIAN, MODE, VAR, STDEV, QUARTILE, SKEW, KURT, and so on.
- All these functions take as argument the spreadsheet range that contains the table of observations (the primary data table).

Indicator                  Excel formula             Result
average                    =AVERAGE(A1:A50)          198,7
median                     =MEDIAN(A1:A50)           200
mode                       =MODE(A1:A50)             180
variance                   =VAR(A1:A50)              1319,
standard deviation         =STDEV(A1:A50)            36,
coefficient of variation   =(B67/B63) %              0,
skewness                   =SKEW(A1:A50)             -1,
kurtosis                   =KURT(A1:A50)             0,
Q1                         =QUARTILE(A1:A50,1)       170
Q2                         =QUARTILE(A1:A50,2)       200
Q3                         =QUARTILE(A1:A50,3)       230

c. All indicators can be calculated at once by calling the Descriptive Statistics tool of Data Analysis (Data menu). Installing the Data Analysis option was presented in a previous lab. From the dialog that appears, select Descriptive Statistics and click OK. Then:
1. In the input section, enter the data range in Input Range. (B2:B13 here.)
2. In the output section, enter the cells where you want to save the result. [Here the results are put in the same worksheet.]
3. Select Summary Statistics.
4. Click OK.

You will see the results:
Mean                       198,7
Standard Error             5,
Median                     200
Mode                       180
Standard Deviation         36,32071
Sample Variance            1319,194
Kurtosis                   -1,16536
Skewness                   0,
Range                      120
Minimum                    140
Maximum                    260
Sum                        9935
Count                      50
Confidence Level(95,0%)    10,

- Mean - the arithmetic average. It can also be calculated with the AVERAGE function. If the variable is normally distributed, the mean lies near the middle of the interval between the minimum and the maximum (the range of the data). Also for a normal distribution, most of the data lie around the mean, in the interval from mean - standard deviation to mean + standard deviation.
- Standard Error - the standard error of the mean. It is used to calculate the 95% confidence interval for the mean (only for a normally distributed variable) and is also used in statistical inference.
- Median - the value of the series such that half of the observations are less than (or equal to) it and the other half are greater than (or equal to) it. It can also be calculated with the MEDIAN function. In a normal distribution the mean and the median are equal, so the two serve as indicators of normality: the closer their values, the more likely it is that the variable is normally distributed. "Close" is judged relative to the size of the standard error.
- Mode - the value with the highest frequency in the series. A series may have no mode, when all values appear only once; Excel then displays #N/A. The series may also be bimodal or trimodal, in which case only the first of these values, in order of appearance in the series, is displayed. To determine all modal values in such a case, build a frequency table. The mode can also be calculated with the MODE function. It is useful for ordered qualitative variables; for other variables, for example a continuous variable with a normal distribution, the mode is likely to be close to the mean.
- Standard Deviation - the sample standard deviation can be calculated with the STDEV function and the population standard deviation with the STDEVP function. The standard deviation is the root-mean-square deviation of the values from the arithmetic mean. If its value is small, the data vary only slightly around the average. If the distribution is Gaussian, the parameters measuring central tendency (mean, median, modal value and central value) have the same value. In this case (normal distribution) the data are distributed as follows:
  - the interval X ± 1s contains about 68.3% of the observations
  - the interval X ± 2s contains about 95.5% of the observations
  - the interval X ± 3s contains about 99.7% of the observations
- Sample Variance - the variance can be calculated with the VAR function (sample) or the VARP function (population).
- Skewness - indicates the degree of asymmetry. If the distribution of the data is symmetric, the skewness is close to 0 (zero). The further from 0, the more skewed the data. A negative value indicates a skew to the left, a positive value a skew to the right. How do you tell whether the skewness is large enough to cause concern? Excel does not report this directly, but the standard error of skewness can be approximated as =SQRT(6/N), here =SQRT(6/50), which is approximately 0.35. If the absolute value of the skewness is more than twice this amount (here about 0.69), the distribution of the data is non-symmetric. The skewness reported by Excel is within this limit, so the data can be assumed to be fairly symmetric (although somewhat marginally so). However, this does NOT indicate that the data are normally distributed.
- Kurtosis - a measure of the peakedness of the data. Again, for normally distributed data the kurtosis is 0 (zero). As with skewness, if the value of kurtosis is too large or too small, there is concern about the normality of the distribution. A rough formula for the standard error of kurtosis is =SQRT(24/N), here =SQRT(24/50), which is approximately 0.69; twice this amount is about 1.39. Since the value of kurtosis falls within two standard errors (-0.26), the data may be considered to meet the criteria for normality by this measure.
These measures of skewness and kurtosis are one method of examining the distribution of the data. However, they are not definitive in concluding normality. You should also examine a graph (histogram) of the data and consider performing other tests for normality, such as the Shapiro-Wilk or Kolmogorov-Smirnov test (not provided by Excel).
- Range - the amplitude, i.e. the difference between the maximum and the minimum of the data series.
- Minimum - the smallest value in the series. It can be calculated with the MIN function.
- Maximum - the largest value in the series. It can be calculated with the MAX function.
- Sum - the total of the values in the series. It can be calculated with the SUM function.
- Count - the number of observations, i.e. the sample size (n = 50 for the cholesterol sample in Problem 1). It can be calculated with the COUNT function.
- Quartiles and percentiles are similar to the median. The first quartile is a value such that 25% of the data are less than or equal to it and 75% are greater than or equal to it. The second quartile is the median. The third quartile is a value such that 75% of the data are less than or equal to it and 25% are greater than or equal to it.
- A percentile of order p is a value such that p% of the data are less than or equal to it, while the rest are larger.
- CV = STDEVP/AVERAGE - the coefficient of variation. You can use the following rules of thumb for interpretation:
  - if CV is below 10%, the population can be considered homogeneous;
  - if CV is between 10% and 20%, the population can be considered relatively homogeneous;
  - if CV is between 20% and 30%, the population can be considered relatively heterogeneous;
  - if CV is above 30%, the population can be considered heterogeneous.

For plotting the box plot diagram, we recommend using the online program at:

Problem 2
A study is carried out on a sample of 20 patients.
Data are collected on the following biomedical parameters: diastolic blood pressure (DBP) (mmHg), systolic blood pressure (SBP) (mmHg), age (days), height (cm), weight (grams), sex. The data are presented in the table in the Excel file SD. Save the file in your folder and carry out the following statistical processing on it.

Requirements
For the quantitative variables, calculate:
- Indicators of central tendency (arithmetic mean, median, mode, geometric mean, harmonic mean, central value)
- Indicators of dispersion (amplitude, mean deviation, variance, standard deviation, coefficient of variation, standard error, asymmetry [skewness], flatness [kurtosis])
- Indicators of location: the five values returned by QUARTILE for quart = 0 to 4, named in the worksheet as first quartile (minimum), second quartile, third quartile (median), fourth quartile and fifth quartile (maximum)
- Insert the table on Sheet1 of the Excel file Table SD and calculate the required indicators in this table. To calculate the statistical indicators, use Excel functions or formulas (see the instructions).
- Interpret the results statistically: based on the arithmetic mean, median, kurtosis, skewness, standard error, standard deviation, minimum and maximum of each variable, assess whether the distribution is Gaussian (normal).

Worksheet to fill in (columns: Name, SBP, DBP, Age, Height, Weight):

Indicators of central tendency
- Arithmetic mean (AVERAGE)
- Median (MEDIAN)
- Mode (MODE)
- Geometric mean (GEOMEAN)
- Harmonic mean (HARMEAN)

Indicators of dispersion
- Range (MAX - MIN)
- Standard deviation (STDEV)
- Variance (VAR)
- Coefficient of variation (with formula!)
- Standard error (see note 1)
- Intervals: m-s, m+s, m-2s, m+2s, m-3s, m+3s, where m is the arithmetic mean and s is the standard deviation

Indicators of shape and location
- Skewness (SKEW)
- Kurtosis (KURT)
- First quartile (min) (QUARTILE)
- Second quartile (QUARTILE)
- Third quartile (median) (QUARTILE)
- Fourth quartile (QUARTILE)
- Fifth quartile (max) (QUARTILE)

1. Calculate the descriptive statistics (using Data Analysis) for the first 6 variables and interpret the results.
2. Format the results: reduce the number of decimal places (two decimals).

                          SBP (mmHg)  DBP (mmHg)  Age (days)  Height (cm)  Weight (grams)
Mean                      143,68      76,05       47,42       54,          ,32
Standard Error            5,20        3,78        4,17        0,74         165,94
Median
Mode
Standard Deviation        22,66       16,46       18,17       3,22         723,30
Sample Variance           513,45      271,05      330,15      10,          ,89
Kurtosis                  -0,75       -0,93       -0,66       -0,20        0,15
Skewness                  -0,12       0,02        0,42        0,28         -0,44
Range
Minimum
Maximum
Sum
Count
Confidence Level(95,0%)   10,92       7,94        8,76        1,55         348,62

Note 1. To calculate the standard error of the mean by itself, use one of the following formulas:
= STANDARD DEVIATION / SQUARE ROOT OF THE SAMPLE SIZE
-or-
= STDEV(range of values)/SQRT(number)
where the range of values is the data used to determine the standard deviation, and number is the number of observations in the sample.

Problem 3
The ages at death of a total of 60 dogs are shown below.
11,8; 3,6; 16,6; 13,5; 4,8; 8,3; 8,9; 9,1; 7,7; 2,3; 12,1; 6,1; 10,2; 8; 11,4; 6,8; 9,6; 19,5; 15,3; 12,3; 8,5; 15,9; 18,7; 11,7; 6,2; 11,2; 10,4; 7,2; 5,5; 14,5; 16,6; 13,5; 4,8; 8,3; 8,9; 9,1; 7,7; 2,3; 12,1; 6,1; 9,6; 19,5; 15,3; 12,3; 8,5; 16,6; 13,5; 4,8; 8,3; 8,9; 9,1; 7,7; 2,3; 12,1; 6,1; 9,6; 19,5; 15,3; 12,3; 8,5
- Calculate the mean, median and modal value.
- Calculate the amplitude, variance, standard deviation and coefficient of variation.
- Determine the quartiles Q1, Q2 and Q3, which divide the data set into four sections with an equal number of values.
- Prepare the frequency distribution and draw the histogram.
- Calculate the skewness (index of asymmetry) and the kurtosis (index of vaulting).

Problem 5
The following data are the ages (in years) at which a virus infection was diagnosed in a sample of 27 randomly selected cases:
39, 50, 26, 45, 71, 51, 33, 40, 40, 51, 66, 63, 55, 36, 57, 41, 61, 47, 44, 48, 59, 42, 54, 47, 53, 54,
1. Calculate, accurate to two decimal places, the following statistics:
- mean
- median
- mode
- amplitude
- variance
- standard deviation
- coefficient of variation
2. How many points fall within each of the following ranges:
- X ± 1s
- X ± 2s
- X ± 3s
3. Using the coefficient of variation, specify the uniformity of the series of observed values.
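The kind of processing requested in Problems 2 to 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sample is hypothetical (it is not any of the handout's data sets), `statistics.variance` and `statistics.stdev` divide by n - 1 like Excel's VAR and STDEV, the coefficient of variation is computed from the sample standard deviation, and the helper names `count_within` and `describe_cv` are invented for this example.

```python
import math
import statistics

# Hypothetical sample (not one of the handout's data sets).
sample = [4, 6, 8, 10, 12]

n = len(sample)
m = statistics.mean(sample)
var_sample = statistics.variance(sample)  # divides by n - 1, like Excel VAR
s = statistics.stdev(sample)              # like Excel STDEV
std_error = s / math.sqrt(n)              # standard error of the mean
cv_percent = s / m * 100                  # coefficient of variation in %


def count_within(values, center, spread, k):
    """Count the values inside the interval center ± k * spread."""
    low, high = center - k * spread, center + k * spread
    return sum(1 for v in values if low <= v <= high)


def describe_cv(cv):
    """Apply the handout's rule of thumb for the coefficient of variation."""
    if cv < 10:
        return "homogeneous"
    if cv < 20:
        return "relatively homogeneous"
    if cv < 30:
        return "relatively heterogeneous"
    return "heterogeneous"


counts = {k: count_within(sample, m, s, k) for k in (1, 2, 3)}

print("variance:", var_sample, "std dev:", round(s, 2))
print("standard error:", round(std_error, 2), "CV %:", round(cv_percent, 2))
print("points within m ± k·s:", counts)
print("series is", describe_cv(cv_percent))
```

For a sample this small the counts will not match the 68.3%, 95.5% and 99.7% proportions, which apply to a normal distribution with many observations.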
Proposed problems (optional)
1. Consider the following frequency distribution series:
xi fa
Calculate the three parameters of central tendency. Determine the variance and the standard deviation of the data set. What is the coefficient of variation?
2. A die was thrown 45 times. The frequencies of the numbers obtained are shown in the table below:
Score Frequency
What is the average score obtained?
3. The temperature was measured at ground level over 24 hours. The result is the following graph:
What is the amplitude?
a 10 °F
b 30 °F
c 40 °F
d 50 °F
4. What is the amplitude of the following series of values?
57, -5, 11, 39, 56, 82, -2, 11, 64, 18, 37, 15, 68
a 11
b 68
c 77
d
5. The table below shows the monthly average temperature in 2013:
Month Jan. Feb. Mar. Apr. May Jun. Jul. Aug. Sep. Oct. Nov. Dec.
Average temperature (°C)
1. Calculate the average annual temperature
2. Compute the median
3. Compute the dominant value (mode)
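When the data are given as a frequency table, as in the first two proposed problems, the indicators are weighted by the frequencies. The following Python sketch shows one way to do this. The frequencies are hypothetical (the handout's tables are not reproduced here); they were chosen only so that they add up to 45 throws.

```python
import statistics

# Hypothetical frequency table for a die thrown 45 times
# (these frequencies are NOT the handout's values).
scores = [1, 2, 3, 4, 5, 6]
freqs = [6, 8, 7, 9, 5, 10]

total = sum(freqs)  # number of throws

# Weighted mean: m = sum(x_i * n_i) / sum(n_i)
mean_score = sum(x * f for x, f in zip(scores, freqs)) / total

# Variance with the frequencies as weights (divides by the total count).
variance = sum(f * (x - mean_score) ** 2 for x, f in zip(scores, freqs)) / total

# Mode: the score with the highest frequency.
mode_score = scores[freqs.index(max(freqs))]

# Median: expand the table back into individual observations.
expanded = [x for x, f in zip(scores, freqs) for _ in range(f)]
median_score = statistics.median(expanded)

print(total, round(mean_score, 2), round(variance, 2), mode_score, median_score)
```

Expanding the table is the simplest way to get the median for small data sets; for large tables, cumulative frequencies give the same result without building the full list.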
More informationEngineering Mathematics III. Moments
Moments Mean and median Mean value (centre of gravity) f(x) x f (x) x dx Median value (50th percentile) F(x med ) 1 2 P(x x med ) P(x x med ) 1 0 F(x) x med 1/2 x x Variance and standard deviation
More informationDot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.
Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,
More informationData that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.
Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer
More informationUnit 2 Statistics of One Variable
Unit 2 Statistics of One Variable Day 6 Summarizing Quantitative Data Summarizing Quantitative Data We have discussed how to display quantitative data in a histogram It is useful to be able to describe
More informationSummary of Statistical Analysis Tools EDAD 5630
Summary of Statistical Analysis Tools EDAD 5630 Test Name Program Used Purpose Steps Main Uses/Applications in Schools Principal Component Analysis SPSS Measure Underlying Constructs Reliability SPSS Measure
More informationStatistics I Chapter 2: Analysis of univariate data
Statistics I Chapter 2: Analysis of univariate data Numerical summary Central tendency Location Spread Form mean quartiles range coeff. asymmetry median percentiles interquartile range coeff. kurtosis
More informationA LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]
1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders
More informationMeasures of Variation. Section 2-5. Dotplots of Waiting Times. Waiting Times of Bank Customers at Different Banks in minutes. Bank of Providence
Measures of Variation Section -5 1 Waiting Times of Bank Customers at Different Banks in minutes Jefferson Valley Bank 6.5 6.6 6.7 6.8 7.1 7.3 7.4 Bank of Providence 4. 5.4 5.8 6. 6.7 8.5 9.3 10.0 Mean
More information3.3-Measures of Variation
3.3-Measures of Variation Variation: Variation is a measure of the spread or dispersion of a set of data from its center. Common methods of measuring variation include: 1. Range. Standard Deviation 3.
More informationSTAT 113 Variability
STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2
More informationFrequency Distribution and Summary Statistics
Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary
More informationStat 101 Exam 1 - Embers Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.
More information2 DESCRIPTIVE STATISTICS
Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled
More informationPrepared By. Handaru Jati, Ph.D. Universitas Negeri Yogyakarta.
Prepared By Handaru Jati, Ph.D Universitas Negeri Yogyakarta handaru@uny.ac.id Chapter 7 Statistical Analysis with Excel Chapter Overview 7.1 Introduction 7.2 Understanding Data 7.2.1 Descriptive Statistics
More informationMoments and Measures of Skewness and Kurtosis
Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values
More informationMeasures of Central tendency
Elementary Statistics Measures of Central tendency By Prof. Mirza Manzoor Ahmad In statistics, a central tendency (or, more commonly, a measure of central tendency) is a central or typical value for a
More informationDescriptive Statistics (Devore Chapter One)
Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf
More information3.5 Applying the Normal Distribution (Z-Scores)
3.5 Applying the Normal Distribution (Z-Scores) The Graph: Review of the Normal Distribution Properties: - it is symmetrical; the mean, median and mode are equal and fall at the line of symmetry - it is
More informationSTAT:2010 Statistical Methods and Computing. Using density curves to describe the distribution of values of a quantitative
STAT:10 Statistical Methods and Computing Normal Distributions Lecture 4 Feb. 6, 17 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowa.edu 1 2 Using density curves to describe the distribution of values of
More informationMisleading Graphs. Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret
Misleading Graphs Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret 1 Pretty Bleak Picture Reported AIDS cases 2 But Wait..! 3 Turk Incorporated
More information4. DESCRIPTIVE STATISTICS
4. DESCRIPTIVE STATISTICS Descriptive Statistics is a body of techniques for summarizing and presenting the essential information in a data set. Eg: Here are daily high temperatures for Jan 16, 2009 in
More informationReview: Types of Summary Statistics
Review: Types of Summary Statistics We re often interested in describing the following characteristics of the distribution of a data series: Central tendency - where is the middle of the distribution?
More informationMath146 - Chapter 3 Handouts. The Greek Alphabet. Source: Page 1 of 39
Source: www.mathwords.com The Greek Alphabet Page 1 of 39 Some Miscellaneous Tips on Calculations Examples: Round to the nearest thousandth 0.92431 0.75693 CAUTION! Do not truncate numbers! Example: 1
More informationChapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)
Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You re all set to drop
More informationNumerical summary of data
Numerical summary of data Introduction to Statistics Measures of location: mode, median, mean, Measures of spread: range, interquartile range, standard deviation, Measures of form: skewness, kurtosis,
More informationHandout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25
Handout 4 numerical descriptive measures part Calculating Mean for Grouped Data mf Mean for population data: µ mf Mean for sample data: x n where m is the midpoint and f is the frequency of a class. Example
More informationSTATISTICAL DISTRIBUTIONS AND THE CALCULATOR
STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either
More informationExploring Data and Graphics
Exploring Data and Graphics Rick White Department of Statistics, UBC Graduate Pathways to Success Graduate & Postdoctoral Studies November 13, 2013 Outline Summarizing Data Types of Data Visualizing Data
More informationBasic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract
Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise