Frequency Distribution and Summary Statistics
1 Frequency Distribution and Summary Statistics Dongmei Li, Department of Public Health Sciences, Office of Public Health Studies, University of Hawaiʻi at Mānoa
2 Outline 1. Stemplot 2. Frequency table 3. Summary statistics 2
3 1. Stem-and-leaf plots (stemplots) Always start by looking at the data with graphs and plots. Our favorite technique for looking at a single variable is the stemplot. A stemplot is a graphical technique that organizes data into a histogram-like display. "You can observe a lot by looking." (Yogi Berra) 3
4 Stemplot Illustrative Example Select an SRS of 10 ages List data as an ordered array Divide each data point into a stem-value and leaf-value In this example the tens place will be the stem-value and the ones place will be the leaf value, e.g., 21 has a stem value of 2 and leaf value of 1 4
5 Stemplot illustration (cont.) Draw an axis for the stem-values: axis multiplier (important!) Place leaves next to their stem value 21 plotted (animation) 5
6 Stemplot illustration continued Plot all data points and rearrange in rank order: Here is the plot horizontally: (for demonstration purposes) Rotated stemplot 6
7 Interpreting Stemplots Shape: symmetry, modality (number of peaks), kurtosis (width of tails), departures (outliers). Location: gravitational center (mean), middle value (median). Spread: range and interquartile range, standard deviation and variance. 7
8 Shape Shape refers to the pattern when plotted. Here's the silhouette of our data: [Figure: silhouette of the stemplot drawn with X marks] Consider: symmetry, modality, kurtosis 8
9 Shape: Idealized Density Curve A large dataset is introduced. A density curve is superimposed to better discuss shape. 9
10 Symmetrical Shapes 10
11 Asymmetrical shapes 11
12 Modality (no. of peaks) 12
13 Kurtosis (width of tails) Mesokurtic (medium tails); platykurtic (flat peak, thin tails); leptokurtic (steep peak, heavy tails). Kurtosis is not easily judged by eye. 13
14 Stemplot Second Example Data: 1.47, 2.06, 2.36, 3.43, 3.74, 3.78, 3.94, 4.42 Stem = ones-place Leaves = tenths-place Round to keep one digit after decimal point (e.g., ) Do not plot decimal ( 1) Shape: asymmetric, skewed to the left, unimodal, no outliers 14
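The construction described in this example can be sketched in a few lines of Python. This is only an illustration, not part of the original slides; the function name `stemplot` is made up for the sketch. It rounds each value to one decimal, uses the ones place as the stem and the tenths place as the leaf, and omits the decimal point.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows with stem = ones place and leaf = tenths place."""
    stems = defaultdict(list)
    for v in values:
        tenths = int(round(v * 10))       # round to one decimal, counted in tenths
        stem, leaf = divmod(tenths, 10)   # e.g. 15 tenths -> stem 1, leaf 5
        stems[stem].append(leaf)
    rows = []
    # Walk every stem from smallest to largest so empty stems stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in sorted(stems.get(stem, [])))
        rows.append(f"{stem}|{leaves}")
    return rows

# Data from the second stemplot example.
data = [1.47, 2.06, 2.36, 3.43, 3.74, 3.78, 3.94, 4.42]
print("\n".join(stemplot(data)))
```

The leaves pile up on stem 3 and trail off toward stem 1, which matches the slide's description of a unimodal shape skewed to the left.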
15 Draw a stemplot using JMP Analyze > Distribution > Data > Stem and Leaf. Open the JMP data set named Stem_and_leaf_plot.jmp 15
16 Third Illustrative Example (n = 26) Age data set from 26 subjects {14, 17, 18, 19, 22, 22, 23, 24, 24, 26, 26, 27, 28, 29, 30, 30, 30, 31, 32, 33, 34, 34, 35, 36, 37, 38} Data set: Stem_and_leaf_plot_example2.jmp Distribution of the age variable? 16
17 2. Frequency Table Frequency = count. Relative frequency = proportion or %. Cumulative frequency = % less than or equal to the level. [Table with columns AGE, Freq, Rel. Freq, Cum. Freq; only the cumulative percentages survive, in row order: 0.3%, 1.7%, 6.0%, 11.6%, 19.9%, 32.9%, 47.2%, 59.6%, 73.4%, 82.1%, 88.7%, 92.5%, 95.4%, 97.4%, 98.6%, 99.5%, 100.0%] 17
18 Frequency Table with Class Intervals When data are sparse, group data into class intervals. Create 4 to 12 class intervals. Classes can be uniform or non-uniform. End-point convention: e.g., a first class interval of 0 to 10 will include 0 but exclude 10 (0 to 9.99). Tally frequencies. Calculate relative frequency. Calculate cumulative frequency. 18
19 Class Intervals Uniform class intervals table (width 10) for data: Class Freq Relative Freq. (%) Cumulative Freq (%) Total
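As a rough sketch of how such a table can be built, the Python below tallies uniform class intervals of width 10 with the end-point convention described earlier (include the lower bound, exclude the upper). The table on this slide did not survive transcription, so the sketch uses the 26 ages listed in the third stemplot example instead; the function name `class_interval_table` is hypothetical.

```python
def class_interval_table(values, width=10):
    """Tally values into uniform classes [lower, lower + width)."""
    start = (min(values) // width) * width
    stop = (max(values) // width + 1) * width
    n = len(values)
    rows, cumulative = [], 0
    for lower in range(start, stop, width):
        # End-point convention: include the lower bound, exclude the upper bound.
        freq = sum(lower <= v < lower + width for v in values)
        cumulative += freq
        rows.append((lower, lower + width, freq, 100 * freq / n, 100 * cumulative / n))
    return rows

# Ages of the 26 subjects from the third stemplot example.
ages = [14, 17, 18, 19, 22, 22, 23, 24, 24, 26, 26, 27, 28, 29,
        30, 30, 30, 31, 32, 33, 34, 34, 35, 36, 37, 38]

rows = class_interval_table(ages)
for lower, upper, freq, rel, cum in rows:
    print(f"{lower} to <{upper}: freq={freq}, rel={rel:.1f}%, cum={cum:.1f}%")
print("Total:", sum(row[2] for row in rows))
```

The last cumulative percentage should always reach 100%, which is a quick check that no value fell outside the classes.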
20 Histogram A histogram is a frequency chart for a quantitative measurement. Notice how the bars touch. [Figure: histogram of frequency by age class] 20
21 Bar Chart A bar chart with non-touching bars is reserved for categorical measurements and non-uniform class intervals. [Figure: bar chart by school level: Pre-, Elem., Middle, High] 21
22 3. Summary Statistics Central location: mean, median, mode. Spread: range and interquartile range (IQR), variance and standard deviation. 22
23 Location: Mean Eye-ball method: visualize where the plot would balance (its gravitational center). Arithmetic method: sum the values and divide by n. For the illustrative data, the eye-ball method gives around 25 to 30 (takes practice); the arithmetic method gives mean = 290 / 10 = 29 23
24 Notation n = sample size. X = the variable (e.g., ages of subjects). $x_i$ = the value of individual i for variable X. $\sum$ = sum all values (capital sigma). Illustrative data (ages of participants): n = 10, X = AGE variable, $x_1 = 21$, $x_2 = 42$, ..., $x_{10} = 52$, and $\sum x_i = x_1 + x_2 + \cdots + x_{10} = 21 + 42 + \cdots + 52 = 290$
25 Central Location: Sample Mean Arithmetic average. Traditional measure of central location. Sum the values and divide by n. $\bar{x}$ ("xbar") refers to the sample mean: $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$ 25
26 Example: Sample Mean Ten individuals selected at random have the following ages: Note that n = 10, $\sum x_i = 290$, and $\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{10}(290) = 29.0$. The sample mean is the gravitational center of a distribution. Mean = 29 26
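The arithmetic can be checked with a short Python sketch. The ten ages themselves were lost from this transcription, so the list below is an assumption: it was chosen to agree with the values the slides do state ($x_1 = 21$, $x_2 = 42$, $x_{10} = 52$, and a sum of 290) and should not be read as the confirmed data set.

```python
# Assumed illustrative ages (not preserved in the transcription); they are
# consistent with x_1 = 21, x_2 = 42, x_10 = 52 and a sum of 290.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]

n = len(ages)       # sample size
total = sum(ages)   # sum of all values (capital sigma)
mean = total / n    # xbar = (1/n) * sum of x_i

print(f"n = {n}, sum = {total}, mean = {mean}")
```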
27 Uses of the Sample Mean The sample mean can be used to predict: The value of an observation drawn at random from the sample The value of an observation drawn at random from the population The population mean 27
28 Population Mean $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ Same operation as the sample mean except based on the entire population (N = population size). Conceptually important. Usually not available in practice. Sometimes referred to as the expected value. 28
29 Central Location: Median Work from the ordered array. When n is even, the median is the average of the (n/2)th and the (n/2 + 1)th values. When n is odd, the median is the ((n + 1)/2)th value. For the illustrative data, n = 10, so the median falls between 27 and 28; average the adjacent values: M = (27 + 28)/2 = 27.5
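The even/odd rule above translates directly into code. The sketch below is a minimal illustration; the function name `median` is chosen for the sketch, and the ten ages are an assumed data set that agrees with the summaries quoted on the slides (n = 10, middle values 27 and 28).

```python
def median(values):
    """Median by the rule on the slide: sort, then pick or average the middle."""
    ordered = sorted(values)            # values must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        # n even: average the (n/2)th and (n/2 + 1)th values (1-based positions)
        return (ordered[mid - 1] + ordered[mid]) / 2
    # n odd: the ((n + 1)/2)th value
    return ordered[mid]

# Assumed illustrative ages, consistent with the slide's n = 10 and M = 27.5.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
print(median(ages))
```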
30 More Examples of Medians Example A: Median = 4 Example B: Median = 5 (average of 4 and 6) Example C: Median 2 (Values must be ordered first) 30
31 The Median is Robust The median is more resistant to skews and outliers than the mean; it is more robust. This data set has a mean of 1636: Here's the same data set with a data entry error outlier (highlighted). This data set has a mean of 2743: The median is 1614 in both instances, demonstrating its robustness in the face of outliers. 31
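The same effect is easy to reproduce. Because the metabolic-rate values were not preserved here, the sketch below uses a different, assumed data set (ten ages with a mean of 29) and a hypothetical data entry error in which 52 is typed as 520. It relies only on Python's standard `statistics` module.

```python
from statistics import mean, median

# Assumed illustrative ages; not the data set shown on this slide.
ages = [5, 11, 21, 24, 27, 28, 30, 42, 50, 52]

# Hypothetical data entry error: the last value 52 is mistyped as 520.
ages_with_error = ages[:-1] + [520]

print("mean:  ", mean(ages), "->", mean(ages_with_error))
print("median:", median(ages), "->", median(ages_with_error))
```

The single bad value drags the mean from 29 to 75.8, while the median stays at 27.5.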
32 Mode The mode is the most commonly encountered value in the dataset This data set has a mode of 7 {4, 7, 7, 7, 8, 8, 9} This data set has no mode {4, 6, 7, 8} (each point appears only once) The mode is useful only in large data sets with repeating values 32
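A small Python sketch of this definition follows, using the two data sets given on the slide. The helper name `modes` is made up; it returns a list so that ties are reported, and an empty list when every value appears only once.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # no repeated values, so no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([4, 7, 7, 7, 8, 8, 9]))   # the slide's first data set
print(modes([4, 6, 7, 8]))            # the slide's second data set
```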
33 Comparison of Mean, Median, Mode Note how the mean gets pulled toward the longer tail more than the median. Mean = median: symmetrical distribution. Mean > median: positive skew. Mean < median: negative skew. 33
34 Spread: Quartiles Two distributions can be quite different yet have the same mean. These data compare particulate matter in air samples (μg/m³) at two sites. Both sites have a mean of 36, but Site 1 exhibits much greater variability. We would miss the high pollution days if we relied solely on the mean. [Figure: stemplots for Site 1 and Site 2]
35 Spread: Range Range = maximum − minimum. Illustrative example: Site 1 range = 46; Site 2 range = 8. Beware: the sample range will tend to underestimate the population range. Always supplement the range with at least one additional measure of spread. [Figure: stemplots for Site 1 and Site 2]
36 Spread: Quartiles Quartile 1 (Q1): cuts off the bottom quarter of the data = median of the lower half of the data set. Quartile 3 (Q3): cuts off the top quarter of the data = median of the upper half of the data set. Interquartile range (IQR) = Q3 − Q1; it covers the middle 50% of the distribution. For the illustrative data: Q1 = 21, Q3 = 42, and IQR = 42 − 21 = 21 36
37 Quartiles (Tukey's Hinges) Example 2 Data are metabolic rates (cal/day), n = 7. When n is odd, include the median in both halves of the data set. Bottom half: has a median of 1449.5 (Q1). Top half: has a median of 1729 (Q3). 37
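The hinge rule can be sketched as follows. This is an illustration under stated assumptions rather than the slides' own code: the function names are hypothetical, and since the metabolic-rate values are not preserved here, the usage lines borrow the 26 ages from the third stemplot example (n even) and, purely for illustration, their first seven values (n odd).

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return (ordered[mid - 1] + ordered[mid]) / 2 if n % 2 == 0 else ordered[mid]

def tukey_hinges(values):
    """Q1 and Q3 as medians of the lower and upper halves of the data.

    When n is odd, the overall median is included in both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2                  # half size; counts the median when n is odd
    lower, upper = ordered[:half], ordered[n - half:]
    return median(lower), median(upper)

ages = [14, 17, 18, 19, 22, 22, 23, 24, 24, 26, 26, 27, 28, 29,
        30, 30, 30, 31, 32, 33, 34, 34, 35, 36, 37, 38]

q1, q3 = tukey_hinges(ages)
print(f"n = 26: Q1 = {q1}, Q3 = {q3}, IQR = {q3 - q1}")
print("n = 7: ", tukey_hinges(ages[:7]))
```

Other quartile conventions exist (statistical packages often interpolate), so results from software may differ slightly from these hinges.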
38 Five-Point Summary Q0 (the minimum), Q1 (25th percentile), Q2 (median), Q3 (75th percentile), Q4 (the maximum) 38
39 Boxplots 1. Calculate the 5-point summary. Draw a box from Q1 to Q3 with a line at the median. 2. Calculate the IQR and the fences as follows: $F_L = Q1 - 1.5(IQR)$ and $F_U = Q3 + 1.5(IQR)$. Do not draw the fences. 3. Determine if any values lie outside the fences (outside values). If so, plot these separately. 4. Determine the values inside the fences (inside values). Draw a whisker from Q3 to the upper inside value. Draw a whisker from Q1 to the lower inside value. 39
40 Illustrative Example: Boxplot Data: 1. 5-point summary: {5, 21, 27.5, 42, 52}; box from 21 to 42. 2. IQR = 42 − 21 = 21; $F_U = Q3 + 1.5(IQR) = 42 + (1.5)(21) = 73.5$; $F_L = Q1 - 1.5(IQR) = 21 - (1.5)(21) = -10.5$. 3. No values above the upper fence; no values below the lower fence. 4. Upper inside value = 52; lower inside value = 5; draw whiskers. [Figure: boxplot with upper inside value 52, Q3 = 42, Q2 = 27.5, Q1 = 21, lower inside value 5] 40
41 Illustrative Example: Boxplot 2 Data: 1. 5-point summary: 3, 22, 25.5, 29, 51; draw box. 2. IQR = 29 − 22 = 7; $F_U = Q3 + 1.5(IQR) = 29 + (1.5)(7) = 39.5$; $F_L = Q1 - 1.5(IQR) = 22 - (1.5)(7) = 11.5$. 3. One value above the top fence (51); one value below the bottom fence (3). 4. Upper inside value is 31; lower inside value is 21; draw whiskers. [Figure: boxplot with outside value 51, inside value 31, upper hinge 29, median 25.5, lower hinge 22, inside value 21, outside value 3] 41
42 Illustrative Example: Boxplot 3 Seven metabolic rates: 1. 5-point summary: 1362, 1449.5, 1614, 1729, 1867. 2. IQR = 1729 − 1449.5 = 279.5; $F_U = Q3 + 1.5(IQR) = 1729 + (1.5)(279.5) = 2148.25$; $F_L = Q1 - 1.5(IQR) = 1449.5 - (1.5)(279.5) = 1030.25$. 3. No outside values. 4. Whiskers end at 1867 and 1362. N = 7. Data source: Moore, 42
43 Boxplots: Interpretation Location: position of the median, position of the box. Spread: hinge-spread (IQR), whisker-to-whisker spread, range. Shape: symmetry or direction of skew; long whiskers (tails) indicate leptokurtosis (long tails?). 43
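The fence and whisker steps can be sketched in Python as below. The function names are hypothetical. The first call reuses Q1 = 21 and Q3 = 42 from this example; the second uses only the seven values named in the second boxplot example (its full data set is not preserved here), so it is a stand-in rather than the real data.

```python
def fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def whisker_ends(values, q1, q3):
    """Return (lower inside value, upper inside value, outside values)."""
    lower_fence, upper_fence = fences(q1, q3)
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outside = [v for v in values if v < lower_fence or v > upper_fence]
    # Whiskers run to the most extreme values that are still inside the fences.
    return min(inside), max(inside), outside

print(fences(21, 42))

# Only the values named in the second boxplot example, not its full data set.
named_values = [3, 21, 22, 25.5, 29, 31, 51]
print(fences(22, 29))
print(whisker_ends(named_values, 22, 29))
```

Outside values are plotted as separate points; the fences themselves are never drawn.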
44 Side-by-side boxplots Boxplots are especially useful when comparing groups 44
45 Spread: Standard Deviation The most common descriptive measure of spread. Based on deviations around the mean. This figure demonstrates the deviations of two of its values. This data set has a mean of 36. The data point 33 has a deviation of 33 − 36 = −3. The data point 40 has a deviation of 40 − 36 = 4. 45
46 Variance and Standard Deviation Deviation: $x_i - \bar{x}$. Sum of squared deviations: $SS = \sum (x_i - \bar{x})^2$. Sample variance: $s^2 = \frac{SS}{n-1}$. Sample standard deviation: $s = \sqrt{s^2}$ 46
47 Standard deviation (formula) $s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$, where $\sum (x_i - \bar{x})^2$ is the sum of squares. The sample standard deviation s is the estimator of the population standard deviation. 47
48 Illustrative Example: Standard Deviation [Table with columns: observation $x_i$, deviation $x_i - \bar{x}$, squared deviation $(x_i - \bar{x})^2$; most rows are not preserved. Surviving entries include deviations of −2, −3, and −4 with squares $(-2)^2$, $(-3)^2$, and $(-4)^2 = 16$. Sums: deviations 0*, SS = 58] * Sum of deviations always equals zero 48
49 Illustrative Example (cont.) Sample variance: $s^2 = \frac{SS}{n-1}$, in (μg/m³)². Standard deviation: $s = \sqrt{s^2}$, in μg/m³ 49
50 Interpretation of Standard Deviation Measure of spread (e.g., if group 1 has $s_1 = 15$ and group 2 has $s_2 = 10$, group 1 has more spread, i.e., variability). 68-95-99.7 rule (next slide). Chebychev's rule (two slides hence). 50
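The calculation can be traced step by step in Python. The observations were lost from this transcription, so the eight values below are an assumption: they were picked to agree with what the slides do state (a mean of 36, deviations of −3 for 33 and 4 for 40, and SS = 58). The sample size n = 8 is part of that assumption.

```python
import math

# Assumed particulate readings in μg/m³ (not preserved in the transcription);
# consistent with mean = 36 and SS = 58 as stated on the slides.
site2 = [36, 38, 39, 40, 36, 34, 33, 32]

n = len(site2)
mean = sum(site2) / n
deviations = [x - mean for x in site2]   # x_i - xbar
ss = sum(d ** 2 for d in deviations)     # sum of squared deviations
variance = ss / (n - 1)                  # s^2 = SS / (n - 1), in (μg/m³)²
sd = math.sqrt(variance)                 # s = sqrt(s^2), in μg/m³

print(f"mean = {mean}, sum of deviations = {sum(deviations)}, SS = {ss}")
print(f"s^2 = {variance:.3f}, s = {sd:.3f}")
```

Under these assumed values the variance works out to 58/7 and the standard deviation to its square root; with a different n the numbers would change.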
51 68-95-99.7 Rule: Normal Distributions Only! 68% of data in the range μ ± σ; 95% of data in the range μ ± 2σ; 99.7% of data in the range μ ± 3σ. Example: Suppose a variable has a Normal distribution with μ = 30 and σ = 10. Then: 68% of values are between 30 ± 10 = 20 to 40; 95% are between 30 ± (2)(10) = 30 ± 20 = 10 to 50; 99.7% are between 30 ± (3)(10) = 30 ± 30 = 0 to 60 51
52 Chebychev's Rule: All Distributions Chebychev's rule says that at least 75% of the values will fall in the range μ ± 2σ (for any shaped distribution). Example: A distribution with μ = 30 and σ = 10 has at least 75% of the values in the range 30 ± (2)(10) = 10 to 50 52
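The percentages in this rule can be checked numerically with `NormalDist` from Python's standard `statistics` module (available since Python 3.8). The helper name `normal_coverage` is made up for this sketch; the parameters μ = 30 and σ = 10 come from the example above.

```python
from statistics import NormalDist

def normal_coverage(mu, sigma, k):
    """Interval mu ± k*sigma and the share of a Normal distribution inside it."""
    dist = NormalDist(mu=mu, sigma=sigma)
    low, high = mu - k * sigma, mu + k * sigma
    return low, high, dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low, high, share = normal_coverage(30, 10, k)
    print(f"mu ± {k} sigma: {low} to {high}, {100 * share:.1f}% of values")
```

The computed shares are about 68.3%, 95.4%, and 99.7%, which the rule rounds to 68, 95, and 99.7.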
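The slide states the k = 2 case. The general form of the bound, which is standard but not given on the slide, is that at least $1 - 1/k^2$ of the values lie within k standard deviations of the mean for any k > 1. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

mu, sigma, k = 30, 10, 2
low, high = mu - k * sigma, mu + k * sigma
print(f"at least {100 * chebyshev_lower_bound(k):.0f}% of values lie in {low} to {high}")
```

For Normal data this bound is conservative: the 68-95-99.7 rule gives 95% within two standard deviations, while Chebychev's rule only guarantees 75%.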
53 Rules for Rounding Carry at least four significant digits during calculations. Round at last step of operation Always report units Always use common sense and good judgment. 53
54 Choosing Summary Statistics Always report a measure of central location, a measure of spread, and the sample size. Symmetrical mound-shaped distributions: report the mean and standard deviation. Odd-shaped distributions: report 5-point summaries (or median and IQR). 54
55 Software and Calculators Use software and calculators to check work. 55
56 Excel Data Analysis ToolPak Data set: Boxplot.xlsx Summary statistics using Excel 56
57 Excel Data Analysis ToolPak Get summary statistics using Excel 57
58 Boxplot using Excel YouTube links to get a boxplot in Excel: 4PVarwE&feature=related [Figure: boxplots of female age and male age] 58
59 Boxplot and summary statistics by JMP Data set: Boxplot.jmp What do you say about the comparison of distributions of ages for females and males? 59
61 Exercise Surgical times. Durations of surgeries (hours) for 15 patients receiving artificial hearts are shown here. Create a stem plot of these data. Describe the distribution. Are there any outliers? What is the standard deviation of this data set? Draw a boxplot based on this data set Data set: Presentation2_Exercise.xlsx Presentation2_exercise.jmp 61
More information1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:
1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11
More informationVariance, Standard Deviation Counting Techniques
Variance, Standard Deviation Counting Techniques Section 1.3 & 2.1 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston 1 / 52 Outline 1 Quartiles 2 The 1.5IQR Rule 3 Understanding
More informationKING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA. Name: ID# Section
KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 11: BUSINESS STATISTICS I Semester 04 Major Exam #1 Sunday March 7, 005 Please circle your instructor
More information2CORE. Summarising numerical data: the median, range, IQR and box plots
C H A P T E R 2CORE Summarising numerical data: the median, range, IQR and box plots How can we describe a distribution with just one or two statistics? What is the median, how is it calculated and what
More informationEdexcel past paper questions
Edexcel past paper questions Statistics 1 Chapters 2-4 (Discrete) Statistics 1 Chapters 2-4 (Discrete) Page 1 Stem and leaf diagram Stem-and-leaf diagrams are used to represent data in its original form.
More informationGetting to know a data-set (how to approach data) Overview: Descriptives & Graphing
Overview: Descriptives & Graphing 1. Getting to know a data set 2. LOM & types of statistics 3. Descriptive statistics 4. Normal distribution 5. Non-normal distributions 6. Effect of skew on central tendency
More informationSteps with data (how to approach data)
Descriptives & Graphing Lecture 3 Survey Research & Design in Psychology James Neill, 216 Creative Commons Attribution 4. Overview: Descriptives & Graphing 1. Steps with data 2. Level of measurement &
More informationMEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION
MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION 1 Day 3 Summer 2017.07.31 DISTRIBUTION Symmetry Modality 单峰, 双峰 Skewness 正偏或负偏 Kurtosis 2 3 CHAPTER 4 Measures of Central Tendency 集中趋势
More information2 DESCRIPTIVE STATISTICS
Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled
More informationAverages and Variability. Aplia (week 3 Measures of Central Tendency) Measures of central tendency (averages)
Chapter 4 Averages and Variability Aplia (week 3 Measures of Central Tendency) Chapter 5 (omit 5.2, 5.6, 5.8, 5.9) Aplia (week 4 Measures of Variability) Measures of central tendency (averages) Measures
More informationMeasures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms
Measures of Central Tendency: Ungrouped Data Measures of central tendency yield information about particular places or locations in a group of numbers. Common Measures of Location Mode Median Percentiles
More informationSTAT 157 HW1 Solutions
STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill
More informationSkewness and the Mean, Median, and Mode *
OpenStax-CNX module: m46931 1 Skewness and the Mean, Median, and Mode * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Consider the following
More informationMath 140 Introductory Statistics. First midterm September
Math 140 Introductory Statistics First midterm September 23 2010 Box Plots Graphical display of 5 number summary Q1, Q2 (median), Q3, max, min Outliers If a value is more than 1.5 times the IQR from the
More informationPutting Things Together Part 2
Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in
More informationPercentiles, STATA, Box Plots, Standardizing, and Other Transformations
Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Lecture 3 Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go
More informationGetting to know data. Play with data get to know it. Image source: Descriptives & Graphing
Descriptives & Graphing Getting to know data (how to approach data) Lecture 3 Image source: http://commons.wikimedia.org/wiki/file:3d_bar_graph_meeting.jpg Survey Research & Design in Psychology James
More informationMath 227 Elementary Statistics. Bluman 5 th edition
Math 227 Elementary Statistics Bluman 5 th edition CHAPTER 6 The Normal Distribution 2 Objectives Identify distributions as symmetrical or skewed. Identify the properties of the normal distribution. Find
More informationMgtOp 215 TEST 1 (Golden) Spring 2016 Dr. Ahn. Read the following instructions very carefully before you start the test.
MgtOp 15 TEST 1 (Golden) Spring 016 Dr. Ahn Name: ID: Section (Circle one): 4, 5, 6 Read the following instructions very carefully before you start the test. This test is closed book and notes; one summary
More information1. In a statistics class with 136 students, the professor records how much money each
so shows the data collected. student has in his or her possession during the first class of the semester. The histogram 1. In a statistics class with 136 students, the professor records how much money
More informationDiploma in Financial Management with Public Finance
Diploma in Financial Management with Public Finance Cohort: DFM/09/FT Jan Intake Examinations for 2009 Semester II MODULE: STATISTICS FOR FINANCE MODULE CODE: QUAN 1103 Duration: 2 Hours Reading time:
More informationDescriptive Analysis
Descriptive Analysis HERTANTO WAHYU SUBAGIO Univariate Analysis Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable
More information