Lecture 2 Describing Data
|
|
- Alan Price
- 5 years ago
- Views:
Transcription
1 Lecture 2 Describing Data Thais Paiva STA Summer 2013 Term II July 2, 2013
2 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms and boxplots
3 Data Recall that A population is composed of units. Units can represent individuals, financial stocks, nations, trees, etc... A sample is a collection of units drawn from the population For each unit we can measure several different characteristics (variables) To present a dataset, we typically use rows to represent units and columns to represent variables.
4 An Example Dataset Salary # of Name Gender ( 1,000) Education Family members AA F 50 B 2 BB M 43 B 3 CC F 65 M 1 DD M 200 B 2 EE M 60 M 4 FF M 25 S 2 GG F 15 S 0 HH F 80 D 3 II M 22 S 1 JJ F 69 B 4 KK F 70 M 2 M = male; F = female B = bachelor; S = senior high; M = masters, D = doctoral
5 Types of Variables Qualitative Variables These values can fall into one of several groups or categories Represent characteristics which cannot be naturally associated to a number Ordinal: although not naturally associated to a number, can be somehow ordered (i.e., education) Nominal (Categorical): there is no natural order (i.e., gender) Quantitative Variables Take numerical values such that using numerical operations makes sense Discrete: only take integer values (i.e., # of family members) Continuous: can take any value (sometimes within a specified range) (i.e., height) Sometimes a variable can be treated as quantitative or qualitative.
6 Sample versus Population Recall that a parameter is a descriptive measure of a population a statistic is a descriptive measure of a sample In many cases, for statistical inference we want the statistics to approximate the unknown parameters. Some approximations will be better than others We will see which are good approximations This lecture, we will focus on some statistics calculated using our sample.
7 Gender According to the data, there are 5 males and 6 females in the sample. A very simple graph is the bar chart representation F M Here the y-axis represents frequency. An alternative is to use percentage. Frequency is simply the number of data values that falls into each category.
8 Education B D M S Bar chart allows us to quickly see the distribution of the variable in our sample. The distribution of a single variable describes the overall pattern of the data. Example: which value is most common? Here we treat education as nominal. We can also re-arrange the x-axis as S B M D and treat education as ordinal.
9 Salary: Frequencies Using bar chart to describe salary is more complicated because it is continuous. One way is to divide data into categories. For example smaller or bigger than 100 (thousands). Frequency Salary This representation that creates categories (bins) for continuous data is called a histogram.
10 Salary: Histograms There are many ways to define the salary categories! By 50K: By 20K: Frequency Salary Frequency Salary Categories can be chosen based on scientific reason or to display certain features of the distribution.
11 Salary: Distribution Density Salary Instead of plotting the number of data point that falls into each bin, we can convert these into relative frequencies. This graph is related to the density of the distribution. More on this later. We simply changed the scale but the general shape of the distribution remains the same.
12 Salary: Distribution The construction of the distribution requires a little bit of attention since it is based on the relative frequencies. Relative frequency = frequency sample size f <100 = # salaries<100 # salaries = f 100 = # salaries 100 # salaries = 1 11 Then the relative frequencies must be distributed evenly along the cells. height <100 = f<100 base <100 = = height 100 = f 100 base 100 = = 0.001
13 Salary: Distribution There is an important property for the histogram built in terms of relative frequencies: The total area of all the rectangles is equal to 1 height <100 base <100 + height 100 base 100 = (100 0) ( ) = 1 This property is worth remembering! It will will be very important later in the course.
14 Describing Distributions It is important to convey certain characteristics of a distribution, for example to compare different distributions. We will focus on three specific areas: 1. Shape 2. Center 3. Spread Do the extreme values tend to be small or large? What values are most common? Where is the middle of the distribution? How much variability is present?
15 Shape: Mode The mode of a variable is the most frequent observation occurring in the data set. Mode(gender)=FEMALE [6(F) and 5(M)] Mode(education)=BACHELOR [4(B), 3(S), 3(M), 1(D)] If you look at the graphs you will realize that the mode coincides with the peaks of a histogram However for salary, because the variable is continuous, we need to group the data first. Again, the mode will depend on the choice of the groups.
16 Modes While the mode is a useful statistic, often a distribution exhibits several peaks. Ex. Unimodal Ex. Bimodal Frequency Frequency
17 Skewness Sometimes a variable is more extreme in one direction, e.g. medical cost, housing price,... Left Skewed Right Skewed Frequency Frequency A symmetric distribution does not exhibit skewness.
18 Measuring the Center: Mean Given that y 1, y 2,..., y n are the n observations within a sample, then the sample mean is defined as Same as the average ȳ = n y i i=1 We will use the bar notation to denote a sample mean Note how each observed value contributes equally to the mean n
19 Example: mean salary If the salary is NOT grouped then the mean is equal to = If salary is grouped (e.g. by 20K), we can use the midpoint of each group: n i n i n = 62.73
20 Measuring the Center: Median The median is a statistic based on the order of the observations. It is a particular case of percentile: the 50th-percentile. Example Median(education)=Bachelor since [SSSBB(B)BMMMD] Median(# family members)=2 since [0112(2)2233] Median(gender)=? It cannot be evaluated because the variable is not ordinal
21 Salary Median If the data are NOT grouped, then the median is 60 Since [ (60) ] If the data are grouped [40-60] n i n i n kp n i 1 3 (6) i=1 kp i=1 n i n (0.54) The median is where the corresponding cumulative frequency is (or the minimum value above) 0.50.
22 Mean versus Median For measuring central tendency: Mean: quantitative data with symmetric distribution Median: quantitative data with skewed distribution Because the median is solely based on the rank of the data, for a unimodal distribution: Shape Left skewed Symmetric Right skewed mean < median mean median mean > median
23 Spread of a Distribution When describing a distribution, knowing of the center may be not sufficient. For example: financial analysts are usually interested in the expected returns (mean) and the risk (spread) In this sense, the spread indicates how much uncertainty (variability) is associated with the expected return Consider two hypothetical stock returns at time 1, 2, 3, 4, 5: 1 2, 1, 0, 1, 2 with mean , 100, 0, 100, 200 with mean 0
24 Measuring Spread: Range The range is simply defined as range = max min Examples: range(salary) = = 185 range(# family members) = 4 0 = 4 range(gender) = not possible range(education) = not possible
25 Measuring Spread: Percentile The p-percentile is the observation within the sample such that p% of the remaining observations are equal or lower than the p-percentile. p The position of the p percentile is 100 (n + 1) the 25% percentile is called 1 st quartile (q 1 ) the 50% percentile is called Median the 75% percentile is called 3 rd quartile (q 3 ) Note the data must be qualitative ordinal or quantitative. Also, the quartiles are also based on the order of the data.
26 Measuring Spread: Percentile Salary Example For salary The original data are: ( ) The sorted data are: ( ) salary [0.25 (11+1)] = salary [3] = 25 > 25th percentile salary [0.5 (11+1)] = salary [6] = 60 > median salary [0.75 (11+1)] = salary [9] = 70 > 75th percentile What if only 10 data were available? A little bit more difficult!
27 Measuring Spread: Percentile Example Suppose you have the following data: y = 1, 3, 6, 8. What are the q 1, median, q 3? According to the percentile formula, the locations are Therefore = = = 3.75 q 1 = 1.5 (or between 1 and 3) median = 4.5 (or between 3 and 6) q 3 = 7.5 (or between 6 and 8)
28 Boxplot The box plot is a useful graphical representation which combines the five number summary: minimum, q 1, median, q 3, maximum Salary # of Family Members
29 Measure Spread: Interquartile Range Compared to a histogram, the boxplot shows a lot less detail. Useful to compare the distribution across different groups The box between the first and third quartile gives the interquartile range (IQR) For salary, IQR = = 45 IQR = q 3 q 1 IQR measures the central spread and is not influenced by the tails - very useful for describing the spread of a skewed distribution! The location of the thick median line in the box also shows the skewness around the center of the distribution
30 Outliers Sometimes very extreme (strange) observations (outliers) can occur and it is important to identify them. Outliers can: Represent recording errors Represent interesting observations Influence summary statistics (e.g. the mean) A popular rule is to suspect an observation if it falls more than 1.5 IQR below the first quartile or above the third quartile.
31 Modified Boxplot Salary Using the 1.5 IQR rule, a modified boxplot shows the potential outliers. The upper and lower limits are (70 25) = (70 25) = Here the 200K value is shown as a point. Excluding the outlier, our new sample mean is 49.9 instead of 63.5K!
32 Measuring Spread: Variance The sample variance, s 2, is computed as the sum of the squared deviations about the mean, x, divided by (n-1). s 2 = n (x i x) 2 i=1 n 1 variance(gender) = not possible!! variance(# family members) = [(2 2.18) 2 + (3 2.18) 2 + (1 2.18) 2 + (2 2.18) 2 + (4 2.18) 2 + (2 2.18) 2 + (0 2.18) 2 +(3 2.18) 2 +(1 2.18) 2 +(4 2.18) 2 +(2 2.18) 2 ]/(11 1) = 1.56
33 Measuring Spread: Standard Deviation The standard deviation, s, is defined as the square root of the variance: Example: sd(gender) = not possible!! sd(# family members) = (1.56) = 1.25
34 Degrees of Freedom The degrees of freedom represent the effective number of values free to vary in the computation a statistic. In computing the variance, we need to calculate the difference d i = (x i x) for each observation. However, x itself is calculated from the n observations!!! In fact, if you give me the difference for any (n 1)-many observations, I can calculate the remaining one! This is because Remember this property! n (x i x) = 0 i=1
35 Putting It All Together A Frequency Frequency B A B C Mean Median Variance S.D C Frequency
36 Putting It All Together a b c
37 Summary Measures of Central Tendency: Mean Median Measures of Spread: Percentiles and quartiles Interquartile range Variance and standard deviation Graphical Representations of Distribution: Histogram Boxplot and modified boxplot Mode and skewness
Lecture 1: Review and Exploratory Data Analysis (EDA)
Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I ll announce this tomorrow
More informationWeek 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.
Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.
More information2 Exploring Univariate Data
2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting
More informationSTAT 113 Variability
STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2
More informationDATA SUMMARIZATION AND VISUALIZATION
APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296
More informationSection3-2: Measures of Center
Chapter 3 Section3-: Measures of Center Notation Suppose we are making a series of observations, n of them, to be exact. Then we write x 1, x, x 3,K, x n as the values we observe. Thus n is the total number
More informationLecture Week 4 Inspecting Data: Distributions
Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013 2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your
More informationChapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1
Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and
More informationFrequency Distribution and Summary Statistics
Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary
More informationDescriptive Statistics
Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs
More informationThe Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s).
We will look the three common and useful measures of spread. The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). 1 Ameasure of the center
More informationappstats5.notebook September 07, 2016 Chapter 5
Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.
More informationNumerical Descriptions of Data
Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =
More informationMath 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment
Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class
More informationSome estimates of the height of the podium
Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40
More informationNOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS
NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows
More informationSection 6-1 : Numerical Summaries
MAT 2377 (Winter 2012) Section 6-1 : Numerical Summaries With a random experiment comes data. In these notes, we learn techniques to describe the data. Data : We will denote the n observations of the random
More information1 Describing Distributions with numbers
1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write
More informationOverview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution
PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations
More informationDescription of Data I
Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret
More informationCenter and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers.
Chapter 3 Section3-: Measures of Center Section 3-3: Measurers of Variation Section 3-4: Measures of Relative Standing Section 3-5: Exploratory Data Analysis Describing Distributions with Numbers The overall
More informationUnit 2 Statistics of One Variable
Unit 2 Statistics of One Variable Day 6 Summarizing Quantitative Data Summarizing Quantitative Data We have discussed how to display quantitative data in a histogram It is useful to be able to describe
More informationDescribing Data: One Quantitative Variable
STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive
More informationExample: Histogram for US household incomes from 2015 Table:
1 Example: Histogram for US household incomes from 2015 Table: Income level Relative frequency $0 - $14,999 11.6% $15,000 - $24,999 10.5% $25,000 - $34,999 10% $35,000 - $49,999 12.7% $50,000 - $74,999
More informationStat 101 Exam 1 - Embers Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.
More informationMeasures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean
Measure of Center Measures of Center The value at the center or middle of a data set 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) 1 2 Mean Notation The measure of center obtained by adding the values
More information3.1 Measures of Central Tendency
3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent
More informationIntroduction to Computational Finance and Financial Econometrics Descriptive Statistics
You can t see this text! Introduction to Computational Finance and Financial Econometrics Descriptive Statistics Eric Zivot Summer 2015 Eric Zivot (Copyright 2015) Descriptive Statistics 1 / 28 Outline
More informationBasic Procedure for Histograms
Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that
More informationMath 2200 Fall 2014, Exam 1 You may use any calculator. You may not use any cheat sheet.
1 Math 2200 Fall 2014, Exam 1 You may use any calculator. You may not use any cheat sheet. Warning to the Reader! If you are a student for whom this document is a historical artifact, be aware that the
More informationPutting Things Together Part 2
Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in
More informationChapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1
Chapter 3 Descriptive Measures Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Mean, Median and Mode Copyright 2016, 2012, 2008 Pearson Education, Inc.
More informationHandout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25
Handout 4 numerical descriptive measures part Calculating Mean for Grouped Data mf Mean for population data: µ mf Mean for sample data: x n where m is the midpoint and f is the frequency of a class. Example
More informationGraphical and Tabular Methods in Descriptive Statistics. Descriptive Statistics
Graphical and Tabular Methods in Descriptive Statistics MATH 3342 Section 1.2 Descriptive Statistics n Graphs and Tables n Numerical Summaries Sections 1.3 and 1.4 1 Why graph data? n The amount of data
More informationExploratory Data Analysis
Exploratory Data Analysis Stemplots (or Stem-and-leaf plots) Stemplot and Boxplot T -- leading digits are called stems T -- final digits are called leaves STAT 74 Descriptive Statistics 2 Example: (number
More informationA LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]
1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders
More informationSTATISTICAL DISTRIBUTIONS AND THE CALCULATOR
STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either
More informationCategorical. A general name for non-numerical data; the data is separated into categories of some kind.
Chapter 5 Categorical A general name for non-numerical data; the data is separated into categories of some kind. Nominal data Categorical data with no implied order. Eg. Eye colours, favourite TV show,
More informationSTAB22 section 1.3 and Chapter 1 exercises
STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea
More information9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives
Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical
More informationSome Characteristics of Data
Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key
More informationSummarising Data. Summarising Data. Examples of Types of Data. Types of Data
Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017
More informationCHAPTER 2 Describing Data: Numerical
CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of
More informationMeasures of Central Tendency Lecture 5 22 February 2006 R. Ryznar
Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar Today s Content Wrap-up from yesterday Frequency Distributions The Mean, Median and Mode Levels of Measurement and Measures of Central
More informationCSC Advanced Scientific Programming, Spring Descriptive Statistics
CSC 223 - Advanced Scientific Programming, Spring 2018 Descriptive Statistics Overview Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.
More informationDescriptive Statistics (Devore Chapter One)
Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf
More informationDot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.
Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,
More informationSimple Descriptive Statistics
Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency
More informationData that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.
Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer
More informationSTAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model
STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good
More informationExploring Data and Graphics
Exploring Data and Graphics Rick White Department of Statistics, UBC Graduate Pathways to Success Graduate & Postdoctoral Studies November 13, 2013 Outline Summarizing Data Types of Data Visualizing Data
More informationSOLUTIONS TO THE LAB 1 ASSIGNMENT
SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73
More informationSTA 248 H1S Winter 2008 Assignment 1 Solutions
1. (a) Measures of location: STA 248 H1S Winter 2008 Assignment 1 Solutions i. The mean, 100 1=1 x i/100, can be made arbitrarily large if one of the x i are made arbitrarily large since the sample size
More information22.2 Shape, Center, and Spread
Name Class Date 22.2 Shape, Center, and Spread Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution? Eplore
More informationBoth the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.
Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of
More informationDescriptive Statistics
Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations
More informationNOTES: Chapter 4 Describing Data
NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three
More informationMath 140 Introductory Statistics. First midterm September
Math 140 Introductory Statistics First midterm September 23 2010 Box Plots Graphical display of 5 number summary Q1, Q2 (median), Q3, max, min Outliers If a value is more than 1.5 times the IQR from the
More informationNumerical Descriptive Measures. Measures of Center: Mean and Median
Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where
More informationChapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data.
-3: Measure of Central Tendency Chapter : Descriptive Statistics The value at the center or middle of a data set. It is a tool for analyzing data. Part 1: Basic concepts of Measures of Center Ex. Data
More informationProf. Thistleton MAT 505 Introduction to Probability Lecture 3
Sections from Text and MIT Video Lecture: Sections 2.1 through 2.5 http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systemsanalysis-and-applied-probability-fall-2010/video-lectures/lecture-1-probability-models-and-axioms/
More informationAP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE
AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,
More information1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:
1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11
More informationChapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)
Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You re all set to drop
More informationChapter 4 Variability
Chapter 4 Variability PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J Gravetter and Larry B. Wallnau Chapter 4 Learning Outcomes 1 2 3 4 5
More informationIntroduction to Descriptive Statistics
Introduction to Descriptive Statistics 17.871 Types of Variables ~Nominal (Quantitative) Nominal (Qualitative) categorical Ordinal Interval or ratio Describing data Moment Non-mean based measure Center
More informationStatistics I Chapter 2: Analysis of univariate data
Statistics I Chapter 2: Analysis of univariate data Numerical summary Central tendency Location Spread Form mean quartiles range coeff. asymmetry median percentiles interquartile range coeff. kurtosis
More informationBasic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract
Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise
More information4. DESCRIPTIVE STATISTICS
4. DESCRIPTIVE STATISTICS Descriptive Statistics is a body of techniques for summarizing and presenting the essential information in a data set. Eg: Here are daily high temperatures for Jan 16, 2009 in
More informationFundamentals of Statistics
CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct
More informationSTAT Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model
STAT 203 - Chapter 6 The Standard Deviation (SD) as a Ruler and The Normal Model In Chapter 5, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good
More informationMEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION
MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION 1 Day 3 Summer 2017.07.31 DISTRIBUTION Symmetry Modality 单峰, 双峰 Skewness 正偏或负偏 Kurtosis 2 3 CHAPTER 4 Measures of Central Tendency 集中趋势
More informationMAS1403. Quantitative Methods for Business Management. Semester 1, Module leader: Dr. David Walshaw
MAS1403 Quantitative Methods for Business Management Semester 1, 2018 2019 Module leader: Dr. David Walshaw Additional lecturers: Dr. James Waldren and Dr. Stuart Hall Announcements: Written assignment
More informationData Distributions and Normality
Data Distributions and Normality Definition (Non)Parametric Parametric statistics assume that data come from a normal distribution, and make inferences about parameters of that distribution. These statistical
More informationDescriptive Analysis
Descriptive Analysis HERTANTO WAHYU SUBAGIO Univariate Analysis Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable
More informationKING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA. Name: ID# Section
KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 11: BUSINESS STATISTICS I Semester 04 Major Exam #1 Sunday March 7, 005 Please circle your instructor
More informationThe Normal Distribution
Stat 6 Introduction to Business Statistics I Spring 009 Professor: Dr. Petrutza Caragea Section A Tuesdays and Thursdays 9:300:50 a.m. Chapter, Section.3 The Normal Distribution Density Curves So far we
More informationMBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment
MBEJ 1023 Planning Analytical Methods Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment Contents What is statistics? Population and Sample Descriptive Statistics Inferential
More informationThe normal distribution is a theoretical model derived mathematically and not empirically.
Sociology 541 The Normal Distribution Probability and An Introduction to Inferential Statistics Normal Approximation The normal distribution is a theoretical model derived mathematically and not empirically.
More information2011 Pearson Education, Inc
Statistics for Business and Economics Chapter 4 Random Variables & Probability Distributions Content 1. Two Types of Random Variables 2. Probability Distributions for Discrete Random Variables 3. The Binomial
More informationMath Take Home Quiz on Chapter 2
Math 116 - Take Home Quiz on Chapter 2 Show the calculations that lead to the answer. Due date: Tuesday June 6th Name Time your class meets Provide an appropriate response. 1) A newspaper surveyed its
More informationPercentiles, STATA, Box Plots, Standardizing, and Other Transformations
Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Lecture 3 Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go
More informationMATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)
LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. The best way to reduce a set of
More informationNCSS Statistical Software. Reference Intervals
Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and
More informationChapter 15: Sampling distributions
=true true Chapter 15: Sampling distributions Objective (1) Get "big picture" view on drawing inferences from statistical studies. (2) Understand the concept of sampling distributions & sampling variability.
More informationDescriptive Statistics Bios 662
Descriptive Statistics Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-19 08:51 BIOS 662 1 Descriptive Statistics Descriptive Statistics Types of variables
More informationRandom Variables and Probability Distributions
Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering
More informationDATA HANDLING Five-Number Summary
DATA HANDLING Five-Number Summary The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest
More informationChapter 3: Displaying and Describing Quantitative Data Quiz A Name
Chapter 3: Displaying and Describing Quantitative Data Quiz A Name 3.1.1 Find summary statistics; create displays; describe distributions; determine 1. Following is a histogram of salaries (in $) for a
More informationAverages and Variability. Aplia (week 3 Measures of Central Tendency) Measures of central tendency (averages)
Chapter 4 Averages and Variability Aplia (week 3 Measures of Central Tendency) Chapter 5 (omit 5.2, 5.6, 5.8, 5.9) Aplia (week 4 Measures of Variability) Measures of central tendency (averages) Measures
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc COUNSELLING PSYCHOLOGY (2011 Admission Onwards) II Semester Complementary Course PSYCHOLOGICAL STATISTICS QUESTION BANK 1. The process of grouping
More informationWk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12)
Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics: - Measures of centrality (Mean, median, mode, trimmed mean) - Measures of spread (MAD, Standard deviation, variance) -
More informationThe Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012
The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re
More informationMath146 - Chapter 3 Handouts. The Greek Alphabet. Source: Page 1 of 39
Source: www.mathwords.com The Greek Alphabet Page 1 of 39 Some Miscellaneous Tips on Calculations Examples: Round to the nearest thousandth 0.92431 0.75693 CAUTION! Do not truncate numbers! Example: 1
More informationMini-Lecture 3.1 Measures of Central Tendency
Mini-Lecture 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data 3. Explain what it means for a
More informationChapter 3 Descriptive Statistics: Numerical Measures Part A
Slides Prepared by JOHN S. LOUCKS St. Edward s University Slide 1 Chapter 3 Descriptive Statistics: Numerical Measures Part A Measures of Location Measures of Variability Slide Measures of Location Mean
More informationMEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda,
MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE Dr. Bijaya Bhusan Nanda, CONTENTS What is measures of dispersion? Why measures of dispersion? How measures of dispersions are calculated? Range Quartile
More informationLecture Data Science
Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics Foundations JProf. Dr. Claudia Wagner Learning Goals How to describe sample data? What is mode/median/mean?
More informationIOP 201-Q (Industrial Psychological Research) Tutorial 5
IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,
More information1. In a statistics class with 136 students, the professor records how much money each
so shows the data collected. student has in his or her possession during the first class of the semester. The histogram 1. In a statistics class with 136 students, the professor records how much money
More information