Basic Statistics for the Healthcare Professional (9/17/2015)


Basic Statistics for the Healthcare Professional
Frank Cohen, MBB, MPA
Director of Analytics, Doctors Management, LLC

Purpose of Statistics
- Provide a numerical summary of the data being analyzed.
- Provide the basis for making inferences about the future.
- Provide the foundation for assessing process capability.
- Provide a common language to be used throughout an organization to describe processes.

Data (n.)
- Factual information organized for analysis.
- Numerical or other information represented in a form suitable for processing by computer.
- Values from scientific experiments.

Relax... it won't be that bad!

Objectives
- Identify the basic tenets of statistics and statistical theory
- Define mean, median, and other measures of central tendency
- Define standard deviation, interquartile range, and other measures of variability
- Explain the difference between data analysis and statistics
- Describe hypothesis testing and other tests of statistical significance
- Articulate how to build relationships through regression analysis

3 Degrees of Separation
- Measures of location (central tendency): mean, median, and mode (1st statistical moment)
- Measures of variation (dispersion): range, standard deviation, interquartile range (2nd statistical moment)
- Measures of error (estimation): standard error and confidence intervals
- Measures of relationships: covariance, correlation, regression

Descriptive Statistics
- Descriptive statistics are numbers that are used to summarize and describe data, for example:
  - Mean conversion factor
  - Cost per RVU
  - Average collection by provider
  - Work RVUs that define 1 FTE
  - New office visits to initial consults
- Measures of central tendency include the mean, median, and mode
- Measures of variability include the range, variance, and standard deviation
- Descriptive statistics are just descriptive and do not involve generalizing beyond the data at hand

Inferential Statistics
- Inferential statistics depends upon a sample of a population to draw (or infer) conclusions about the population as a whole
- Inferences are made based on central tendency or any of a number of other aspects of a distribution
- For example, it is not practical to review every chart for a practice, so extrapolation is used to assess overpayment
- No given sample will represent the population exactly, so distribution techniques and sample error calculations are very important

Important Definitions
- Universe: the complete set of objects included within the database in question
- Sample frame: a homogeneous set of objects that the investigator is interested in studying
- Sample: a subset of the population that is actually being studied
- Variable: a characteristic of an individual or object that can have different values (as opposed to a constant)
- Independent variable: the variable that is systematically manipulated or measured by the investigator to determine its impact on the outcome
- Dependent variable: the outcome variable of interest
- Data: the measurements that are collected by the investigator
- Statistic: summary measure of a sample
- Parameter: summary measure of a population

Types of Data
Attribute data (qualitative)
- Is always binary; there are only two possible values (0, 1)
- Examples: Yes/No, Go/No go, Pass/Fail
Variable data (quantitative)
- Variables are properties or characteristics of some event, object, or person that can take on different values or amounts (as opposed to constants such as π that do not vary)
- Independent variables: variables that are manipulated by the experimenter
- Dependent variables: variables that measure the experimental outcome

Discrete Variables
Discrete variables are whole numbers (counts) that cannot take values in the space between one number and the next.
- The number of patients seen in a single day
- The number of surgical procedures reported by a provider
- The number of codes reported for the practice
- The number of different specialties
- The number of charts with coding errors

Continuous Variables
Continuous variables are real numbers that can take any of an infinite number of values between discrete values.
- Frequency distribution of E/M codes within a category
- Total number of calculated FTEs in a practice
- Minutes per work RVU
- Ratio of initial consults to new office visits
- Cost per RVU
- Charge per hour

Frequency Distribution
Welcome to the family!

Frequency Distributions
- A frequency distribution shows the number of observations falling into each of several ranges of values
- Frequency distributions are portrayed as frequency tables, histograms, or polygons
- Frequency distributions can show either the actual number of observations falling in each range or the percentage of observations
- From the frequency distribution table, you can calculate the mean, median, mode, and range

Normal Distribution
- In many natural processes, random variation conforms to the normal distribution
- Characteristics: symmetric, unimodal, extends to +/- infinity
- Completely described by two parameters: mean and standard deviation
- About 68% of the data will fall within +/- 1 standard deviation
- About 95% of the data will fall within +/- 2 standard deviations
- About 99.7% of the data will fall within +/- 3 standard deviations

[Figure: Normal Distribution - Illustrated]

Properties of the Normal Distribution
1. It is bell-shaped
2. The mean, median, and mode are equal and located in the center of the distribution
3. It is unimodal (has only one mode)
4. The curve is symmetric about the mean
5. The curve is continuous (for each value of x there is a corresponding value of y)
6. The curve never touches the x-axis (it extends to infinity in both directions)
7. The total area under the curve is 1 (100%)
8. The area under the curve that lies within 1, 2, and 3 standard deviations of the mean is approximately 68%, 95%, and 99.7%, respectively

[Figure: Same Mean, Different Standard Deviations]
[Figure: Different Mean, Same Standard Deviation]
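The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below is an illustration only, not part of the original slides; the helper name `normal_coverage` is hypothetical, and it relies on the standard identity that the area within k standard deviations of the mean equals erf(k / sqrt(2)).

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints the coverage as a percentage with one decimal place.
    print(f"within +/- {k} SD: {normal_coverage(k):.1%}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to every normal distribution.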

[Figure: Different Mean, Different Standard Deviation]

Looking at the Curve
- Skew: skew measures the degree to which the distribution is biased (or skewed) right or left of normal expectation
- Kurtosis (it's not a disease): beyond skewness, kurtosis tells us when our distribution may have high or low variance, even if normal

Skewness: The 3rd Moment
The third moment is used to define the skewness of a distribution.
- Skewness is a measure of the symmetry of the shape of a distribution
- If a distribution is symmetric, the skewness will be zero
- If there is a long tail in the positive direction, skewness will be positive
- If there is a long tail in the negative direction, skewness will be negative
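As a rough illustration of the sign behavior described above, the sketch below computes a moment-based skewness (third central moment divided by the 1.5 power of the second central moment). This is one common definition and an assumption here; statistical packages often apply a small-sample adjustment, so their output may differ slightly. The function name and sample values are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population (divide-by-n) moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 1, 2, 10]      # long tail in the positive direction
left_tail = [-10, -2, -1, -1, -1]  # long tail in the negative direction

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
```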

Kurtosis: The 4th Moment
The fourth moment is used to define the kurtosis of a distribution.
- Kurtosis is a measure of the flatness or peakedness of a distribution
- Flat-looking distributions are referred to as platykurtic, while peaked distributions are referred to as leptokurtic
- A kurtosis value of 3 represents a normally distributed data set; values below 3 approach platykurtic, while values above 3 approach leptokurtic

Normal or Not?
[Figure: two summary reports with 95% confidence intervals. Normal: "Summary Report for Data" (Anderson-Darling A-squared 0.13). Not normal: "Summary Report for Frequency" (Anderson-Darling P-value < 0.005, N = 2507).]

Benford's Distribution
The distribution of first digits in any series of naturally occurring numbers, according to Benford's law.
[Figure: bar chart of Benford's distribution. Each bar represents a digit, and the height of the bar is the percentage of numbers that start with that digit. X-axis: first digit; y-axis: percent distribution.]
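The expected bar heights under Benford's law follow the well-known formula P(d) = log10(1 + 1/d) for first digits d = 1 to 9. The sketch below is an illustration under that formula and is not taken from the slides; it compares the expected shares with the first digits of the powers of 2, and the function names and the choice of 1,000 powers are assumptions.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected share of numbers whose first digit is `digit` under Benford's law."""
    return math.log10(1 + 1 / digit)


def first_digit_shares(numbers):
    """Observed share of each leading digit (1-9) in a list of positive integers."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = len(numbers)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


powers_of_two = [2 ** k for k in range(1, 1001)]
observed = first_digit_shares(powers_of_two)

for d in range(1, 10):
    print(f"digit {d}: expected {benford_expected(d):.3f}, powers of 2 {observed[d]:.3f}")
```

A large gap between observed and expected shares in a real data set is a prompt for closer review, not proof of a problem.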

Even We Are Obliged to Follow Benford
[Figure: Benford's distribution compared with the first-digit distributions of powers of 2, area, and population. X-axis: first digit; y-axis: percent distribution.]
Weighted average charges for 11,066 individual procedure codes, taken from the 100% Medicare claims database.

Measures of Position and Central Tendency
Mean, Median, Mode, Percentiles

Measures of Central Tendency
- In the study of statistics there are many measurements of central tendency
- The three most common are the mean, median, and mode
- These metrics are used to identify the approximate location of the center of the data

Mean
- The mean, or arithmetic average, is found by adding a group of numbers and dividing the sum by the number of items added
- The mean is the best known and most used measure of central tendency
- The group of numbers is sometimes referred to as the data or data set
- The mean measures the central location of the values within the database
- The mean is useful for predicting, but only where there are no extreme values

The Mean (or Average)
Arithmetic mean (average):
- Create a metric for each code using the same method, i.e., divide the charge by the RVU
- Add each of the results together to get a grand total
- Divide the grand total by the number of samples
Pros:
- Easy to calculate
- Eliminates frequency bias
Cons:
- Does not take into account the frequency of occurrence
- Not accurate if data is not normally distributed

Weighted mean:
\[ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]

[Table: Code, CF, Total Count, Average]
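A minimal sketch contrasting the arithmetic mean with the weighted mean from the formula above. The function names, the three conversion factors, and the counts are hypothetical values chosen only for illustration; they are not the figures from the slide's table.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical conversion factors (charge / RVU) for three codes,
# and how often each code was reported.
conversion_factors = [80.0, 90.0, 100.0]
counts = [10, 5, 5]

print(arithmetic_mean(conversion_factors))        # ignores frequency
print(weighted_mean(conversion_factors, counts))  # pulled toward the frequent code
```

The weighted result sits closer to the conversion factor of the most frequently reported code, which is the frequency information the plain average leaves out.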

Mean Sensitivity to Outliers
[Figure: CF without outliers = 77.6; CF with outliers = 89.8]

Median
- The median is the middle number in a set of data that is arranged in either ascending or descending order
- One-half of the numbers will be on either side of the median
- The median is good for use with non-normally distributed data, as it is far less affected by outliers
To find the median:
- Order the data in ascending order
- Count the number of records and divide by two
- Pick the middle number
- If there is an even number of records, take the average of the middle two

Example of Median Calculation
[Table: two worked examples, one with an odd number of records and one with an even number, with columns Code, Fee, Frequency, RVU, CF. With an odd number of records the median is the middle CF; with an even number it is halfway between the two middle CFs, i.e., their sum divided by 2.]
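The median steps listed above can be sketched as follows. The function name and the sample conversion factors (including the outlier of 200.0) are hypothetical and only meant to show how little the median moves compared with the mean.

```python
def median(values):
    """Middle value of the sorted data; average of the middle two when the count is even."""
    ordered = sorted(values)   # order the data in ascending order
    n = len(ordered)
    mid = n // 2               # count the records and divide by two
    if n % 2 == 1:
        return ordered[mid]    # odd count: pick the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two


base = [78.0, 79.0, 80.0, 81.0, 82.0]
with_outlier = base + [200.0]

print(sum(base) / len(base), median(base))
print(sum(with_outlier) / len(with_outlier), median(with_outlier))
```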

Median is Less Sensitive to Outliers
[Figure: CF without outliers = 80.0; CF with outliers = 81.0]

Mean vs. Median
This table shows that the median wage is substantially less than the average wage. The reason for the difference is that the distribution of workers by wage level is highly skewed.
[Table: wages by percentile group of workers]

Mode
- The mode is the one value within a data set that is reported the most
- There can be more than one mode
  - A single mode is called unimodal
  - Two modes is called bimodal
  - Many modes is called multimodal
- A multimodal distribution can indicate groups of variables with different characteristics within the same data set, e.g., paid amounts for E&M visits vs. surgical procedures

Unimodal
[Figure: Histogram of Procedure Code Frequency. X-axis: procedure code.]

Bimodal
[Figure: Summary Report for Mean Universe, with 95% confidence intervals (Anderson-Darling P-value < 0.005, N = 1014).]

Multimodal
[Figure: Histogram of Payment Frequency. X-axis: payment.]

Percentiles
The p-th percentile is the (n*p/100)-th observation when the set of observations is arranged in order of magnitude, where n is the sample size.
- Percentiles report the value for a given variable where a certain percent of observations fall above and below the value
- For example, at the 20th percentile, some 20% of values are lower and some 80% of values are higher
- A percentile is 1/100 of the total of an ordered data set
- Splits the data into hundredths
- All data are ordered around the median (50th percentile)

Other Standard Metrics
- Quartiles divide the data into four equal parts
  - 1st quartile = 25th percentile
  - 2nd quartile = 50th percentile (median)
  - 3rd quartile = 75th percentile
- Deciles divide the data into 10 equal parts

Quartiles and Deciles
Quartiles:
\[ Q_i = L_i + \frac{h}{f_i}\left(\frac{i \cdot n}{4} - F\right), \quad i = 1, 2, 3 \]
- L_i = lower limit of the i-th quartile class
- n = total number of observations in the distribution
- h = class width of the i-th quartile class
- f_i = frequency of the i-th quartile class
- F = cumulative frequency of the class prior to the i-th quartile class
Deciles:
\[ P_i = L_i + \frac{h}{f_i}\left(\frac{i \cdot n}{10} - F\right), \quad i = 1, 2, \ldots, 9 \]
- L_i = lower limit of the i-th decile class
- n = total number of observations in the distribution
- h = class width of the i-th decile class
- f_i = frequency of the i-th decile class
- F = cumulative frequency of the class prior to the i-th decile class

Interquartile Range
- The interquartile range (IQR) is a measure of variability, based on dividing a data set into quartiles
- The IQR identifies the middle 50% of the data

Example: Age Distribution
The p-th percentile is the (n*p/100)-th observation when the set of observations is arranged in order of magnitude, where n is the sample size.
- For the age distribution, n = 121
- The 75th percentile for the age distribution is the (121*75)/100 = 90.75, or roughly the 91st, observation when the ages are arranged in increasing order of magnitude
- The 75th percentile of the ages is therefore 31 years
- The 25th, 50th, and 80th percentiles are the 31st, 61st, and 97th observations, respectively

[Table: Age, Frequency, Cumulative Frequency (total 121), with the age groups containing the 31st, 61st, and 97th observations marked as the 25th, 50th, and 80th percentiles and the 75th percentile also marked.]
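The observation positions in the age example can be reproduced with a few lines. This sketch assumes fractional positions are rounded up to the next whole observation, which matches the 31st, 61st, 91st, and 97th positions quoted above; the function name is hypothetical.

```python
import math


def percentile_position(n: int, p: float) -> int:
    """Position (1-based) of the p-th percentile in n ordered observations, rounding up."""
    return math.ceil(n * p / 100)


n = 121  # sample size from the age distribution example
for p in (25, 50, 75, 80):
    print(f"{p}th percentile -> observation {percentile_position(n, p)}")
```

Other rounding or interpolation conventions exist, so statistical packages may report a slightly different value for the same percentile.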

Box Plot
[Figure: annotated box plot on a quantitative scale. Labels from top to bottom: upper whisker (defined from the 75th percentile and the IQR), 75th percentile, average/mean, 50th percentile/median, 25th percentile, lower whisker (defined from the 25th percentile and the IQR), individual box symbol.]
IQR: interquartile range, which is calculated by subtracting the 25th percentile of the data from the 75th percentile; consequently, it contains the middle 50% of the observations.

[Figure: Box Plots]

Other Ways to Graph Mean, Median and Mode
[Figure: Summary Report for Fee, with 95% confidence intervals (Anderson-Darling P-value < 0.005, N = 1632).]
[Figure: Dotplot of Fee. Each symbol represents up to 8 observations.]

Measures of Variability

Variance
A measure of the average squared distance of possible values from the expected value (arithmetic average).
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
[Table: Code, Fee, Frequency, RVU, CF, Difference, Variance. Negative differences are shown in parentheses, e.g., (4.986), (12.366), (10.212), (5.070).]
Note that the differences (not squared) always add up to zero.

Standard Deviation
The measure of spread of values around the mean.
\[ S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
[Table: Code, Fee, Frequency, RVU, CF, Difference, Variance, as above.]
- The square root of the variance equals the standard deviation (29.643)
- For normally distributed data, 68.2% of the population values will fall within one standard deviation of the mean, and 95.4% will fall within two standard deviations
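A short sketch of the sample variance and standard deviation formulas above (n - 1 in the denominator). The function names and the four conversion factors are hypothetical; Python's built-in `statistics.variance` and `statistics.stdev` use the same n - 1 definition and are printed for comparison.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    differences = [x - mean for x in values]
    return sum(d * d for d in differences) / (n - 1)


def sample_stdev(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


cfs = [70.0, 80.0, 90.0, 100.0]  # hypothetical conversion factors
mean = sum(cfs) / len(cfs)

print(sum(x - mean for x in cfs))             # unsquared differences add up to zero
print(sample_variance(cfs), statistics.variance(cfs))
print(sample_stdev(cfs), statistics.stdev(cfs))
```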

Coefficient of Variation
- The coefficient of variation (CV) is calculated by dividing the standard deviation by the mean
- The coefficient of variation is a measure of spread that describes the amount of variability relative to the mean
- Because the coefficient of variation is unitless, you can use it instead of the standard deviation to compare the spread of data sets that have different units or different means
- A large CV means that the data are more dispersed, while a lower CV means that the data are less dispersed

CV for Two Different Codes
[Tables: one per code, with columns Specialty, Fee, StDev, CV (%) and rows for GS, CA, FP, GE, IM, OS, PM.]

Interquartile Range
- The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile
- The IQR is essentially the range of the middle 50% of the data
- Because it uses the middle 50%, the IQR is not affected by outliers or extreme values
- The IQR is often represented using box plots
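A brief sketch of the CV calculation, assuming the sample standard deviation is the one intended. The function name and both sets of fees are hypothetical; the second set is the first multiplied by 100 to show that the CV is unaffected by the scale of the data.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


fees_small = [10.0, 20.0, 30.0]        # hypothetical fees
fees_large = [1000.0, 2000.0, 3000.0]  # same pattern, 100 times larger

print(f"{coefficient_of_variation(fees_small):.1%}")
print(f"{coefficient_of_variation(fees_large):.1%}")
```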

[Figure: Box Plots]

Error and Confidence Intervals
For a mean:
\[ \bar{x} \pm t_{(1-\alpha/2),\,(n-1)\,df} \cdot \frac{S}{\sqrt{n}} \]
For a proportion:
\[ \hat{p} \pm Z_{(1-\alpha/2)} \cdot SE_{\hat{p}}, \quad \text{where } SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

What is Sample Error
- In statistics, sampling error is the error caused by observing a sample instead of the whole population. [Burns & Grove, 2009]
- The sample is never identical to the population
- Basically, all samples have some error when used to predict a value within the population

Causes of Sampling Error
- Population specific error: not understanding who or what to sample
- Sample frame error: occurs when the wrong sub-population is identified and/or used
- Selection error: when data points are not selected correctly
- Non-response error: occurs when data are missing, variable fields are zero, or other similar issues
- Sampling error: can occur when the wrong sample type is selected (e.g., SVRS, Cluster, Convenience)

Calculating the Margin of Error
Margin of error rules go something like this:
- The larger the sample, the smaller the error
- The smaller the variance, the smaller the error
The standard error (SE) can be calculated using two primary assessment types:
- Attribute
- Variable

Example of SE for Variable Assessment
- A sample of average charges for one procedure code was taken from 50 practices in a given area
- Mean = $82.40 and StDev = $15.55
- Assume a normal distribution: 68.26% of values fall between $66.85 and $97.95
- SE = StDev/sqrt(N), or 15.55/sqrt(50), or 15.55/7.07 = 2.2
- The standard error for our estimate of the mean of $82.40 is $2.20
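The variable-assessment arithmetic above can be sketched in a few lines, using the mean, standard deviation, and sample size quoted in the example. The function name is hypothetical.

```python
import math


def standard_error(stdev: float, n: int) -> float:
    """Standard error of the mean: standard deviation divided by the square root of n."""
    return stdev / math.sqrt(n)


mean, stdev, n = 82.40, 15.55, 50

one_sd_low, one_sd_high = mean - stdev, mean + stdev  # range holding about 68% of values
se = standard_error(stdev, n)

print(f"one-SD range: ${one_sd_low:.2f} to ${one_sd_high:.2f}")
print(f"standard error: ${se:.2f}")
```

Quadrupling the sample size halves the standard error, which is the "larger sample, smaller error" rule in numeric form.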

Example for SE Calculation for Attribute Assessment (dollars paid in error)
- In a chart review, a practice finds that $1,809 out of $5,742 was paid in error
- This equates to a paid error rate of 31.5%
- To calculate the sample error, we use the formula SE = sqrt(p(1-p)/n), where:
  - p = .315
  - (1-p) = 1 - .315 = .685
  - n = 5,742
- p(1-p)/n = .216 / 5742 = 0.0038%
- The square root of 0.0038% is 0.613%
- Plus and minus p = 30.88% to 32.11%

Example for SE Calculation for Attribute Assessment (charts with errors)
- In a chart review, a practice finds that 6 out of 30 charts contained a medical necessity error
- This equates to a coding error rate of 20%
- To calculate the sample error, we use the formula SE = sqrt(p(1-p)/n), where:
  - p = .2
  - (1-p) = 1 - .2 = .8
  - n = 30
- p(1-p)/n = .16 / 30 = 0.0053
- The square root of 0.0053 is 0.073 (7.3%)
- Plus and minus p = 12.7% to 27.3%

Common Z Levels of Confidence
- A Z-score is a measure of standard distance in a normal distribution
- Many believe that if the data set is large (n > 30), you can assume a normal or near-normal distribution (NOT TRUE)
- Commonly used confidence levels are 90%, 95%, and 99%

Margin of Error (1/2 Interval)
- The margin of error is the z or t score times the standard error
- z and t values depend on how wide or narrow you want the margin of error to be
- The higher the value, the higher the margin of error
- Sample of 50, 95% margin of error:
  - Mean = 82.40, StDev = 15.55, SE = 2.20
  - Margin of error = (z or t) times SE
  - z: 1.96 * 2.20 = 4.31
  - t: t-score * 2.20 = 4.42

What is a Confidence Interval (CI)?
- The purpose of a confidence interval is to validate a point estimate; it tells us how far off our estimate is likely to be
- A confidence interval specifies a range of values within which the unknown population parameter may lie
- Common confidence levels are 90%, 95%, 99%, and 99.9%
- The width of the interval gives us some idea as to how uncertain we are about an estimate
- A very wide interval may indicate that more data should be collected before anything very definite can be inferred from the data
- This means when a sample is drawn there are ?? chances in 100 that the sample will reflect the sampling frame at large within the sampling error

Interpreting the CI
Using our average charge example:
- Mean = 82.40, SD = 15.55, SE = 2.20, ME = 4.42 (t-score)
- 95% CI = 82.40 +/- 4.42, or $77.98 to $86.82
- False statement: I am 95% confident that the true average charge for this code is somewhere between $77.98 and $86.82
- True statement: if many samples were drawn and an interval were calculated from each in the same way, about 95% of those intervals would contain the population mean; $77.98 to $86.82 is one such interval
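The margin-of-error figures above can be reproduced as follows. This is a sketch under stated assumptions: the z value is computed with the standard library's `NormalDist`, while the t value of 2.0096 (two-sided 95%, 49 degrees of freedom) is not given in the slides and is supplied here as an assumed table value because the standard library has no t distribution. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(critical_value: float, stdev: float, n: int) -> float:
    """Critical value (z or t) times the standard error of the mean."""
    return critical_value * stdev / math.sqrt(n)


mean, stdev, n = 82.40, 15.55, 50

z_95 = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-sided 95% interval
t_95_df49 = 2.0096                  # assumed table value for 49 degrees of freedom

me_z = margin_of_error(z_95, stdev, n)
me_t = margin_of_error(t_95_df49, stdev, n)
ci_t = (mean - me_t, mean + me_t)

print(f"z-based margin of error: {me_z:.2f}")
print(f"t-based margin of error: {me_t:.2f}")
print(f"95% CI (t): ${ci_t[0]:.2f} to ${ci_t[1]:.2f}")
```

The t-based margin is slightly wider because the t distribution allows for the extra uncertainty of estimating the standard deviation from a sample of 50.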

Attribute 95% Confidence Interval
- In attribute example 1 ($1,809 out of $5,742 paid in error), the SE was 0.00613 (.613%)
  - The 95% half interval is 0.00613 * 1.96 = 0.012 (1.2%)
  - Plus and minus the p of 31.5% = 30.3% to 32.7%
- In attribute example 2 (6 out of 30 charts with a medical necessity error), the SE was 0.073 (7.3%)
  - The 95% half interval is .073 * 1.96 = 0.143 (14.3%)
  - Plus and minus the p of 20% = 5.7% to 34.3%
- The confidence interval range has a huge impact in inferential statistics
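Both attribute examples can be checked with the proportion formula SE = sqrt(p(1-p)/n) and a z value of 1.96, as used on the slide. The sketch below is illustrative; the function name and return layout are hypothetical.

```python
import math


def proportion_interval(p: float, n: int, z: float = 1.96):
    """Return (standard error, lower bound, upper bound) for a sample proportion."""
    se = math.sqrt(p * (1 - p) / n)
    half_interval = z * se
    return se, p - half_interval, p + half_interval


# Example 1: $1,809 of $5,742 paid in error (p = 31.5%, n = 5,742)
se1, low1, high1 = proportion_interval(0.315, 5742)
# Example 2: 6 of 30 charts with a medical necessity error (p = 20%, n = 30)
se2, low2, high2 = proportion_interval(0.20, 30)

print(f"example 1: SE {se1:.3%}, 95% CI {low1:.1%} to {high1:.1%}")
print(f"example 2: SE {se2:.1%}, 95% CI {low2:.1%} to {high2:.1%}")
```

The second interval is far wider because n is only 30, which is the practical effect of sample size on the error of an estimate.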

[Figure: 95% Confidence Intervals]

To Change the Confidence Interval
To narrow the confidence interval:
- Decrease the variability
- Lower your z/t score
- Increase the sample size
To get a better level of confidence:
- Decrease the variability
- Accept a broader CI (i.e., 80%)
- Increase the sample size
Interval bounds:
\[ \bar{x} - Z\frac{s}{\sqrt{n}}, \qquad \bar{x} + Z\frac{s}{\sqrt{n}} \]

CI Applications: Physician Productivity
- In physician productivity studies, the 95% CI gives us a range of work RVU values
- In 95% of the samples we take, the true mean for the population would fall somewhere between the lower and upper bound
[Table: lower, mean, and upper work RVUs per FTE (national) by specialty, with rows for GP, EM, PD, CV, PY, GS, FP, OS.]

CI Applications: Auditing
[Figure: Summary Report for Overpaid, with 95% confidence intervals (Anderson-Darling A-squared 0.69, N = 40).]
- If a hundred similar audits were performed, in 95 of them the actual mean damage would be somewhere between the lower and upper bounds of the 95% confidence interval
- Assume the universe is 10,000 claims
- The difference between the mean and the lower bound of the 95% CI is $28.71
- This translates to a difference in damage estimates of 107,020 rather than 135,730 (28,710)

For More Information
Frank D. Cohen
fcohen@drsmgmt.com

To Get the Toolbox
- Click on the Library tab (at top of page)
- Click on Workshop Toolboxes
- Select the Statistics toolbox
- Password is


More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I ll announce this tomorrow

More information

Measures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms

Measures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms Measures of Central Tendency: Ungrouped Data Measures of central tendency yield information about particular places or locations in a group of numbers. Common Measures of Location Mode Median Percentiles

More information

Data Distributions and Normality

Data Distributions and Normality Data Distributions and Normality Definition (Non)Parametric Parametric statistics assume that data come from a normal distribution, and make inferences about parameters of that distribution. These statistical

More information

Moments and Measures of Skewness and Kurtosis

Moments and Measures of Skewness and Kurtosis Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values

More information

Numerical summary of data

Numerical summary of data Numerical summary of data Introduction to Statistics Measures of location: mode, median, mean, Measures of spread: range, interquartile range, standard deviation, Measures of form: skewness, kurtosis,

More information

Descriptive Statistics

Descriptive Statistics Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

Exploring Data and Graphics

Exploring Data and Graphics Exploring Data and Graphics Rick White Department of Statistics, UBC Graduate Pathways to Success Graduate & Postdoctoral Studies November 13, 2013 Outline Summarizing Data Types of Data Visualizing Data

More information

Describing Data: One Quantitative Variable

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive

More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

NOTES: Chapter 4 Describing Data

NOTES: Chapter 4 Describing Data NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

Terms & Characteristics

Terms & Characteristics NORMAL CURVE Knowledge that a variable is distributed normally can be helpful in drawing inferences as to how frequently certain observations are likely to occur. NORMAL CURVE A Normal distribution: Distribution

More information

David Tenenbaum GEOG 090 UNC-CH Spring 2005

David Tenenbaum GEOG 090 UNC-CH Spring 2005 Simple Descriptive Statistics Review and Examples You will likely make use of all three measures of central tendency (mode, median, and mean), as well as some key measures of dispersion (standard deviation,

More information

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s).

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). We will look the three common and useful measures of spread. The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). 1 Ameasure of the center

More information

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer

More information

Establishing a framework for statistical analysis via the Generalized Linear Model

Establishing a framework for statistical analysis via the Generalized Linear Model PSY349: Lecture 1: INTRO & CORRELATION Establishing a framework for statistical analysis via the Generalized Linear Model GLM provides a unified framework that incorporates a number of statistical methods

More information

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012 The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION

MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION 1 Day 3 Summer 2017.07.31 DISTRIBUTION Symmetry Modality 单峰, 双峰 Skewness 正偏或负偏 Kurtosis 2 3 CHAPTER 4 Measures of Central Tendency 集中趋势

More information

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2] 1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders

More information

CSC Advanced Scientific Programming, Spring Descriptive Statistics

CSC Advanced Scientific Programming, Spring Descriptive Statistics CSC 223 - Advanced Scientific Programming, Spring 2018 Descriptive Statistics Overview Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

More information

IOP 201-Q (Industrial Psychological Research) Tutorial 5

IOP 201-Q (Industrial Psychological Research) Tutorial 5 IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,

More information

Lecture Week 4 Inspecting Data: Distributions

Lecture Week 4 Inspecting Data: Distributions Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013 2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your

More information

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean Measure of Center Measures of Center The value at the center or middle of a data set 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) 1 2 Mean Notation The measure of center obtained by adding the values

More information

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017

More information

Graphical and Tabular Methods in Descriptive Statistics. Descriptive Statistics

Graphical and Tabular Methods in Descriptive Statistics. Descriptive Statistics Graphical and Tabular Methods in Descriptive Statistics MATH 3342 Section 1.2 Descriptive Statistics n Graphs and Tables n Numerical Summaries Sections 1.3 and 1.4 1 Why graph data? n The amount of data

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

Percentiles, STATA, Box Plots, Standardizing, and Other Transformations

Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Lecture 3 Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go

More information

Data Analysis. BCF106 Fundamentals of Cost Analysis

Data Analysis. BCF106 Fundamentals of Cost Analysis Data Analysis BCF106 Fundamentals of Cost Analysis June 009 Chapter 5 Data Analysis 5.0 Introduction... 3 5.1 Terminology... 3 5. Measures of Central Tendency... 5 5.3 Measures of Dispersion... 7 5.4 Frequency

More information

Table of Contents. New to the Second Edition... Chapter 1: Introduction : Social Research...

Table of Contents. New to the Second Edition... Chapter 1: Introduction : Social Research... iii Table of Contents Preface... xiii Purpose... xiii Outline of Chapters... xiv New to the Second Edition... xvii Acknowledgements... xviii Chapter 1: Introduction... 1 1.1: Social Research... 1 Introduction...

More information

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data.

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data. -3: Measure of Central Tendency Chapter : Descriptive Statistics The value at the center or middle of a data set. It is a tool for analyzing data. Part 1: Basic concepts of Measures of Center Ex. Data

More information

Chapter 6 Simple Correlation and

Chapter 6 Simple Correlation and Contents Chapter 1 Introduction to Statistics Meaning of Statistics... 1 Definition of Statistics... 2 Importance and Scope of Statistics... 2 Application of Statistics... 3 Characteristics of Statistics...

More information

Copyright 2005 Pearson Education, Inc. Slide 6-1

Copyright 2005 Pearson Education, Inc. Slide 6-1 Copyright 2005 Pearson Education, Inc. Slide 6-1 Chapter 6 Copyright 2005 Pearson Education, Inc. Measures of Center in a Distribution 6-A The mean is what we most commonly call the average value. It is

More information

Chapter 4 Variability

Chapter 4 Variability Chapter 4 Variability PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J Gravetter and Larry B. Wallnau Chapter 4 Learning Outcomes 1 2 3 4 5

More information

NCSS Statistical Software. Reference Intervals

NCSS Statistical Software. Reference Intervals Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and

More information

Descriptive Statistics (Devore Chapter One)

Descriptive Statistics (Devore Chapter One) Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf

More information

Descriptive Statistics Bios 662

Descriptive Statistics Bios 662 Descriptive Statistics Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-19 08:51 BIOS 662 1 Descriptive Statistics Descriptive Statistics Types of variables

More information

Section 6-1 : Numerical Summaries

Section 6-1 : Numerical Summaries MAT 2377 (Winter 2012) Section 6-1 : Numerical Summaries With a random experiment comes data. In these notes, we learn techniques to describe the data. Data : We will denote the n observations of the random

More information

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line. Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,

More information

Applications of Data Dispersions

Applications of Data Dispersions 1 Applications of Data Dispersions Key Definitions Standard Deviation: The standard deviation shows how far away each value is from the mean on average. Z-Scores: The distance between the mean and a given

More information

Quantitative Methods for Economics, Finance and Management (A86050 F86050)

Quantitative Methods for Economics, Finance and Management (A86050 F86050) Quantitative Methods for Economics, Finance and Management (A86050 F86050) Matteo Manera matteo.manera@unimib.it Marzio Galeotti marzio.galeotti@unimi.it 1 This material is taken and adapted from Guy Judge

More information

Math 227 Elementary Statistics. Bluman 5 th edition

Math 227 Elementary Statistics. Bluman 5 th edition Math 227 Elementary Statistics Bluman 5 th edition CHAPTER 6 The Normal Distribution 2 Objectives Identify distributions as symmetrical or skewed. Identify the properties of the normal distribution. Find

More information

KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI

KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI 88 P a g e B S ( B B A ) S y l l a b u s KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI Course Title : STATISTICS Course Number : BA(BS) 532 Credit Hours : 03 Course 1. Statistical

More information

NOTES ON THE BANK OF ENGLAND OPTION IMPLIED PROBABILITY DENSITY FUNCTIONS

NOTES ON THE BANK OF ENGLAND OPTION IMPLIED PROBABILITY DENSITY FUNCTIONS 1 NOTES ON THE BANK OF ENGLAND OPTION IMPLIED PROBABILITY DENSITY FUNCTIONS Options are contracts used to insure against or speculate/take a view on uncertainty about the future prices of a wide range

More information

Statistics I Chapter 2: Analysis of univariate data

Statistics I Chapter 2: Analysis of univariate data Statistics I Chapter 2: Analysis of univariate data Numerical summary Central tendency Location Spread Form mean quartiles range coeff. asymmetry median percentiles interquartile range coeff. kurtosis

More information

Monte Carlo Simulation (Random Number Generation)

Monte Carlo Simulation (Random Number Generation) Monte Carlo Simulation (Random Number Generation) Revised: 10/11/2017 Summary... 1 Data Input... 1 Analysis Options... 6 Summary Statistics... 6 Box-and-Whisker Plots... 7 Percentiles... 9 Quantile Plots...

More information

CABARRUS COUNTY 2008 APPRAISAL MANUAL

CABARRUS COUNTY 2008 APPRAISAL MANUAL STATISTICS AND THE APPRAISAL PROCESS PREFACE Like many of the technical aspects of appraising, such as income valuation, you have to work with and use statistics before you can really begin to understand

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering

More information

STAT 157 HW1 Solutions

STAT 157 HW1 Solutions STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill

More information

Lectures delivered by Prof.K.K.Achary, YRC

Lectures delivered by Prof.K.K.Achary, YRC Lectures delivered by Prof.K.K.Achary, YRC Given a data set, we say that it is symmetric about a central value if the observations are distributed symmetrically about the central value. In symmetrically

More information

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar Today s Content Wrap-up from yesterday Frequency Distributions The Mean, Median and Mode Levels of Measurement and Measures of Central

More information

Data screening, transformations: MRC05

Data screening, transformations: MRC05 Dale Berger Data screening, transformations: MRC05 This is a demonstration of data screening and transformations for a regression analysis. Our interest is in predicting current salary from education level

More information

AP Statistics Chapter 6 - Random Variables

AP Statistics Chapter 6 - Random Variables AP Statistics Chapter 6 - Random 6.1 Discrete and Continuous Random Objective: Recognize and define discrete random variables, and construct a probability distribution table and a probability histogram

More information

Sampling and Descriptive Statistics

Sampling and Descriptive Statistics Sampling and Descriptive Statistics Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University Reference: 1. W. Navidi. Statistics for Engineering and Scientists.

More information

Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12)

Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics: - Measures of centrality (Mean, median, mode, trimmed mean) - Measures of spread (MAD, Standard deviation, variance) -

More information

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment MBEJ 1023 Planning Analytical Methods Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment Contents What is statistics? Population and Sample Descriptive Statistics Inferential

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 4 Random Variables & Probability Distributions Content 1. Two Types of Random Variables 2. Probability Distributions for Discrete Random Variables 3. The Binomial

More information

The normal distribution is a theoretical model derived mathematically and not empirically.

The normal distribution is a theoretical model derived mathematically and not empirically. Sociology 541 The Normal Distribution Probability and An Introduction to Inferential Statistics Normal Approximation The normal distribution is a theoretical model derived mathematically and not empirically.

More information

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD MAJOR POINTS Sampling distribution of the mean revisited Testing hypotheses: sigma known An example Testing hypotheses:

More information

2 DESCRIPTIVE STATISTICS

2 DESCRIPTIVE STATISTICS Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled

More information

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows

More information

Putting Things Together Part 2

Putting Things Together Part 2 Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in

More information

starting on 5/1/1953 up until 2/1/2017.

starting on 5/1/1953 up until 2/1/2017. An Actuary s Guide to Financial Applications: Examples with EViews By William Bourgeois An actuary is a business professional who uses statistics to determine and analyze risks for companies. In this guide,

More information

Lecture Data Science

Lecture Data Science Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics Foundations JProf. Dr. Claudia Wagner Learning Goals How to describe sample data? What is mode/median/mean?

More information