Fundamentals of Statistics


CHAPTER 4 Fundamentals of Statistics

Expected Outcomes
Know the difference between a variable and an attribute.
Perform mathematical calculations to the correct number of significant figures.
Construct histograms for simple and complex data.
Calculate and effectively use the different measures of central tendency, dispersion, and how related

Introduction: Definition of Statistics
1. A collection of quantitative data pertaining to a subject or group. Examples are blood pressure statistics, etc.
2. The science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.

Collection of Data
Types of Data: Attribute
Discrete data: data values can only be integers. Also called counted data or attribute data. Examples include:
How many of the products are defective?
How often are the machines repaired?
How many people are absent each day?

Precision
Precision is a description of a level of measurement that yields consistent results when repeated. It is associated with the concept of "random error", a form of observational error that leads to measurable values being inconsistent when repeated.

Accuracy
The more common definition is that accuracy is a level of measurement with no inherent limitation. The ISO definition is that accuracy is a level of measurement that yields true (no systematic errors) and consistent (no random errors) results.

Describing Data
Frequency distribution: three types (categorical, ungrouped, and grouped).
Categorical frequency distributions: used for data that can be placed in specific categories, such as nominal- or ordinal-level data.

Categorical

Ungrouped frequency distributions
Can be used for data that can be enumerated and when the range of values in the data set is not large.

Grouped frequency distributions
Can be used when the range of values in the data set is very large. The data must be grouped into classes that are more than one unit in width.

Frequency Distributions

Number          Frequency   Relative    Cumulative   Relative cumulative
nonconforming               frequency   frequency    frequency
0               15          0.29        15           0.29
1               20          0.38        35           0.67
2                8          0.15        43           0.83
3                5          0.10        48           0.92
4                3          0.06        51           0.98
5                1          0.02        52           1.00
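The derived columns of the table can be recomputed from the frequency column alone. The following is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
from itertools import accumulate

# Frequencies from the table above: position = number nonconforming (0..5).
frequencies = [15, 20, 8, 5, 3, 1]

n = sum(frequencies)                                # total number of observations
relative = [f / n for f in frequencies]             # relative frequency
cumulative = list(accumulate(frequencies))          # cumulative frequency
relative_cumulative = [c / n for c in cumulative]   # relative cumulative frequency

for value, row in enumerate(zip(frequencies, relative, cumulative, relative_cumulative)):
    f, rf, cf, rcf = row
    print(f"{value:>2} {f:>4} {rf:>6.2f} {cf:>4} {rcf:>6.2f}")
```

Rounded to two decimals, the printed rows should agree with the table.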

Frequency Histogram (figure: frequency plotted against number nonconforming)

The Histogram
The histogram is the most important graphical tool for exploring the shape of data distributions.

Constructing a Histogram
Step 1: Find the range of the distribution (largest value - smallest value).
Step 2: Choose the number of classes, 5 to 20.
Step 3: Determine the width of the classes, to one decimal place more than the data: class width = range / number of classes, with number of classes ≈ √n.
Step 4: Determine the class boundaries.
Step 5: Draw the frequency histogram.
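Steps 1 to 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the slides' own procedure: the function name `build_classes` is hypothetical, the number of classes defaults to √n kept within the 5 to 20 range, and the "one decimal place more than the data" rounding of Step 3 is not applied.

```python
import math

def build_classes(data, num_classes=None):
    """Return (boundaries, counts) for equal-width classes (hypothetical helper)."""
    n = len(data)
    low, high = min(data), max(data)
    data_range = high - low                          # Step 1: range
    if data_range == 0:
        raise ValueError("all values are identical; no classes can be formed")
    if num_classes is None:                          # Step 2: assumed sqrt(n) rule, kept in 5..20
        num_classes = min(20, max(5, round(math.sqrt(n))))
    width = data_range / num_classes                 # Step 3: class width (no extra rounding)
    boundaries = [low + k * width for k in range(num_classes + 1)]  # Step 4
    counts = [0] * num_classes
    for x in data:
        # The largest value would fall just past the last class, so clamp it.
        k = min(int((x - low) / width), num_classes - 1)
        counts[k] += 1
    return boundaries, counts

boundaries, counts = build_classes(list(range(1, 26)))
print(boundaries)
print(counts)
```

Step 5 (drawing) is left to a plotting tool; the counts are the bar heights.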

Constructing a Histogram
Number of groups or cells

Number of observations    Number of cells
Fewer than 100            5 to 9
Between 100 and 500       8 to 17
Greater than 500          15 to 20

Other Types of Frequency Distribution Graphs
Bar Graph
Polygon of Data
Cumulative Frequency Distribution or Ogive

Bar Graph and Polygon of Data

Cumulative Frequency

Characteristics of Frequency Distribution Graphs

Analysis of Histograms
Figure 4-7 Differences due to location, spread, and shape

Measures of Central Tendency
The three measures in common use are the:
Average
Median
Mode

Average
There are three different techniques available for calculating the average:
Ungrouped data
Grouped data
Weighted average

Average-Ungrouped Data

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Average-Grouped Data

$$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{n} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_h X_h}{f_1 + f_2 + \dots + f_h}$$

h = number of cells, X_i = midpoint, f_i = frequency

Average-Weighted Average
Used when a number of averages are combined with different frequencies.

$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
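The three averaging techniques can be compared side by side in code. This is a minimal sketch; the function names are illustrative, the grouped example reuses the number-nonconforming table with each count treated as a cell midpoint, and the weighted example uses made-up numbers.

```python
def mean_ungrouped(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)

def mean_grouped(midpoints, frequencies):
    """Sum of f_i * X_i over the cells, divided by n = sum of f_i."""
    n = sum(frequencies)
    return sum(f * x for f, x in zip(frequencies, midpoints)) / n

def mean_weighted(averages, weights):
    """Sum of w_i * X_i divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, averages)) / sum(weights)

print(mean_ungrouped([2, 4, 6]))                                   # ungrouped data
print(mean_grouped([0, 1, 2, 3, 4, 5], [15, 20, 8, 5, 3, 1]))      # grouped data
print(mean_weighted([10, 20], [1, 3]))                             # hypothetical averages and weights
```

The grouped and weighted forms are the same calculation; only the interpretation of the multipliers (frequencies versus weights) differs.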

Median-Grouped Data

$$Md = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i$$

L_m = lower boundary of the cell with the median
n = total number of observations
cf_m = cumulative frequency of all cells below L_m
f_m = frequency of the median cell
i = cell interval
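A short sketch of the grouped-median calculation follows. The function name and the example cells are hypothetical; equal cell intervals are assumed.

```python
def median_grouped(lower_boundaries, frequencies, interval):
    """Grouped median: L_m + ((n/2 - cf_m) / f_m) * i, scanning cells in order."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cumulative = 0                       # cf_m: observations in cells below the current one
    for lower, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            # This cell contains the median; interpolate within it.
            return lower + (half - cumulative) / f * interval
        cumulative += f

# Hypothetical cells: 10-20, 20-30, 30-40, 40-50 with an interval of 10.
print(median_grouped([10, 20, 30, 40], [5, 10, 15, 10], 10))
```

In this example n = 40, so the median lies in the third cell (cumulative frequency 15 below it, frequency 15 in it), giving 30 + (20 - 15)/15 × 10.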

Mode
The mode is the value that occurs with the greatest frequency. It is possible to have no mode in a series of numbers or to have more than one mode.
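The three cases (no mode, one mode, several modes) can be illustrated with a small helper. The function name is hypothetical, and the convention that a series in which every value occurs exactly once has no mode is an assumption; other conventions exist.

```python
from collections import Counter

def modes(values):
    """Return the list of modes; empty when every value occurs only once."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:     # assumed convention: all values unique means no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 3]))          # no mode
print(modes([1, 2, 2, 3]))       # one mode
print(modes([1, 1, 2, 2, 3]))    # two modes
```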

Relationship Among the Measures of Central Tendency
Figure 5-9 Relationship among average, median and mode

Measures of Dispersion
Range
Standard Deviation
Variance

Measures of Dispersion-Range
The range is the simplest and easiest to calculate of the measures of dispersion.

$$R = X_h - X_l$$

that is, the largest value minus the smallest value in the data set.

Measures of Dispersion-Standard Deviation
Sample Standard Deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 / n}{n - 1}}$$

Standard Deviation
Ungrouped Technique

$$S = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)}}$$
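The range and the sample standard deviation can be checked numerically. The sketch below computes the standard deviation both from the definition and from the ungrouped (computational) form, which should agree up to floating-point rounding. The function names and the data are illustrative only.

```python
import math

def value_range(values):
    """R = largest value - smallest value."""
    return max(values) - min(values)

def std_definition(values):
    """Square root of the sum of squared deviations from the mean, over n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def std_computational(values):
    """Ungrouped technique: sqrt((n * sum(X^2) - (sum X)^2) / (n * (n - 1)))."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return math.sqrt((n * total_sq - total ** 2) / (n * (n - 1)))

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations
print(value_range(data))
print(std_definition(data), std_computational(data))
```

The computational form avoids a separate pass to find the mean, which is why it was popular for hand calculation; with floating-point arithmetic and large values the definitional form is usually the numerically safer choice.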

Standard Deviation
Grouped Technique

$$s = \sqrt{\frac{n \sum_{i=1}^{h} f_i X_i^2 - \left( \sum_{i=1}^{h} f_i X_i \right)^2}{n(n - 1)}}$$
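For grouped data the same computational form applies with each midpoint weighted by its frequency. This is a minimal sketch with a hypothetical function name; the example reuses the number-nonconforming frequencies, treating the counts 0 to 5 as cell midpoints.

```python
import math

def std_grouped(midpoints, frequencies):
    """Grouped technique: sqrt((n * sum(f*X^2) - (sum(f*X))^2) / (n * (n - 1)))."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    return math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(std_grouped([0, 1, 2, 3, 4, 5], [15, 20, 8, 5, 3, 1]))
```

Here n = 52, the sum of f·X is 68 and the sum of f·X² is 170, so the value under the root is (52 × 170 - 68²) / (52 × 51).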

Relationship Between the Measures of Dispersion
As n increases, the accuracy of R decreases.
Use R when there is a small amount of data or the data is too scattered.
If n > 10, use the standard deviation.
A smaller standard deviation means better quality.

Other Measures
There are three other measures that are frequently used to analyze a collection of data:
Skewness
Kurtosis
Coefficient of Variation

Skewness
Skewness is the lack of symmetry of the data. For grouped data:

$$a_3 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^3 / n}{s^3}$$

Skewness

Kurtosis
Kurtosis provides information regarding the shape of the population distribution (the peakedness or heaviness of the tails of a distribution). For grouped data:

$$a_4 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^4 / n}{s^4}$$
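Both shape measures can be computed from grouped data in one pass over the cells. This is a sketch under stated assumptions: the function name is hypothetical, and s is taken to be the sample standard deviation (n - 1 divisor) used elsewhere in these slides.

```python
import math

def shape_measures(midpoints, frequencies):
    """Return (a3, a4): grouped-data skewness and kurtosis."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    # Assumption: s is the sample standard deviation with an n - 1 divisor.
    s = math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1))
    third = sum(f * (x - mean) ** 3 for f, x in zip(frequencies, midpoints)) / n
    fourth = sum(f * (x - mean) ** 4 for f, x in zip(frequencies, midpoints)) / n
    return third / s ** 3, fourth / s ** 4

# Number-nonconforming frequencies: a long right tail, so a3 should be positive.
a3, a4 = shape_measures([0, 1, 2, 3, 4, 5], [15, 20, 8, 5, 3, 1])
print(a3, a4)
```

A symmetric set of cells gives a3 = 0; a positive a3 indicates a tail to the right and a negative a3 a tail to the left.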

Kurtosis
Figure 5-12 Leptokurtic and Platykurtic distributions

The Normal Curve
Characteristics of the normal curve:
It is symmetrical: half the cases are to one side of the center; the other half is on the other side.
The distribution is single peaked, not bimodal or multi-modal.
Also known as the Gaussian distribution.
Mean is best measure of central tendency.

The Normal Curve
Characteristics: Most of the cases will fall in the center portion of the curve, and as values of the variable become more extreme they become less frequent, with "outliers" at the "tail" of the distribution few in number. It is one of many frequency distributions.

Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions can be transformed to standard normal distributions by the formula:

$$Z = \frac{X_i - \mu}{\sigma}$$
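As a quick worked example of the transformation, the sketch below standardizes a single value. The function name and the numbers (a mean of 100 and a standard deviation of 5) are made up for illustration.

```python
def z_score(x, mean, std_dev):
    """Z = (X - mu) / sigma: distance from the mean in standard deviation units."""
    return (x - mean) / std_dev

print(z_score(110, 100, 5))   # a value two standard deviations above the mean
print(z_score(95, 100, 5))    # a value one standard deviation below the mean
```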

Standardized Normal Distribution with μ = 0 and σ = 1

Percent of Items Included between certain values of the standard deviation
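The percentages for a normal distribution can be computed from the error function in Python's standard library. This is a minimal sketch; the function name is illustrative, and the slide's own figure is not reproduced here.

```python
import math

def percent_within(k):
    """Percent of a normal distribution lying within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {percent_within(k):.2f}%")
```

The results are approximately 68.27%, 95.45% and 99.73% for one, two and three standard deviations.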

Relationship between the Mean and Standard Deviation

Mean and Standard Deviation
Same mean but different standard deviation

Tests for Normality
Histogram
Skewness
Kurtosis

Tests for Normality
Histogram: the shape should be symmetrical.
The larger the sample size, the better the judgment of normality. A minimum sample size of 50 is recommended.

Tests for Normality
Skewness (a3) and Kurtosis (a4)
Skewness: are the data skewed to the left or to the right? (a3 = 0 for a normal distribution)
Kurtosis: are the data peaked as in the normal distribution? (a4 = 3 for a normal distribution)
The larger the sample size, the better the judgment of normality (a sample size of 100 is recommended).

Tests for Normality
Probability Plots
Order the data from the smallest to the largest.
Rank the observations (starting from 1 for the lowest observation).
Calculate the plotting position:

$$PP = \frac{100(i - 0.5)}{n}$$

where i = rank, PP = plotting position, n = sample size.

Probability Plots
Procedure, cont'd:
Order the data
Rank the observations
Calculate the plotting position
Label the data scale
Plot the points
Attempt to fit by eye a best line
Determine normality
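The first three steps of the procedure are easy to automate. The sketch below orders the data, ranks it, and computes each plotting position; the function name and data are hypothetical. As an addition not in the slides, it also converts each plotting position to a standard normal quantile with `statistics.NormalDist`, which stands in for the scale of normal probability paper: for normally distributed data, the observations plotted against these quantiles should fall close to a straight line.

```python
from statistics import NormalDist

def plotting_positions(data):
    """Return rows of (value, rank, plotting position in percent, normal quantile)."""
    ordered = sorted(data)                       # order from smallest to largest
    n = len(ordered)
    rows = []
    for rank, x in enumerate(ordered, start=1):  # rank 1 = lowest observation
        pp = 100 * (rank - 0.5) / n              # PP = 100 (i - 0.5) / n
        z = NormalDist().inv_cdf(pp / 100)       # added step: quantile of the standard normal
        rows.append((x, rank, pp, z))
    return rows

for row in plotting_positions([12, 7, 9, 15, 10]):   # hypothetical sample
    print(row)
```

Fitting the line and judging normality remain visual steps, as in the procedure above.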

Probability Plots