Averages and Variability. Aplia (week 3 Measures of Central Tendency) Measures of central tendency (averages)


Chapter 4 Averages and Variability
Aplia (week 3: Measures of Central Tendency)
Chapter 5 (omit 5.2, 5.6, 5.8, 5.9)
Aplia (week 4: Measures of Variability)

Two critical types of descriptive statistics for data from quantitative variables:
  measures of central tendency (averages)
  measures of variability

Measures of Central Tendency: Mode

most frequent score in a set of data (not the frequency of that score)

example: stress ratings from control group
  4 2 5 4 6 3 4 2 4 5 3 5 4 6 2 7 3 5 7 6
  mode = 4

Frequency of each rating, by group:

  Rating   Control freq.   Yoga freq.
  1        0               1
  2        3               5
  3        3               5
  4        5               6
  5        4               1
  6        3               1
  7        2               1

  mode = ?
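As a rough illustration of the definition above, the mode can be found in R by tabulating the scores and picking the rating with the highest frequency. This is only a sketch: `stat_mode` is a hypothetical helper name (base R's `mode()` reports the storage type, not the statistical mode), and the yoga scores are rebuilt from the frequency table rather than taken from raw data.

```r
# Control-group stress ratings, as listed on the slide
control <- c(4, 2, 5, 4, 6, 3, 4, 2, 4, 5, 3, 5, 4, 6, 2, 7, 3, 5, 7, 6)

# Yoga-group ratings rebuilt from the frequency table (ratings 1 to 7)
yoga <- rep(1:7, times = c(1, 5, 5, 6, 1, 1, 1))

# Hypothetical helper: return the score(s) with the highest frequency
stat_mode <- function(x) {
  freq <- table(x)                              # frequency of each distinct score
  as.numeric(names(freq)[freq == max(freq)])    # the score(s), not their frequency
}

stat_mode(control)   # 4, matching the slide
stat_mode(yoga)
max(table(control))  # 5: the frequency of the mode, which is not the mode itself
```

If two ratings tie for the highest frequency, this helper returns both of them.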

Measures of Central Tendency: Median

middle score in a rank-ordered list of scores
50% of scores above and 50% below the median

example: stress ratings from control group (N = 20)
  2 2 2 3 3 3 4 4 4 4 4 5 5 5 5 6 6 6 7 7
  median = ?

Frequency of each rating, by group:

  Rating   Control freq.   Yoga freq.
  1        0               1
  2        3               5
  3        3               5
  4        5               6
  5        4               1
  6        3               1
  7        2               1

  median = ?
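One way to check the median questions above is to rebuild each group's scores from the frequency table and take the middle of the rank-ordered list. This is a minimal sketch; with an even N, R's `median()` averages the two middle scores, which is the convention assumed here.

```r
# Rebuild the rank-ordered scores from the frequency table (ratings 1 to 7)
control <- rep(1:7, times = c(0, 3, 3, 5, 4, 3, 2))
yoga    <- rep(1:7, times = c(1, 5, 5, 6, 1, 1, 1))

# With N = 20 the median is the average of the 10th and 11th ordered scores
n <- length(control)
middle_two <- sort(control)[c(n / 2, n / 2 + 1)]
mean(middle_two)

# The built-in function applies the same rule
median(control)
median(yoga)
```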

Measures of Central Tendency: Mean

typical concept of average

  $\bar{X} = M = \frac{\sum X}{N}$

example: stress ratings from control group
  2 2 2 3 3 3 4 4 4 4 4 5 5 5 5 6 6 6 7 7
  $M = \frac{87}{20} = 4.35$

example: stress ratings from yoga group
  4 5 2 4 6 4 2 7 2 3 4 3 3 4 3 2 1 2 3 4
  $M = \frac{68}{20} = 3.40$
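The formula above translates directly into R: sum the scores and divide by how many there are. The sketch below spells out that arithmetic next to the built-in `mean()`; the variable names are illustrative.

```r
control <- c(2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7)
yoga    <- c(4, 5, 2, 4, 6, 4, 2, 7, 2, 3, 4, 3, 3, 4, 3, 2, 1, 2, 3, 4)

# M = (sum of X) / N, written out explicitly
sum(control)                     # 87
length(control)                  # 20
sum(control) / length(control)   # 4.35

# The built-in function gives the same values
mean(control)   # 4.35
mean(yoga)      # 3.4
```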

Measures of Central Tendency: Mean

balance point analogy (mean = fulcrum)
  scores arrayed on a long board according to score value, as in a histogram
  mean represents the balance point, or fulcrum

[Figure: scores for Group 1 and Group 2 arrayed on a 0 to 8 scale, with the fulcrum at the mean: 4.35 for Group 1 and 3.40 for Group 2]

Measures of Central Tendency: Mean

implications of balance point analogy
  sensitivity to extreme score
  example: change a 7 to a 1 in Group 1
    before: M = 4.35
    after:  $M = \frac{81}{20} = 4.05$
  mode and median do not change in this example

[Figure: Group 1 scores on a 0 to 8 scale before and after the change, with the fulcrum moving from 4.35 to 4.05]
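The effect of the changed score can be reproduced in a few lines of R. This is a sketch using the control-group ratings as Group 1; taking the mode with `which.max(table(x))` is a shortcut that assumes a single most frequent score.

```r
group1 <- c(2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7)

# Change one 7 to a 1
changed <- group1
changed[length(changed)] <- 1

# The mean shifts toward the new extreme score
mean(group1)    # 4.35
mean(changed)   # 4.05

# The median and mode stay where they were
median(group1)
median(changed)
names(which.max(table(group1)))    # most frequent score before the change
names(which.max(table(changed)))   # most frequent score after the change
```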

Measures of Central Tendency: Mean

implications of balance point analogy
  deviations from mean sum to zero
  example: first 10 scores from yoga group
    4 5 2 4 6 4 2 7 2 3
    $M = \frac{39}{10} = 3.90$

    4 - 3.9 =  0.1
    5 - 3.9 =  1.1
    2 - 3.9 = -1.9
    4 - 3.9 =  0.1
    6 - 3.9 =  2.1
    4 - 3.9 =  0.1
    2 - 3.9 = -1.9
    7 - 3.9 =  3.1
    2 - 3.9 = -1.9
    3 - 3.9 = -0.9

    $\sum (X - M) = 0$
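The same check can be run in R. Because of floating-point arithmetic, the sum may come out as a tiny number rather than exactly zero, so the sketch below compares against a small tolerance (the tolerance value is an arbitrary choice).

```r
x <- c(4, 5, 2, 4, 6, 4, 2, 7, 2, 3)   # first 10 yoga scores

M <- mean(x)     # 3.9
dev <- x - M     # signed deviations from the mean
dev

# Positive and negative deviations cancel
sum(dev[dev > 0])
sum(dev[dev < 0])
abs(sum(dev)) < 1e-12   # TRUE: zero up to rounding error
```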

Measures of Central Tendency: Comparing measures of central tendency

similar value for mean, median, and mode with symmetrical, unimodal distribution
  example: normal distribution

[Figure: normal distribution with the mean, median, and mode labeled]

Measures of Central Tendency: Comparing measures of central tendency

the three measures take on different values with skewed distributions
  example: positively skewed distribution
  illustration with response time data from congruent color naming trials

[Figure: histogram titled "Congruent with synaesthesia case"; x-axis: Response Time (524.5 to 1224.5), y-axis: Frequency (0 to 12)]

Measures of Central Tendency: Data from a positively skewed distribution

  582 589 592 596 597 601 614 624 631 633
  636 638 647 651 652 653 655 666 670 673
  680 681 687 692 704 719 722 726 733 735
  739 741 745 746 748 760 763 768 782 789
  795 811 822 829 883 928 938

median? mean?
  median = 692
  mean = 708

[Figure: histogram of congruent response times; x-axis: Response Time (524.5 to 1224.5), y-axis: Frequency (0 to 12)]

Measures of Central Tendency: Data from incongruent condition (stronger skew)

  579 591 594 595 600 613 613 652 662 668
  674 677 679 682 690 691 707 726 743 753
  755 756 758 761 769 778 787 801 801 805
  814 835 853 859 860 867 872 893 893 897
  898 951 973 1038 1156 1234

median? mean?
  median = 759.5
  mean = 779.4

[Figure: histogram of incongruent response times; x-axis: Response Time (499.5 to 1299.5), y-axis: Frequency (0 to 12)]
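The medians and means for the two response-time data sets can be recomputed in R. The sketch below also reports the gap between mean and median; treating that gap as a rough indicator of positive skew is an informal reading, not a formal skewness statistic, and `center_summary` is a hypothetical helper name.

```r
congruent <- c(582, 589, 592, 596, 597, 601, 614, 624, 631, 633,
               636, 638, 647, 651, 652, 653, 655, 666, 670, 673,
               680, 681, 687, 692, 704, 719, 722, 726, 733, 735,
               739, 741, 745, 746, 748, 760, 763, 768, 782, 789,
               795, 811, 822, 829, 883, 928, 938)

incongruent <- c(579, 591, 594, 595, 600, 613, 613, 652, 662, 668,
                 674, 677, 679, 682, 690, 691, 707, 726, 743, 753,
                 755, 756, 758, 761, 769, 778, 787, 801, 801, 805,
                 814, 835, 853, 859, 860, 867, 872, 893, 893, 897,
                 898, 951, 973, 1038, 1156, 1234)

# Hypothetical helper: sample size, median, mean, and mean-minus-median gap
center_summary <- function(x) {
  c(n = length(x), median = median(x), mean = mean(x),
    gap = mean(x) - median(x))
}

round(center_summary(congruent), 1)
round(center_summary(incongruent), 1)
```

In both conditions the mean sits above the median, which is the pattern the slides describe for a positively skewed distribution: the long right tail pulls the mean upward.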

Measures of Central Tendency: Using measures of central tendency

emphasis on mean
  algebraically tractable when deriving statistical procedures
  most accurate sample-based estimate of its population value
median is the best option for rank-ordered data
mode is the only measure that can be used with nominal data

Variability: Variation among scores in a distribution

variation due to individual differences, imperfect reliability in measurement
variation provides important information in addition to central tendency

compare two sets of scores with identical means
  ratings on a 10-point scale by 5 independent judges of the aggressiveness exhibited by a child
  situation 1: 5 5 6 6 6     M = 5.6
  situation 2: 2 3 6 7 10    M = 5.6

Measures of variability are based on distances between scores

Range (highest - lowest)
  situation 1: 6 - 5 = 1
  situation 2: 10 - 2 = 8
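A short R sketch of the range calculation for the two situations. Note that R's `range()` returns the lowest and highest scores rather than their difference, so the distance is taken with `diff()`.

```r
situation1 <- c(5, 5, 6, 6, 6)
situation2 <- c(2, 3, 6, 7, 10)

# Identical means
mean(situation1)   # 5.6
mean(situation2)   # 5.6

# range() gives c(lowest, highest); diff() turns that into highest - lowest
range(situation1)
diff(range(situation1))   # 1
diff(range(situation2))   # 8
```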

Variability: Variance

average squared deviation from the mean

  population variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

  sample variance: $s^2 = \frac{\sum (X - M)^2}{N - 1}$

Variability: Variance, application to aggressiveness rating data

situation 1: 5 5 6 6 6 (M = 5.6)

  X     M     (X - M)   (X - M)^2
  5     5.6   -0.6      0.36
  5     5.6   -0.6      0.36
  6     5.6    0.4      0.16
  6     5.6    0.4      0.16
  6     5.6    0.4      0.16

  $\sum (X - M)^2 = 1.20$
  $s^2 = \frac{1.20}{4} = 0.30$

situation 2: 2 3 6 7 10 (M = 5.6)

  X     M     (X - M)   (X - M)^2
  2     5.6   -3.6      12.96
  3     5.6   -2.6       6.76
  6     5.6    0.4       0.16
  7     5.6    1.4       1.96
  10    5.6    4.4      19.36

  $\sum (X - M)^2 = 41.20$
  $s^2 = \frac{41.20}{4} = 10.30$
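The table above can be reproduced step by step in R and checked against the built-in `var()`, which uses the N - 1 denominator. A minimal sketch, with `sample_variance` as a hypothetical helper name:

```r
situation1 <- c(5, 5, 6, 6, 6)
situation2 <- c(2, 3, 6, 7, 10)

# Hypothetical helper that follows the columns of the table
sample_variance <- function(x) {
  dev <- x - mean(x)         # (X - M)
  ss  <- sum(dev^2)          # sum of (X - M)^2
  ss / (length(x) - 1)       # divide by N - 1
}

sample_variance(situation1)   # 0.3
sample_variance(situation2)   # 10.3

# var() computes the same quantity
var(situation1)
var(situation2)
```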

Variability: Standard deviation

variance is a measure in squared units
take square root to obtain a measure in original units

  $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

  $s = \sqrt{s^2} = \sqrt{\frac{\sum (X - M)^2}{N - 1}}$

  situation 1: $s^2 = 0.30$, $s = \sqrt{0.30} = 0.55$
  situation 2: $s^2 = 10.30$, $s = \sqrt{10.30} = 3.21$

rough concept of standard deviation: average distance of scores from the mean
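In R, `sd()` returns the sample standard deviation (N - 1 denominator). Base R has no separate function for the population formula, so the sketch below writes it out by hand; `population_sd` is a hypothetical helper name and applies only when the scores are treated as the entire population.

```r
situation1 <- c(5, 5, 6, 6, 6)
situation2 <- c(2, 3, 6, 7, 10)

# Sample standard deviation: square root of the sample variance
sqrt(var(situation1))
round(sd(situation1), 2)   # 0.55
round(sd(situation2), 2)   # 3.21

# Hypothetical helper for the population formula (divide by N, not N - 1)
population_sd <- function(x) {
  sqrt(sum((x - mean(x))^2) / length(x))
}

population_sd(situation1)
population_sd(situation2)
```

For the same scores the population formula always gives the smaller value, because it divides the same sum of squares by N rather than N - 1.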

Variability: Comprehension check

relative to a normal distribution, is variance smaller, larger, or about the same in a uniform distribution with the same mean and range?

  Note: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

Variability: Comprehension check

same question for bimodal distribution relative to a normal distribution

Variability: Comprehension check

combine two normal distributions
what is the variance of the resulting distribution relative to the variance of the original distributions?
  Case 1: identical distributions

Variability: Comprehension check

combine two normal distributions
what is the variance of the resulting distribution relative to the variance of the original distributions?
  Case 2: one set of scores larger than the other

Variability: Rationale for using N - 1 in formula for $s^2$

objective is to obtain an unbiased estimate of $\sigma^2$
$s^2$ is based on deviations from M, not µ
within a sample, an extreme score will pull M toward it, generating a deviation between M and that extreme score that is somewhat smaller than the deviation between µ and that score
the use of M instead of µ is the reason division by N would produce an underestimate of $\sigma^2$

Variability: Illustration of what happens when variance is estimated from a sample

suppose we have a normally distributed population with µ = 10 and $\sigma^2 = 9$
a random sample of 5 scores is drawn: 7, 8, 8, 9, 14
compute the sum of squared deviations from µ and from M

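Variability: Sum of squared deviations from µ and from M

Deviations from µ = 10:

  X     µ     (X - µ)^2
  7     10     9
  8     10     4
  8     10     4
  9     10     1
  14    10    16

  $\sum (X - \mu)^2 = 34$

Deviations from M = 9.2:

  X     M     (X - M)^2
  7     9.2    4.84
  8     9.2    1.44
  8     9.2    1.44
  9     9.2    0.04
  14    9.2   23.04

  $\sum (X - M)^2 = 30.8$

the sum of squared deviations based on a sample mean is never larger, and usually smaller, than the sum based on µ

demonstration with R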
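The two sums can be verified in R for the sample on this slide. The last lines check a standard identity, added here as supporting detail rather than taken from the slides: the sum of squares around µ equals the sum of squares around M plus N times the squared distance between M and µ, so the sum around M can never be the larger of the two.

```r
x  <- c(7, 8, 8, 9, 14)   # the sample of 5 scores
mu <- 10                  # population mean
M  <- mean(x)             # sample mean, 9.2

ss_mu <- sum((x - mu)^2)  # 34
ss_m  <- sum((x - M)^2)   # 30.8

ss_mu
ss_m

# Identity: SS around mu = SS around M + N * (M - mu)^2
ss_m + length(x) * (M - mu)^2
```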

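Variability: R code

    # Draw 100,000 samples of N = 5 from a normal population
    # with mean 10 and standard deviation 3 (variance 9).

    set.seed(378)
    # var() divides the sum of squared deviations by N - 1
    vars <- replicate(100000, var(rnorm(5, 10, 3)))
    head(vars)
    hist(vars, 100, xlim = c(0, 80))
    mean(vars)

    set.seed(378)
    # multiplying by (N - 1)/N = 4/5 gives the variance computed with N
    nvars <- replicate(100000, (4/5) * var(rnorm(5, 10, 3)))
    head(nvars)
    hist(nvars, 100, xlim = c(0, 80))
    mean(nvars)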

Variability: Simulation results

Variance computed with N - 1:

    > mean(vars)
    [1] 9.005985

Variance computed with N:

    > mean(nvars)
    [1] 7.204788