Maths/stats support 12: Spearman's rank correlation


Using Spearman's rank correlation

Use a Spearman's rank correlation test when you've got two variables and you want to see if they are correlated. Your calculated Spearman's rank correlation coefficient (r_s) lets you test to see if you've got a correlation between two variables, i.e. if a change in one variable is accompanied by a change in the other variable. It also measures the strength of any correlation.

You need at least eight pairs of data in matched pairs. Being in matched pairs means that one piece of data is associated with only one other piece of data. For example, if you were measuring speed of hair growth at different ages, each hair growth measurement would belong with only one age measurement. If you mixed the matched pairs up the data would be meaningless.

Types of correlation

Figure 1 A positive correlation. (Axes: beer consumption/barrels and temperature/°C.)

Positive correlations

Figure 1 shows the (fictional) relationship between amount of beer drunk and temperature. It seems that the hotter it gets, the more beer is drunk. As one variable goes up so does the other. This is an example of a perfect positive correlation. If you calculated r_s for these data you would get a value of +1 (plus one). Rarely will you get exactly +1 but strongly positively correlated variables will usually give you a value approaching +1.

Figure 2 A negative correlation. (Axes: happiness quotient and number of hours spent doing statistics.)

Negative correlations

Figure 2 shows students' happiness quotient plotted against the number of hours spent doing statistics. Extraordinary as it may seem, it would appear that the longer you spend doing statistics the more unhappy you become. As one variable (time spent doing stats) goes up, the other (happiness quotient) goes down. This is a perfect negative correlation. If you calculated r_s for these data you would get a value of -1 (minus one). Rarely will you get exactly -1. Strongly negatively correlated variables will usually give you a value approaching -1.

This sheet may have been altered from the original.
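The two extreme cases can be checked numerically. The sketch below is only an illustration: the ranks are made up, and the function name spearman_rs is hypothetical. It uses the rank-difference formula that this sheet works through step by step in the worked example, and assumes the data have already been ranked with no ties.

```python
def spearman_rs(ranks_x, ranks_y):
    """Rank-difference formula: r_s = 1 - 6 * sum(D^2) / (n * (n^2 - 1)).

    Assumes both lists already hold ranks for the same matched pairs.
    """
    n = len(ranks_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for eight matched pairs (not data from this sheet).
ranks = [1, 2, 3, 4, 5, 6, 7, 8]

# Both variables rise together: every rank difference is zero.
perfect_positive = spearman_rs(ranks, ranks)

# One variable rises while the other falls: the ranks are exactly reversed.
perfect_negative = spearman_rs(ranks, ranks[::-1])

print(perfect_positive, perfect_negative)
```

Identical rankings give +1 and exactly reversed rankings give -1; real data will normally fall somewhere in between.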

No correlation

Figure 3 No correlation. (Axes: monthly sales of Bjork's Greatest Hits and monthly sales of Hairy Otter and the Half-Blood Herring.)

Figure 4 Spearman's rank correlation cannot be used on this type of data. (Axes: density of limpets and height up the shore.)

Figure 3 purports to show the monthly sales of Hairy Otter and the Half-Blood Herring in Reykjavik plotted against monthly Icelandic sales of Bjork's Greatest Hits. There does not seem to be any kind of relationship here. Changing one variable has no predictable effect on the other. There is no correlation. If you calculated r_s for these data you would get a value close to 0 (zero). Rarely will you get exactly 0. Uncorrelated variables will usually give you a value approaching 0.

You should be aware that occasionally you might plot sales of Hairy Otter and the Half-Blood Herring against sales of Bjork's Greatest Hits and come up with a straight line. This does not necessarily imply a causal relationship. (This is also true of the other two examples.) As with all statistics, the interpretation of the meaning of the data is up to you. That's the bit that most people find interesting.

Remember also that you might have a relationship between two variables that does not take the form of a straight line. The example in Figure 4 is one such relationship. This type of correlation is beyond the scope of Spearman's rank correlation.

Calculating Spearman's rank correlation coefficient: a worked example

Using a fishing net, a spring balance and a metre rule, a student of above average strength collected the data in Table 1.

Table 1 Length and mass of blue whales caught at Sunny Bay, Pembrokeshire.

Length/m    Mass/tonnes
1.0         1.5
1.5         2.3
2.3         3.6
3.4         7.1
4.4         2.6
4.6         13.3
6.2         7.5
7.0         11.2
7.0         12.1
8.7         11.0
10.5        12.0
12.0        18.2

Figure 5 Mass of different length whales. (Axes: mass/tonnes, 0 to 20, against length/m, 0 to 14.)

Plot the data

Do blue whales get heavier as they get longer? It looks as if they do. The first step is to plot out the data to see if you have a straight-line graph. This has been done in Figure 5. There does appear to be a positive correlation between the length and mass of whales so you can proceed to the next stage, which is to calculate r_s.

Calculating r_s

1: Rank the data

Each set of data is ranked (put in order from lowest to highest). The lowest value is given the lowest rank value, 1. The next lowest value is given the next rank, 2, and so on. When two values are tied, for example the two lengths with a value of 7, they each get the average of the next two available ranks, 8 and 9 in this case, i.e. the same rank, 8.5. If you have more than two tied items of data, do the same thing: add up the appropriate ranks and share them equally between the tied items of data. Continue until you reach the end of the list (in our example the final rank is 12). Set out the data as in Table 2.

Table 2 Ranking the data and calculating differences.

Length/m    Rank length    Mass/tonnes    Rank mass    Difference/D    D^2
1.0         1              1.5            1            0               0
1.5         2              2.3            2            0               0
2.3         3              3.6            4            -1              1
3.4         4              7.1            5            -1              1
4.4         5              2.6            3            2               4
4.6         6              13.3           11           -5              25
6.2         7              7.5            6            1               1
7.0         8.5            11.2           8            0.5             0.25
7.0         8.5            12.1           10           -1.5            2.25
8.7         10             11.0           7            3               9
10.5        11             12.0           9            2               4
12.0        12             18.2           12           0               0
Total                                                  0               47.5

From now on we work with the ranks, not the original data. This means you can use this test on interval or ordinal data (see Maths/stats support 9 What test should I use? (M9.09S)).

2: Work out the differences and square them

Subtract the mass rank from the length rank, giving the difference in ranks. This has been done in the fifth column (D). Then square every difference (D^2). You can see this has been done in the sixth column of the table above.
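Steps 1 and 2 can be sketched in a few lines of code. This is an illustration rather than part of the original sheet: the helper name average_ranks is hypothetical, and the data are the twelve length/mass pairs from Table 1. Tied values share the mean of the ranks they would otherwise occupy, as described in step 1.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


# Data from Table 1 (length in metres, mass in tonnes), kept as matched pairs.
lengths = [1.0, 1.5, 2.3, 3.4, 4.4, 4.6, 6.2, 7.0, 7.0, 8.7, 10.5, 12.0]
masses = [1.5, 2.3, 3.6, 7.1, 2.6, 13.3, 7.5, 11.2, 12.1, 11.0, 12.0, 18.2]

rank_length = average_ranks(lengths)
rank_mass = average_ranks(masses)

# D = rank of length minus rank of mass, then square each difference.
differences = [rl - rm for rl, rm in zip(rank_length, rank_mass)]
sum_d2 = sum(d ** 2 for d in differences)

print(rank_length)
print(rank_mass)
print(sum(differences), sum_d2)
```

A useful check on the ranking is that the signed differences add up to zero, which they do here; the squared differences add up to 47.5.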

3: Calculate r_s

Add up the D^2 column (ΣD^2). Here ΣD^2 = 47.5. Calculate r_s from this formula:

$$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$

where
r_s = Spearman's rank correlation coefficient
Σ = sum of
D = difference
n = number of pairs of data

For the example above, we get:

$$r_s = 1 - \frac{6(47.5)}{12(12^2 - 1)} = 1 - 0.166 = 0.834$$

This is an answer close to 1; hence we suspect that length and mass are positively correlated.

Even more exciting, we can use r_s as a hypothesis-testing statistic. Every time you use this test your null hypothesis will be: There is no correlation between the two sets of data. To know whether to accept or reject this statement you must compare your calculated value of r_s with the critical value obtained from the appropriate table of critical values (Table 3). Reject your null hypothesis if your calculated value (ignoring any minus sign) is bigger than or equal to the critical value at the chosen significance level for your number of pairs of data (n).

We have 12 pairs of length/mass data so our critical value lies along the n = 12 row. Next we decide at what significance level we want to accept or reject the null hypothesis. Usually with biological data a significance level of 5% (p = 0.05) is deemed acceptable (although there is no law about this so you can pick a different one if you have a good reason to; however you must know what it means). In our case the critical value at 5% significance is 0.591. Our value is bigger than this so we reject the null hypothesis and say that there is a positive correlation (p < 0.05).

The critical value for a 1% significance level is 0.777; our calculated value is also greater than this so we can reject the null hypothesis at this significance level too and say that there is positive correlation at the 1% significance level. In other words, if there were really no correlation between blue whale length and mass, a value of r_s this large would turn up by chance less than one time in a hundred.
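The arithmetic in step 3 and the comparison with the critical values can be reproduced as follows. This is a minimal sketch using only numbers quoted in this sheet (n = 12, a sum of squared differences of 47.5, and the critical values 0.591 and 0.777); the variable names are illustrative. Taking the absolute value of r_s before comparing is an assumption that lets the same check handle negative correlations.

```python
n = 12          # number of matched pairs
sum_d2 = 47.5   # total of the D^2 column in Table 2

# Rank-difference formula from step 3.
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Critical values for n = 12, as quoted in the text.
critical_5_percent = 0.591
critical_1_percent = 0.777

# Reject the null hypothesis when |r_s| is at least the critical value.
reject_at_5 = abs(r_s) >= critical_5_percent
reject_at_1 = abs(r_s) >= critical_1_percent

print(round(r_s, 3), reject_at_5, reject_at_1)
```

Note that this short formula is exact only when there are no tied ranks. Library routines that compute Spearman's coefficient as a correlation of the ranks may therefore give a slightly different value for these data, which contain one tie.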

Table 3 Critical values of Spearman's rank correlation coefficient.

                     Significance level
Number of pairs/n    10%      5%       2%       1%
5                    0.900    1.000    1.000    -
6                    0.829    0.886    0.943    1.000
7                    0.714    0.786    0.893    0.929
8                    0.643    0.738    0.833    0.881
9                    0.600    0.683    0.783    0.833
10                   0.564    0.648    0.746    0.794
12                   0.506    0.591    0.712    0.777
14                   0.456    0.544    0.645    0.715
16                   0.425    0.506    0.601    0.665
18                   0.399    0.475    0.564    0.625
20                   0.377    0.450    0.534    0.591
22                   0.359    0.428    0.508    0.562
24                   0.343    0.409    0.485    0.537
26                   0.329    0.392    0.465    0.515
28                   0.317    0.377    0.448    0.496
30                   0.306    0.364    0.432    0.478
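For repeated use, the table can be held as a lookup. The sketch below is an illustration only: the names CRITICAL_VALUES and is_significant are hypothetical, the values are copied from Table 3, and the n = 5 row is left out because it does not carry a value for every significance level. Comparing the absolute value of r_s is an assumption, made so that negative correlations are tested the same way as positive ones.

```python
# Critical values from Table 3: {n: {significance level in %: critical value}}.
# The n = 5 row is omitted because it is incomplete in the table.
CRITICAL_VALUES = {
    6: {10: 0.829, 5: 0.886, 2: 0.943, 1: 1.000},
    7: {10: 0.714, 5: 0.786, 2: 0.893, 1: 0.929},
    8: {10: 0.643, 5: 0.738, 2: 0.833, 1: 0.881},
    9: {10: 0.600, 5: 0.683, 2: 0.783, 1: 0.833},
    10: {10: 0.564, 5: 0.648, 2: 0.746, 1: 0.794},
    12: {10: 0.506, 5: 0.591, 2: 0.712, 1: 0.777},
    14: {10: 0.456, 5: 0.544, 2: 0.645, 1: 0.715},
    16: {10: 0.425, 5: 0.506, 2: 0.601, 1: 0.665},
    18: {10: 0.399, 5: 0.475, 2: 0.564, 1: 0.625},
    20: {10: 0.377, 5: 0.450, 2: 0.534, 1: 0.591},
    22: {10: 0.359, 5: 0.428, 2: 0.508, 1: 0.562},
    24: {10: 0.343, 5: 0.409, 2: 0.485, 1: 0.537},
    26: {10: 0.329, 5: 0.392, 2: 0.465, 1: 0.515},
    28: {10: 0.317, 5: 0.377, 2: 0.448, 1: 0.496},
    30: {10: 0.306, 5: 0.364, 2: 0.432, 1: 0.478},
}


def is_significant(r_s, n, level_percent=5):
    """Return True when |r_s| is at least the tabulated critical value."""
    if n not in CRITICAL_VALUES:
        raise ValueError(f"no row for n = {n} in the table")
    if level_percent not in CRITICAL_VALUES[n]:
        raise ValueError(f"no column for {level_percent}% in the table")
    return abs(r_s) >= CRITICAL_VALUES[n][level_percent]


# Worked example from this sheet: r_s = 0.834 with 12 pairs of data.
print(is_significant(0.834, 12, 5), is_significant(0.834, 12, 1))
```

The function raises an error for values of n that the table does not list, rather than guessing between rows.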