Percentiles

One way to look at quartile points is to say that, for a sorted list of values, Q1 is the value that has 25% of the values less than it, Q2 is the value that has 50% of the values less than it, and Q3 is the value that has 75% of the values less than it. Along with Q0, the lowest value, and Q4, the highest value, these quartile points give us a feeling for the spread of the values.

It is reasonable to ask whether there is something special about using the 25%, 50%, and 75% points rather than some other points. Why not use 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, and 90%? The answer is that the quartile points give a good feel for the spread with just a few values. The nine points just listed would give a much better idea of the distribution of values, but we would have to find and consider many more points.

Just how hard would it be to find such points? We have dealt with the case of a list of ages for the 12,608 students at the college in a particular term. If we have those ages in a sorted list, then to find the 10% point we need to find the 1260.8th item, counting from the lowest value. There is no 1260.8th item, but there is a 1260th item and a 1261st item. The 1260th item has less than 10% of the values at or before it in the sorted list: there are 1260 such values, and 1260/12608 = 9.99365%. The 1261st item has at least 10% of the values at or before it: there are 1261 such values, and 1261/12608 = 10.0016%. Since we want the first point in the sorted list that has 10% of the values less than or equal to it, we choose the 1261st item.

We get the position 1261 by taking 10% of the number of items, in this case 10% of 12608, giving 1260.8, and rounding up to the next higher whole number. It is nice to have such a rule, but, as we have seen before, in a large data list either of our two candidate items would usually be the same value. In fact, for the case at hand, items 1260 and 1261 are both instances of age 18, an age that fills positions 640 to 1831 in the sorted list.

The rule is to take the desired percentage of the total number of items and then round up to the next higher whole number. Doing that for the rest of the percentages gives:

Percent   Position expression   Position value   Rounded position   Value at the position
10%       10% of 12608          1260.8           1261               18
20%       20% of 12608          2521.6           2522               19
30%       30% of 12608          3782.4           3783               20
40%       40% of 12608          5043.2           5044               21
50%       50% of 12608          6304             6304               23
60%       60% of 12608          7564.8           7565               25
70%       70% of 12608          8825.6           8826               28
80%       80% of 12608          10086.4          10087              34
90%       90% of 12608          11347.2          11348              44

Go back to the 50% value. We know this is the median. With an even number of items we would compute the median as the average of items 6304 and 6305. The computation for the 50% point gave us exactly 6304, which we left as it is instead of rounding up to the next higher whole number, 6305. If we strictly applied the rule we would have rounded up, but, as usual with a large data list, it would not have made any difference, because both the 6304th and 6305th items are age 23.

There are two more issues: we define the 0% point as the lowest value, and we define the 100% point as the highest value. The rule works for the 0% point, since 0% of 12608 is 0, we can round that up to 1, and the 1st item in the sorted list is the lowest value. The rule does not work for the 100% point, since 100% of 12608 is 12608, and if we were to round that up to 12609 we would be in trouble: there is no 12609th value.

Getting the eleven values that we just found, from 0% to 100% in steps of 10%, is nice, and it gives us a better feel for the distribution of ages in the data list, but it does so at the cost of having many more values to examine than we had with the quartiles.

All of this merely sets the stage for what we really want, namely percentiles. To do percentiles we expand our process to go from 0% to 100% in steps of 1%, following the same rule. To find the 73rd percentile we find 73% of 12608, namely 9203.84, round that up to the next higher whole number, 9204, and look at the value in position 9204 in the sorted list; for us that is age 30. Our statement is that the 73rd percentile of the list is age 30, or, more commonly, that 30 is the 73rd percentile of the list.

You might note that this does not really mean that 73% of the values are less than age 30. In fact, only 9042 items are less than 30, and 9042/12608 = 71.72%. However, 9303 values are age 30 or less, representing 73.79% of all the values, so our 73rd percentile point is one of the 261 items that are all age 30. In common usage, we would still say that 30 is the 73rd percentile, and we would interpret that to mean that 73% of the values are less than 30. Clearly this is wrong, but it is what is generally done.

What if we want to go in the other direction? Say that we have the set of values, we are looking at one of those values, and we want to assign a percentile to it. In general, percentiles are given as whole numbers. If we are looking at age 37 in our large list, we find that age 37 occupies positions 10448 through 10586, representing percentages 10448/12608 = 82.868% to 10586/12608 = 83.963%. It seems safe to say that 37 is in the 83rd percentile.

On the other hand, what if we want to give a percentile ranking to age 25? Age 25 holds positions 7286 through 7742 in our sorted list. This corresponds to percentages of 7286/12608 = 57.789% to 7742/12608 = 61.405%. That is to say, we have age 25 items at the 58th, 59th, 60th, and even 61st percentiles. What should we say about age 25 in general? The safest thing to say is that 25 occupies all four percentiles. There is no standard on what to say beyond that.

We saw that quartiles give us a feel for the data. Expanding the number of points (originally we went from 3 to 9) gave us a better feel, but at the expense of having to consider many more points. Expanding to percentiles (now we are up to 101 points if we include the endpoints) gives us a really good feel for the data, but that is just too many points to hold in our mind at once. Percentiles, however, when given for a specific value, such as "37 is in the 83rd percentile", give a feel for where that specific value resides in the ordered list of all values.
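The position rule and the reverse lookup described above can be sketched in a few lines of Python. This is only an illustration, not the method of any particular statistics package. The function names are hypothetical, and two choices are assumptions on our part: a position that is already a whole number is kept as it is (as was done for the 50% point), and the result is clamped so that 0% maps to the first item and 100% to the last.

```python
def rank_position(percent, n):
    """1-based position of a whole-number percent point in a sorted list of n items."""
    # Ceiling of percent * n / 100, done in integer arithmetic to avoid float error.
    position = -(-percent * n // 100)
    # Assumption: clamp so 0% lands on the first item and 100% on the last.
    return max(1, min(n, position))


def percentile_value(sorted_values, percent):
    """Value found at the percent point of an already sorted list."""
    return sorted_values[rank_position(percent, len(sorted_values)) - 1]


def percentiles_covering(first, last, n):
    """Whole-number percentiles whose positions fall in positions first..last."""
    return [p for p in range(101) if first <= rank_position(p, n) <= last]


# Positions in the list of 12608 ages discussed in the text.
print(rank_position(10, 12608))                  # 1261
print(rank_position(73, 12608))                  # 9204
print(percentiles_covering(10448, 10586, 12608)) # age 37 -> [83]
print(percentiles_covering(7286, 7742, 12608))   # age 25 -> [58, 59, 60, 61]

# A small made-up list of ages, used only to show the value lookup.
sample_ages = [18, 18, 19, 20, 20, 21, 23, 25, 30, 44]
print(percentile_value(sample_ages, 73))         # 25
```

The reverse lookup returns a list because, as with age 25, one value can occupy several percentiles at once.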