Handout 5: Summarizing Numerical Data STAT 100 Spring 2016

Size: px
Start display at page:

Download "Handout 5: Summarizing Numerical Data STAT 100 Spring 2016"

Transcription

1 In this handout, we will consider methods that are appropriate for summarizing a single set of numerical measurements. Definition Numerical Data: A set of measurements that are recorded on a naturally numeric scale. Example: Typical Household Income The Census Bureau provides a variety of information at the county level for all counties across the U.S. in its State & County QuickFacts data sets ( In this handout, we will consider one numerical variable that was measured in each county: Typical Household Income. These data can be found in the file CensusData.xlsx on the course website. A portion of the data set is shown below. Note that I have used the Filter option to work with only the counties in Minnesota. To do this, highlight the State column and then select Filter from the Data tab. A drop-down arrow should now appear in the State column. Click on this and then select the appropriate box to show only the values from Minnesota. A portion of the results are shown below. It s difficult to gain insight into the typical household incomes across counties in Minnesota by viewing the data in the above table, so let s alternatively consider a few graphical representations which summarize these measurements. Which do you prefer, and why? 1

2 MEASURES OF LOCATION: CENTER Suppose we wanted to summarize the location (on a number line) of the data that were measured on typical household incomes in Minnesota counties. The most common measures of location summarize the center of a data set: the mean and the median. Definitions Mean: The arithmetic average of all values. This is calculated by adding up all of the values and dividing by the total number of measurements. Median: This is the middle value of a data set, after the numerical values have been put in order. If the data set contains an even number of observations, then the median is the average of the middle two observations. Getting these Summaries in Excel: In Excel, you can calculate the mean and median using the following functions: Summary Excel Function Mean =AVERAGE( ) Median =MEDIAN( ) Enter the following in your Excel spreadsheet to calculate these summaries: Write the values that Excel returns in Column N on the spreadsheet shown below. 2

3 Excel Tip You can name a range of cells to be referenced later in a formula. Do this by highlighting the data values in a specific range (in this case, highlight the Typical Household Income values for Minnesota counties) and then giving the data range a name in the box just above the column labels. For example, I chose to name the range of data containing Typical Household Income for Minnesota counties MN_Income in the worksheet shown below. This range can now be referenced in a formula using only its name: 3

4 The mean (or average) is the balance point in the distribution. The median is the middle value. Since there are 87 measurements in this data set, the middle value would be the 44 th of the ordered measurements. Questions: 1. Does the mean necessarily have to be a value in the data set? Explain. 2. Does the median necessarily have to be a value in the data set? Explain. 3. In Minnesota, the typical household income is highest in Scott County ($83,415). Suppose this data value was replaced by a value that was even larger (say $90,000). What effect would this have on the mean Typical Household Income across Minnesota counties? What about the median? 4. Note that in our data set, each county is represented by a single value for Typical Household Income. How do you think the U.S. Census Bureau came up with this one measurement for each county? Discuss. 4

5 MEASURES OF LOCATION: PERCENTILES In addition to the mean and/or median, summaries called percentiles are also used to describe a set of measurements. These percentiles give us insight into the entire spectrum of data values. Definition Percentile: The p th percentile of a set of measurements is defined to be the point in the data set where p% of the measurements fall at or below that value. To see how percentiles are calculated, consider the county level Typical Household Income in Minnesota counties. A graph of these values is shown below. One way to understand percentiles is to find the percentage of observations that fall at or below a particular point in the data set. For example, note that about 3% of the counties in Minnesota have typical income levels below $40,000. So, the 3 rd percentile of this data set is about $40,000. Typical Household Income Percentiles $35,000 0% $40,000 3% $45,000 26% $50,000 64% $55,000 79% $60,000 87% $65,000 89% $70,000 94% $75,000 97% $80,000 98% $85, % 5

6 A cumulative density function (CDF) plot can be used to display all of these percentiles. To create a CDF plot on the graph below, plot the typical household income levels from the preceding table on the x-axis and their respective percentiles on the y-axis. Next, instead of first selecting a Typical Household Income value and then calculating the percentage of data points at or below that point, we could work backwards. In other words, we could define certain percentiles and then determine the Typical Household Income level for that percentile. For example, the bottom 10% of the incomes falls at or below $43,285. So, the 10 th percentile for this data set is $43,285. Income per household Percentiles $35,307 0% $43,285 10% $44,472 20% Q 1 - $44,820 25% $45,475 30% $46,960 40% Median - $47,959 50% $49,420 60% $51,987 70% Q 3 - $52,598 75% $55,590 80% $66,208 90% $83, % 6

7 Percentiles can be calculated in Excel using the =PERCENTILE( ) function. For example, in the worksheet shown below, I entered the following formulas in Column N so that Excel would return those percentiles in Column N. Excel returns the percentiles as shown below: 7

8 Note that the CDF plot based on the table we just obtained in Excel is equivalent to the one sketched earlier. Questions: Use the preceding table of percentiles and/or the corresponding CDF plot to answer the following questions. 1. What is the median Typical Household Income for MN counties? 2. How could you determine the median from the CDF plot? Discuss. 3. What is the minimum Typical Household Income in MN? What is the maximum? How could you identify these from the CDF plot? Discuss. 4. The CDF plot has a longer tail on the upper-end than on the lower-end. What does this imply about Typical Household Income across the 87 counties in MN? Discuss. 5. The CDF plot is fairly steep between $45,000 and $55,000. What does this imply about Typical Household Income across the 87 counties in MN? Discuss. 8

9 IS A MEASURE OF CENTER ENOUGH? Note that by calculating a summary such as a mean or median for a data set, we condense information from all of the measurements down to a single value. For example, consider the Typical Household Income across counties for three different states (Minnesota, Wisconsin, and Virginia): The following picture shows the average for each state. Questions: 1. What differences exist in the Typical Household Income values across these three states? Discuss. 2. Suppose that your friend tries to summarize the differences across these three states using only the mean (i.e., average) from each state. Do you think that this single summary (the mean) tells the whole story well? Why or why not? 9

10 Note that instead of simply comparing only the the means from each state, we could have also considered percentiles. Putting the CDF plots for all three states on the same graph allows us to make very rich comparisons across states. Questions: 1. Consider the poorest people in each state. In which state do the county-level typical household incomes tend to be the lowest? Likewise, consider the richest people in each state. In which state are the county-level typical household incomes the highest? 2. Which state seems to have the most problems with income inequality? Discuss. 3. How do the income levels of MN and WI compare? 4. What is an advantage to using the CDF plot to make comparisons across states? 10

11 MEASURES OF VARIABILITY As noted above, describing an entire data set well involves more than simply summarizing its center with a mean or a median. We should also consider the amount of variability in the data set (i.e., a measure of how different the measurements are from one another). Several quantities exist for summarizing the amount of variability. A few of them are discussed below. Definition Range: The difference between the largest and smallest measurements in a data set. For example, consider the Minnesota counties Typical Household Incomes: In Excel, you can calculate this using the =MAX( ) and =MIN( ) functions: Write the value that Excel returns on the spreadsheet below: 11

12 Definition Mean Absolute Deviation: For each measurement, calculate how far away that measurement is from the mean of the data set. The mean absolute deviation is the average of these absolute distances. MAD Distanceto the me an n To see how this is calculated, first consider the mean for Minnesota: Then, consider the distance between each measurement and the mean: Next, we consider the length of each of these distances (also called the absolute value of the residuals) on the following plot. The average of this data set is the mean absolute deviation. 12

13 In Excel, you can use the =AVEDEV( ) function to calculate the mean absolute deviation: Definition Standard Deviation: Like the mean absolute deviation, the standard deviation also measures the typical distance from the mean. For each value in the data set, we calculate how far away that measurement is from the mean of the data set. The standard deviation is a function of these squared distances. StandardDe viation Distanceto the me an n You can use the =STDEV( ) function to calculate this in Excel: Write the value that Excel returns on the spreadsheet below: 13

14 Use Excel to compute the following summaries for each of the three states listed below. Then, answer the questions that follow the table. Virginia Wisconsin Minnesota Mean Minimum 25 th percentile Median 75 th percentile Maximum Range MAD Standard Deviation 14

15 Questions: 1. In which state does the Typical Household Income of counties tend to be the highest? The lowest? Discuss. 2. Which state appears to have the most problems with income inequality? Which state appears to have the least problems with income inequality? How did you decide this? 3. Virginia s data set consists of 134 counties, while Wisconsin s consists of only 72. Your friend argues that there is less variability (and therefore less income inequality) in Wisconsin s data set simply because it has a smaller number of measurements. Why is this reasoning incorrect? What is the real reason there is less variability in Wisconsin than in Virginia? 15

Descriptive Statistics

Descriptive Statistics Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

More information

You should already have a worksheet with the Basic Plus Plan details in it as well as another plan you have chosen from ehealthinsurance.com.

You should already have a worksheet with the Basic Plus Plan details in it as well as another plan you have chosen from ehealthinsurance.com. In earlier technology assignments, you identified several details of a health plan and created a table of total cost. In this technology assignment, you ll create a worksheet which calculates the total

More information

Statistics vs. statistics

Statistics vs. statistics Statistics vs. statistics Question: What is Statistics (with a capital S)? Definition: Statistics is the science of collecting, organizing, summarizing and interpreting data. Note: There are 2 main ways

More information

Handout 3 More on the National Debt

Handout 3 More on the National Debt Handout 3 More on the National Debt In this handout, we are going to continue learning about the national debt and you ll learn how to use Excel to perform simple summaries of the information. One of my

More information

DECISION SUPPORT Risk handout. Simulating Spreadsheet models

DECISION SUPPORT Risk handout. Simulating Spreadsheet models DECISION SUPPORT MODELS @ Risk handout Simulating Spreadsheet models using @RISK 1. Step 1 1.1. Open Excel and @RISK enabling any macros if prompted 1.2. There are four on-line help options available.

More information

1.2 Describing Distributions with Numbers, Continued

1.2 Describing Distributions with Numbers, Continued 1.2 Describing Distributions with Numbers, Continued Ulrich Hoensch Thursday, September 6, 2012 Interquartile Range and 1.5 IQR Rule for Outliers The interquartile range IQR is the distance between the

More information

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line. Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,

More information

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25 Handout 4 numerical descriptive measures part Calculating Mean for Grouped Data mf Mean for population data: µ mf Mean for sample data: x n where m is the midpoint and f is the frequency of a class. Example

More information

appstats5.notebook September 07, 2016 Chapter 5

appstats5.notebook September 07, 2016 Chapter 5 Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.

More information

Prepared By. Handaru Jati, Ph.D. Universitas Negeri Yogyakarta.

Prepared By. Handaru Jati, Ph.D. Universitas Negeri Yogyakarta. Prepared By Handaru Jati, Ph.D Universitas Negeri Yogyakarta handaru@uny.ac.id Chapter 7 Statistical Analysis with Excel Chapter Overview 7.1 Introduction 7.2 Understanding Data 7.2.1 Descriptive Statistics

More information

Spreadsheet Directions

Spreadsheet Directions The Best Summer Job Offer Ever! Spreadsheet Directions Before beginning, answer questions 1 through 4. Now let s see if you made a wise choice of payment plan. Complete all the steps outlined below in

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

Descriptive Statistics

Descriptive Statistics Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

Example: Histogram for US household incomes from 2015 Table:

Example: Histogram for US household incomes from 2015 Table: 1 Example: Histogram for US household incomes from 2015 Table: Income level Relative frequency $0 - $14,999 11.6% $15,000 - $24,999 10.5% $25,000 - $34,999 10% $35,000 - $49,999 12.7% $50,000 - $74,999

More information

Applications of Data Dispersions

Applications of Data Dispersions 1 Applications of Data Dispersions Key Definitions Standard Deviation: The standard deviation shows how far away each value is from the mean on average. Z-Scores: The distance between the mean and a given

More information

Computing interest and composition of functions:

Computing interest and composition of functions: Computing interest and composition of functions: In this week, we are creating a simple and compound interest calculator in EXCEL. These two calculators will be used to solve interest questions in week

More information

Introduction to Basic Excel Functions and Formulae Note: Basic Functions Note: Function Key(s)/Input Description 1. Sum 2. Product

Introduction to Basic Excel Functions and Formulae Note: Basic Functions Note: Function Key(s)/Input Description 1. Sum 2. Product Introduction to Basic Excel Functions and Formulae Excel has some very useful functions that you can use when working with formulae. This worksheet has been designed using Excel 2010 however the basic

More information

DazStat. Introduction. Installation. DazStat is an Excel add-in for Excel 2003 and Excel 2007.

DazStat. Introduction. Installation. DazStat is an Excel add-in for Excel 2003 and Excel 2007. DazStat Introduction DazStat is an Excel add-in for Excel 2003 and Excel 2007. DazStat is one of a series of Daz add-ins that are planned to provide increasingly sophisticated analytical functions particularly

More information

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. The best way to reduce a set of

More information

Financial Econometrics

Financial Econometrics Financial Econometrics Value at Risk Gerald P. Dwyer Trinity College, Dublin January 2016 Outline 1 Value at Risk Introduction VaR RiskMetrics TM Summary Risk What do we mean by risk? Dictionary: possibility

More information

Session Window. Variable Name Row. Worksheet Window. Double click on MINITAB icon. You will see a split screen: Getting Started with MINITAB

Session Window. Variable Name Row. Worksheet Window. Double click on MINITAB icon. You will see a split screen: Getting Started with MINITAB STARTING MINITAB: Double click on MINITAB icon. You will see a split screen: Session Window Worksheet Window Variable Name Row ACTIVE WINDOW = BLUE INACTIVE WINDOW = GRAY f(x) F(x) Getting Started with

More information

Technology Assignment Calculate the Total Annual Cost

Technology Assignment Calculate the Total Annual Cost In an earlier technology assignment, you identified several details of two different health plans. In this technology assignment, you ll create a worksheet which calculates the total annual cost of medical

More information

WEB APPENDIX 8A 7.1 ( 8.9)

WEB APPENDIX 8A 7.1 ( 8.9) WEB APPENDIX 8A CALCULATING BETA COEFFICIENTS The CAPM is an ex ante model, which means that all of the variables represent before-the-fact expected values. In particular, the beta coefficient used in

More information

. (i) What is the probability that X is at most 8.75? =.875

. (i) What is the probability that X is at most 8.75? =.875 Worksheet 1 Prep-Work (Distributions) 1)Let X be the random variable whose c.d.f. is given below. F X 0 0.3 ( x) 0.5 0.8 1.0 if if if if if x 5 5 x 10 10 x 15 15 x 0 0 x Compute the mean, X. (Hint: First

More information

Statistics, Measures of Central Tendency I

Statistics, Measures of Central Tendency I Statistics, Measures of Central Tendency I We are considering a random variable X with a probability distribution which has some parameters. We want to get an idea what these parameters are. We perfom

More information

Form 162. Form 194. Form 239

Form 162. Form 194. Form 239 Below is a list of topics that we receive calls about each year with the solutions to them detailed. New features and funds have also been added. Note: Some of the topics have more than one question so

More information

PDQ-Notes Reynolds Farley

PDQ-Notes Reynolds Farley PDQ-Notes Reynolds Farley PDQ-Note 7 Quantiles and Medians PDQ-Note 7 Quantiles and Medians The mean of a distribution is an excellent measure of central tendency. If we sum the years of age reported by

More information

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2] 1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders

More information

$0.00 $0.50 $1.00 $1.50 $2.00 $2.50 $3.00 $3.50 $4.00 Price

$0.00 $0.50 $1.00 $1.50 $2.00 $2.50 $3.00 $3.50 $4.00 Price Orange Juice Sales and Prices In this module, you will be looking at sales and price data for orange juice in grocery stores. You have data from 83 stores on three brands (Tropicana, Minute Maid, and the

More information

Chapter 4-Describing Data: Displaying and Exploring Data

Chapter 4-Describing Data: Displaying and Exploring Data Chapter 4-Describing Data: Displaying and Exploring Data Jie Zhang, Ph.D. Student Account and Information Systems Department College of Business Administration The University of Texas at El Paso jzhang6@utep.edu

More information

2 Exploring Univariate Data

2 Exploring Univariate Data 2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

starting on 5/1/1953 up until 2/1/2017.

starting on 5/1/1953 up until 2/1/2017. An Actuary s Guide to Financial Applications: Examples with EViews By William Bourgeois An actuary is a business professional who uses statistics to determine and analyze risks for companies. In this guide,

More information

How Wealthy Are Europeans?

How Wealthy Are Europeans? How Wealthy Are Europeans? Grades: 7, 8, 11, 12 (course specific) Description: Organization of data of to examine measures of spread and measures of central tendency in examination of Gross Domestic Product

More information

Practical Session 8 Time series and index numbers

Practical Session 8 Time series and index numbers 20880 21186 21490 21794 22098 22402 22706 23012 23316 23621 23924 24228 24532 24838 25143 25447 25750 26054 26359 26665 26969 27273 27576 Practical Session 8 Time series and index numbers In this session,

More information

MLC at Boise State Lines and Rates Activity 1 Week #2

MLC at Boise State Lines and Rates Activity 1 Week #2 Lines and Rates Activity 1 Week #2 This activity will use slopes to calculate marginal profit, revenue and cost of functions. What is Marginal? Marginal cost is the cost added by producing one additional

More information

An application program that can quickly handle calculations. A spreadsheet uses numbers like a word processor uses words.

An application program that can quickly handle calculations. A spreadsheet uses numbers like a word processor uses words. An application program that can quickly handle calculations A spreadsheet uses numbers like a word processor uses words. WHAT IF? Columns run vertically & are identified by letters A, B, etc. Rows run

More information

Bidding Decision Example

Bidding Decision Example Bidding Decision Example SUPERTREE EXAMPLE In this chapter, we demonstrate Supertree using the simple bidding problem portrayed by the decision tree in Figure 5.1. The situation: Your company is bidding

More information

Creating a Rolling Income Statement

Creating a Rolling Income Statement Creating a Rolling Income Statement This is a demonstration on how to create an Income Statement that will always return the current month s data as well as the prior 12 months data. The report will be

More information

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer

More information

Common Compensation Terms & Formulas

Common Compensation Terms & Formulas Common Compensation Terms & Formulas Common Compensation Terms & Formulas ERI Economic Research Institute is pleased to provide the following commonly used compensation terms and formulas for your ongoing

More information

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise

More information

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment MBEJ 1023 Planning Analytical Methods Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment Contents What is statistics? Population and Sample Descriptive Statistics Inferential

More information

MLC at Boise State Polynomials Activity 2 Week #3

MLC at Boise State Polynomials Activity 2 Week #3 Polynomials Activity 2 Week #3 This activity will discuss rate of change from a graphical prespective. We will be building a t-chart from a function first by hand and then by using Excel. Getting Started

More information

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows

More information

Data screening, transformations: MRC05

Data screening, transformations: MRC05 Dale Berger Data screening, transformations: MRC05 This is a demonstration of data screening and transformations for a regression analysis. Our interest is in predicting current salary from education level

More information

Chapter 4-Describing Data: Displaying and Exploring Data

Chapter 4-Describing Data: Displaying and Exploring Data Chapter 4-Describing Data: Displaying and Exploring Data Jie Zhang, Ph.D. Student Account and Information Systems Department College of Business Administration The University of Texas at El Paso jzhang6@utep.edu

More information

Monte Carlo Simulation (Random Number Generation)

Monte Carlo Simulation (Random Number Generation) Monte Carlo Simulation (Random Number Generation) Revised: 10/11/2017 Summary... 1 Data Input... 1 Analysis Options... 6 Summary Statistics... 6 Box-and-Whisker Plots... 7 Percentiles... 9 Quantile Plots...

More information

2.2: The Lorenz Curve

2.2: The Lorenz Curve 2.2: The Lorenz Curve Objectives: After the lessons, students will be able to 1. identify the boundaries of a Lorenz Curve; 2. understand the presentation of income inequality in a Lorenz Curve; 3. understand

More information

Scheduled Pension Payments to County Retirees Mendocino County Employees Retirement Association

Scheduled Pension Payments to County Retirees Mendocino County Employees Retirement Association STATISTICAL ANALYSIS Scheduled Pension Payments to County Retirees Mendocino County Employees Retirement Association April 1, 2010 - March 31, 2011 May 25, 2010 The Mendocino County Farm Bureau (MCFB)

More information

Excel Build a Salary Schedule 03/15/2017

Excel Build a Salary Schedule 03/15/2017 EXCEL BUILDING A SALARY SCHEDULE WEDNESDAY, 3/22/2017 3:15 PM TRACY S. LEED ACCOUNTS PAYABLE SUPERVISOR/SYSTEM ADMINISTRATOR CHESTER CO INTERMEDIATE UNIT 24 DOWNINGTOWN, PA PASBO 62 ND ANNUAL CONFERENCE

More information

Math 14, Homework 6.2 p. 337 # 3, 4, 9, 10, 15, 18, 19, 21, 22 Name

Math 14, Homework 6.2 p. 337 # 3, 4, 9, 10, 15, 18, 19, 21, 22 Name Name 3. Population in U.S. Jails The average daily jail population in the United States is 706,242. If the distribution is normal and the standard deviation is 52,145, find the probability that on a randomly

More information

An Excel Modeling Practice Problem

An Excel Modeling Practice Problem An Excel Modeling Practice Problem Excel Review Excel 97 1999-2000 The Padgett s Widgets Problem Market research by Padgett s Widget Company has revealed that the demand for its products varies with the

More information

SUMMARY STATISTICS EXAMPLES AND ACTIVITIES

SUMMARY STATISTICS EXAMPLES AND ACTIVITIES Session 6 SUMMARY STATISTICS EXAMPLES AD ACTIVITIES Example 1.1 Expand the following: 1. X 2. 2 6 5 X 3. X 2 4 3 4 4. X 4 2 Solution 1. 2 3 2 X X X... X 2. 6 4 X X X X 4 5 6 5 3. X 2 X 3 2 X 4 2 X 5 2

More information

VARIABILITY: Range Variance Standard Deviation

VARIABILITY: Range Variance Standard Deviation VARIABILITY: Range Variance Standard Deviation Measures of Variability Describe the extent to which scores in a distribution differ from each other. Distance Between the Locations of Scores in Three Distributions

More information

STAB22 section 1.3 and Chapter 1 exercises

STAB22 section 1.3 and Chapter 1 exercises STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

Further Mathematics 2016 Core: RECURSION AND FINANCIAL MODELLING Chapter 6 Interest and depreciation

Further Mathematics 2016 Core: RECURSION AND FINANCIAL MODELLING Chapter 6 Interest and depreciation Further Mathematics 2016 Core: RECURSION AND FINANCIAL MODELLING Chapter 6 Interest and depreciation Key knowledge the use of first- order linear recurrence relations to model flat rate and unit cost and

More information

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION XLSTAT makes accessible to anyone a powerful, complete and user-friendly data analysis and statistical solution. Accessibility to

More information

Form 155. Form 162. Form 194. Form 239

Form 155. Form 162. Form 194. Form 239 Below is a list of topics that we receive calls about each year with the solutions to them detailed. New features and funds have also been added. Note: Some of the topics have more than one question so

More information

Jacob: The illustrative worksheet shows the values of the simulation parameters in the upper left section (Cells D5:F10). Is this for documentation?

Jacob: The illustrative worksheet shows the values of the simulation parameters in the upper left section (Cells D5:F10). Is this for documentation? PROJECT TEMPLATE: DISCRETE CHANGE IN THE INFLATION RATE (The attached PDF file has better formatting.) {This posting explains how to simulate a discrete change in a parameter and how to use dummy variables

More information

Risk Analysis. å To change Benchmark tickers:

Risk Analysis. å To change Benchmark tickers: Property Sheet will appear. The Return/Statistics page will be displayed. 2. Use the five boxes in the Benchmark section of this page to enter or change the tickers that will appear on the Performance

More information

Excel Tutorial 9: Working with Financial Tools and Functions TRUE/FALSE 1. The fv argument is required in the PMT function.

Excel Tutorial 9: Working with Financial Tools and Functions TRUE/FALSE 1. The fv argument is required in the PMT function. Excel Tutorial 9: Working with Financial Tools and Functions TRUE/FALSE 1. The fv argument is required in the PMT function. ANS: F PTS: 1 REF: EX 493 2. Cash flow has nothing to do with who owns the money.

More information

Statistics 511 Supplemental Materials

Statistics 511 Supplemental Materials Gaussian (or Normal) Random Variable In this section we introduce the Gaussian Random Variable, which is more commonly referred to as the Normal Random Variable. This is a random variable that has a bellshaped

More information

Chapter 3 Discrete Random Variables and Probability Distributions

Chapter 3 Discrete Random Variables and Probability Distributions Chapter 3 Discrete Random Variables and Probability Distributions Part 2: Mean and Variance of a Discrete Random Variable Section 3.4 1 / 16 Discrete Random Variable - Expected Value In a random experiment,

More information

For 466W Forest Resource Management Lab 5: Marginal Analysis of the Rotation Decision in Even-aged Stands February 11, 2004

For 466W Forest Resource Management Lab 5: Marginal Analysis of the Rotation Decision in Even-aged Stands February 11, 2004 For 466W Forest Resource Management Lab 5: Marginal Analysis of the Rotation Decision in Even-aged Stands February 11, 2004 You used the following equation in your first lab to calculate various measures

More information

Math 1526 Summer 2000 Session 1

Math 1526 Summer 2000 Session 1 Math 1526 Summer 2 Session 1 Lab #2 Part #1 Rate of Change This lab will investigate the relationship between the average rate of change, the slope of a secant line, the instantaneous rate change and the

More information

Describing Data: One Quantitative Variable

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive

More information

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

Chapter 4 Continuous Random Variables and Probability Distributions

Chapter 4 Continuous Random Variables and Probability Distributions Chapter 4 Continuous Random Variables and Probability Distributions Part 2: More on Continuous Random Variables Section 4.5 Continuous Uniform Distribution Section 4.6 Normal Distribution 1 / 27 Continuous

More information

Putting Things Together Part 2

Putting Things Together Part 2 Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in

More information

2 DESCRIPTIVE STATISTICS

2 DESCRIPTIVE STATISTICS Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled

More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

Lecture Week 4 Inspecting Data: Distributions

Lecture Week 4 Inspecting Data: Distributions Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013 2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your

More information

Describing Data: Displaying and Exploring Data

Describing Data: Displaying and Exploring Data Describing Data: Displaying and Exploring Data Chapter 4 McGraw-Hill/Irwin Copyright 2011 by the McGraw-Hill Companies, Inc. All rights reserved. LEARNING OBJECTIVES LO1. Develop and interpret a dot plot.

More information

GOALS. Describing Data: Displaying and Exploring Data. Dot Plots - Examples. Dot Plots. Dot Plot Minitab Example. Stem-and-Leaf.

GOALS. Describing Data: Displaying and Exploring Data. Dot Plots - Examples. Dot Plots. Dot Plot Minitab Example. Stem-and-Leaf. Describing Data: Displaying and Exploring Data Chapter 4 GOALS 1. Develop and interpret a dot plot.. Develop and interpret a stem-and-leaf display. 3. Compute and understand quartiles, deciles, and percentiles.

More information

Description of Data I

Description of Data I Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret

More information

Math of Finance Exponential & Power Functions

Math of Finance Exponential & Power Functions The Right Stuff: Appropriate Mathematics for All Students Promoting the use of materials that engage students in meaningful activities that promote the effective use of technology to support mathematics,

More information

SUPPLEMENTARY LESSON 1 DISCOVER HOW THE WORLD REALLY WORKS ASX Schools Sharemarket Game THE ASX CHARTS

SUPPLEMENTARY LESSON 1 DISCOVER HOW THE WORLD REALLY WORKS ASX Schools Sharemarket Game THE ASX CHARTS SUPPLEMENTARY LESSON 1 THE ASX CHARTS DISCOVER HOW THE WORLD REALLY WORKS 2015 ASX Schools Sharemarket Game The ASX charts When you spend time discovering a company s story and looking at company numbers

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

Lab#3 Probability

Lab#3 Probability 36-220 Lab#3 Probability Week of September 19, 2005 Please write your name below, tear off this front page and give it to a teaching assistant as you leave the lab. It will be a record of your participation

More information

Chapter 2: Random Variables (Cont d)

Chapter 2: Random Variables (Cont d) Chapter : Random Variables (Cont d) Section.4: The Variance of a Random Variable Problem (1): Suppose that the random variable X takes the values, 1, 4, and 6 with probability values 1/, 1/6, 1/, and 1/6,

More information

The Standard Deviation as a Ruler and the Normal Model. Copyright 2009 Pearson Education, Inc.

The Standard Deviation as a Ruler and the Normal Model. Copyright 2009 Pearson Education, Inc. The Standard Deviation as a Ruler and the Normal Mol Copyright 2009 Pearson Education, Inc. The trick in comparing very different-looking values is to use standard viations as our rulers. The standard

More information

Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a

Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a Announcements: There are some office hour changes for Nov 5, 8, 9 on website Week 5 quiz begins after class today and ends at

More information

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s.

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. STAT 515 -- Chapter 5: Continuous Distributions Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. Continuous distributions typically are represented by

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES

CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES DISCRETE RANDOM VARIABLE: Variable can take on only certain specified values. There are gaps between possible data values. Values may be counting numbers or

More information

Much of what appears here comes from ideas presented in the book:

Much of what appears here comes from ideas presented in the book: Chapter 11 Robust statistical methods Much of what appears here comes from ideas presented in the book: Huber, Peter J. (1981), Robust statistics, John Wiley & Sons (New York; Chichester). There are many

More information

SFSU FIN822 Project 1

SFSU FIN822 Project 1 SFSU FIN822 Project 1 This project can be done in a team of up to 3 people. Your project report must be accompanied by printouts of programming outputs. You could use any software to solve the problems.

More information

Numerical Descriptions of Data

Numerical Descriptions of Data Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =

More information

GuruFocus User Manual: My Portfolios

GuruFocus User Manual: My Portfolios GuruFocus User Manual: My Portfolios 2018 version 1 Contents 1. Introduction to User Portfolios a. The User Portfolio b. Accessing My Portfolios 2. The My Portfolios Header a. Creating Portfolios b. Importing

More information

Monetary Economics Measuring Asset Returns. Gerald P. Dwyer Fall 2015

Monetary Economics Measuring Asset Returns. Gerald P. Dwyer Fall 2015 Monetary Economics Measuring Asset Returns Gerald P. Dwyer Fall 2015 WSJ Readings Readings this lecture, Cuthbertson Ch. 9 Readings next lecture, Cuthbertson, Chs. 10 13 Measuring Asset Returns Outline

More information

Optimization Methods in Management Science

Optimization Methods in Management Science Problem Set Rules: Optimization Methods in Management Science MIT 15.053, Spring 2013 Problem Set 6, Due: Thursday April 11th, 2013 1. Each student should hand in an individual problem set. 2. Discussing

More information