The Not-So-Geeky World of Statistics

1 FEBRUARY 3-5, 2015 / THE HILTON NEW YORK. The Not-So-Geeky World of Statistics. Chris Emerson, Chris Sweet (a/k/a Chris²)

2 Who We Are. Chris Sweet: JPMorgan Chase, VP, Outside Counsel & Engagement Management. Chris Emerson: Bryan Cave, Director, Practice Economics.

3 3 Backdrop Problem Statement GOAL: How long will it take to resolve these cases, and how much will it cost? Litigation Cases in Some State 206 Open Cases from Closed Cases pre Projected New Cases in July 2014 Dec attorneys available from various law firms

4 What to Expect. Turning quantitative data into information: shape of the data, center of the data, spread of the data. 3-Point Estimating. Linear Programming in Excel. Case study on understanding effort and expense in portfolios. Going beyond the basics.

5 Turning Quantitative Data Into Information. Shape of the data (Distribution). Center of the data (Average). Spread of the data (Variability).

6 Example: Matter Hours Data. For similar types of matters.

7 Hours Data: Histogram. A histogram is a graphical representation of the frequency with which ranges of measures occur. [Figure: histogram of matter hours; x-axis: Matter Hours; y-axis: Frequency]

8 Histogram: How To
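
The "how to" on this slide was presumably an Excel walkthrough that did not survive transcription. As a rough stand-in, the binning behind a histogram can be sketched in Python. The matter hours below are hypothetical, and the bin width of 40 hours is an arbitrary choice for illustration.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count how many values fall in each bin [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        # Snap each value down to the lower edge of its bin.
        lo = start + ((v - start) // bin_width) * bin_width
        counts[lo] += 1
    return dict(sorted(counts.items()))

# Hypothetical matter hours, invented for illustration only.
matter_hours = [100, 110, 120, 135, 150, 165, 180, 180, 190, 210, 260]

bins = histogram_counts(matter_hours, bin_width=40, start=100)
for lo, n in bins.items():
    # One '#' per matter gives a quick text histogram.
    print(f"{lo}-{lo + 39}: {'#' * n}")
```

The lone bar at 260 is the kind of tail value the later slides call an outlier.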

9 Shape of the Data. Symmetric: bell shaped, other symmetric shapes. Asymmetric: skewed to the right, skewed to the left. Bimodal.

10 Symmetric Distributions: Bell-Shaped

11 Symmetric Distributions: Mound-Shaped

12 Symmetric Distributions: Uniform

13 Asymmetric Distributions: Skewed to the Left

14 Asymmetric Distributions: Skewed to the Right

15 Gamma Distribution: Skewed to the Right

16 Bimodal

17 Shape Takeaways. Helps you understand where your matters fall. Could show how likely / unlikely certain circumstances are: bet-the-farm litigation, protracted rounds of negotiation. Potentially useful for evaluating confidence levels. Potentially useful for probability estimation.

18 Outliers (The Tail). Extreme values, far from the rest of the data. May occur naturally. May occur due to error in recording. May occur due to error in measuring. Observational unit may be fundamentally different.

19 Example: Number of Timekeepers on Your Matters. [Figure: distribution of the number of timekeepers per matter; axis: Number of Timekeepers]

20 Center of the Data (Average). Mean, Median, Mode.

21 Average or Mean ($\bar{x}$). Traditional measure of center. Sum the values and divide by the number of values. Example: Mean of the matter hours is [value not preserved]. Common Applications: average hours on a matter; average duration of a matter. Highly influenced by extreme values (esp. outliers).

22 Median (M). A measure of the data's center, resistant to excess variation and outliers. At least half of the ordered values are less than or equal to the median value. At least half of the ordered values are greater than or equal to the median value. Example: The median of the matter hours is 165. Common Applications: more useful than the mean when looking at hours, duration, etc. in skewed data (remember from shape of data). In a normal distribution, the mean and median are roughly identical.

23 Mode. Measured as the most frequently occurring number in a data set. In a histogram it shows as the highest column (highest frequency). It can reflect the most common outcome. Example: The mode of the matter hours is 180. Common Applications: the mode is not commonly used other than to identify the most common outcome.
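
The three measures of center can be compared with Python's standard `statistics` module. The data set below is hypothetical; it was chosen so that the median (165) and mode (180) match the values quoted on the slides, and the appended 2000-hour matter is an invented outlier.

```python
import statistics

# Hypothetical matter hours, chosen so the median and mode match the slides.
matter_hours = [100, 110, 120, 135, 150, 165, 180, 180, 190, 210, 260]

mean = statistics.mean(matter_hours)      # sum of values / number of values
median = statistics.median(matter_hours)  # middle of the ordered values
mode = statistics.mode(matter_hours)      # most frequently occurring value
print(f"mean={mean:.1f} median={median} mode={mode}")

# One invented extreme matter shows how much more the mean moves than the median.
with_outlier = matter_hours + [2000]
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)
print(f"with outlier: mean={mean_out:.1f} median={median_out}")
```

In this sketch the outlier roughly doubles the mean while the median shifts by only a few hours, which is the behavior the Median slide describes.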

24 Spread of the Data (Variability). If all values are the same, then they all equal the mean; there is no spread (variability). Variability exists when some values are different from (above or below) the mean. We will consider the following measures of spread: variance, standard deviation, and inter-quartile range.

25 Variance and Standard Deviation. When variability exists, each data value has an associated deviation from the mean: $x_i - \bar{x}$. What is a typical deviation from the mean? (standard deviation) Small values of this typical deviation indicate small spread in the data. Large values of this typical deviation indicate large spread in the data.

26 Standard Deviation: typical deviation from the mean
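
To make "typical deviation from the mean" concrete, here is a minimal sketch that computes the deviations, the variance, and the standard deviation step by step. It assumes the sample (n - 1) form of the variance, which the slides do not specify, and reuses hypothetical matter hours.

```python
import math

# Hypothetical matter hours, invented for illustration only.
matter_hours = [100, 110, 120, 135, 150, 165, 180, 180, 190, 210, 260]

n = len(matter_hours)
mean = sum(matter_hours) / n

# Deviation of each value from the mean: x_i - x_bar.
deviations = [x - mean for x in matter_hours]

# Sample variance: average squared deviation, dividing by n - 1 (an assumption).
variance = sum(d ** 2 for d in deviations) / (n - 1)

# Standard deviation: square root of the variance, back in units of hours.
std_dev = math.sqrt(variance)
print(f"mean={mean:.1f} variance={variance:.1f} std_dev={std_dev:.1f}")
```

The deviations always sum to zero, which is why they are squared before averaging.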

27 Quartiles. Three numbers that divide the ordered data into four equal-sized groups. Q1 has 25% of the data below it. Q2 has 50% of the data below it (the median). Q3 has 75% of the data below it.

28 Five-Number Summary. minimum = 100, Q1 = 127.5, Q2 or M = 165, Q3 = 185, maximum = 260. Inter-quartile Range (IQR) = Q3 - Q1 = 57.5. The IQR gives the spread of the middle 50% of the data (the middle 50% of the data ranges from Q1 to Q3).
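
A five-number summary can be computed with `statistics.quantiles`. The data below are hypothetical, constructed so that the result reproduces the slide's summary (minimum 100, median 165, Q3 185, maximum 260, IQR 57.5). The `"inclusive"` interpolation method is an assumption; quartile conventions differ between tools, so other software may return slightly different Q1 and Q3 values for the same data.

```python
import statistics

# Hypothetical matter hours, built to reproduce the slide's summary values.
matter_hours = [100, 110, 120, 135, 150, 165, 180, 180, 190, 210, 260]

# Three cut points that split the ordered data into four groups.
q1, q2, q3 = statistics.quantiles(matter_hours, n=4, method="inclusive")

five_number = (min(matter_hours), q1, q2, q3, max(matter_hours))
iqr = q3 - q1  # spread of the middle 50% of the data

print("min, Q1, M, Q3, max =", five_number)
print("IQR =", iqr)
```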

29 Visual of 5-Number Summary. [Figure: Boxplot of Matter Hours Data, marking min, Q1, M, Q3, and max; axis: Matter Hours]

30 Choosing a Summary. Outliers affect the values of the mean and standard deviation. The five-number summary should be used to describe center and spread for skewed distributions, or when outliers are present. Use the mean and standard deviation for reasonably symmetric distributions that are free of outliers.

31 With the Mean and Standard Deviation of the Normal Distribution We Can Determine: what proportion of matters fall into any range of values; at what percentile a given matter falls, if you know its value; what value corresponds to a given percentile. This works because the area under a distribution equals 100%.
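
The three questions on this slide map onto the `NormalDist` class in Python's standard library. The mean of 165 hours and standard deviation of 40 hours are assumed values for illustration, not figures from the deck, and the sketch only applies if matter hours are reasonably close to normal.

```python
from statistics import NormalDist

# Assumed parameters for illustration: mean 165 hours, standard deviation 40 hours.
hours = NormalDist(mu=165, sigma=40)

# 1. What proportion of matters fall between 125 and 205 hours?
share_in_range = hours.cdf(205) - hours.cdf(125)

# 2. At what percentile does a 200-hour matter fall?
percentile_of_200 = hours.cdf(200)

# 3. What value corresponds to the 90th percentile?
value_at_90th = hours.inv_cdf(0.90)

print(f"share between 125 and 205 hours: {share_in_range:.1%}")
print(f"percentile of a 200-hour matter: {percentile_of_200:.1%}")
print(f"90th percentile: {value_at_90th:.1f} hours")
```

The range in the first question is one standard deviation either side of the mean, so the share comes out near the familiar 68%.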

32 PERT 3-Point Estimating (Program Evaluation and Review Technique). Estimating technique that uses a weighted average of 3 numbers to come up with an estimate. Pessimistic: 16 Hours. Most likely: 8 Hours. Optimistic: 4 Hours. Expected Time = (Pessimistic + 4 × Most Likely + Optimistic) / 6 = (16 + 4 × 8 + 4) / 6 = 8.67 Hours.

33 PERT Standard Deviation. Measures the volatility of the estimate. Std. Deviation = (Pessimistic - Optimistic) / 6 = (16 - 4) / 6 = 2 Hours. 68% Confidence: 8.67 +/- 2 = 6.67 to 10.67 Hours. 95% Confidence: 8.67 +/- 4 = 4.67 to 12.67 Hours. 99.7% Confidence: 8.67 +/- 6 = 2.67 to 14.67 Hours. The larger the std. deviation, the less confidence you have in your estimate.
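
The PERT arithmetic on these two slides can be checked with a few lines of Python. The function name `pert_estimate` is a hypothetical helper, and the confidence bands use the usual 1, 2, and 3 standard deviation multiples.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Return the PERT expected value and standard deviation."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Values from the slide: optimistic 4, most likely 8, pessimistic 16 hours.
expected, std_dev = pert_estimate(4, 8, 16)
print(f"expected = {expected:.2f} hours, std dev = {std_dev:.2f} hours")

# Confidence bands at 1, 2, and 3 standard deviations around the estimate.
bands = {}
for label, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    bands[label] = (expected - k * std_dev, expected + k * std_dev)
    low, high = bands[label]
    print(f"{label} confidence: {low:.2f} to {high:.2f} hours")
```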

34 Linear Programming. Linear programming (also called linear optimization) is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is a special case of mathematical programming.

36 The TBA Airlines Problem. TBA Airlines is a small regional company that specializes in short flights in small airplanes. The company has been doing well and has decided to expand its operations. The basic issue facing management is whether to purchase more small airplanes to add some new short flights, or start moving into the national market by purchasing some large airplanes, or both. Question: How many airplanes of each type should be purchased to maximize their total net annual profit?

37 Data for the TBA Airlines Problem
Net annual profit per airplane: small airplane $1 million; large airplane $5 million
Purchase cost per airplane: small airplane $5 million; large airplane $50 million; capital available $100 million
Maximum purchase quantity: small airplane 2

38 Linear Programming Formulation
Let S = Number of small airplanes to purchase, L = Number of large airplanes to purchase
Maximize Profit = S + 5L ($millions)
subject to
Capital Available: 5S + 50L ≤ 100 ($millions)
Max Small Planes: S ≤ 2
and S ≥ 0, L ≥ 0; S, L are integers.
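
Because this model has only two integer variables, it is small enough to solve by checking every feasible purchase plan. The sketch below does exactly that. It is plain enumeration, not the algorithm Excel's Solver uses, and the function name `solve_tba` is hypothetical.

```python
def solve_tba(capital=100, max_small=2):
    """Enumerate integer purchase plans and return (profit, small, large)."""
    best = None
    for small in range(max_small + 1):           # enforces 0 <= S <= 2
        for large in range(capital // 50 + 1):   # L cannot exceed capital / 50
            # Capital constraint: 5S + 50L <= 100 ($millions).
            if 5 * small + 50 * large > capital:
                continue
            profit = small + 5 * large           # objective: S + 5L ($millions)
            if best is None or profit > best[0]:
                best = (profit, small, large)
    return best

profit, small, large = solve_tba()
print(f"buy {small} small and {large} large airplanes for ${profit} million profit")
```

Enumeration stops being practical as the number of variables grows, which is where a solver earns its keep.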

39 Spreadsheet Model
Unit Profit ($millions): Small Airplane 1, Large Airplane 5
Capital ($millions): Capital Per Unit Produced, Capital Spent <= Capital Available (100)
Units Produced (Small Airplane, Large Airplane): Small Airplane <= Maximum Small Airplanes (2)
Total Profit ($millions)
Capital Spent: =SUMPRODUCT(CapitalPerUnitProduced,UnitsProduced)
Total Profit: =SUMPRODUCT(UnitProfit,UnitsProduced)

40 Excel's Solver

41 Spreadsheet Model (same layout and formulas as slide 39)

42 Case Study: Predicting Portfolio Litigation Reserve Balances. Case study using Statistics, PERT, & Linear Programming.

43 43 Backdrop Problem Statement GOAL: How long will it take to resolve these cases, and how much will it cost? Litigation Cases in Some State 206 Open Cases from Closed Cases pre Projected New Cases in July 2014 Dec attorneys available from various law firms

44 Parameters. Inbound cases. Case life. Hours needed per month. Hours available per month. Attorney rate. Attorney efficiency level.

45 Outside Counsel Parameters. 9 choices with different efficiency levels. [Table: Efficiency Levels, Rate] Rate for lawyers: 3% anticipated annual rate increase.

46 Solution Methods. Gamma probability-based predictions. PERT 3-point estimating. Linear programming model.

47 Inbound Cases June. [Figure: Case Arrival by Month; x-axis: Months; y-axis: Frequency] Arrival percentages by month: Jan 6%, Feb 7%, Mar 11%, Apr 8%, May 11%, Jun 9%, Jul 7%, Aug 8%, Sep 8%, Oct 9%, Nov 7%, Dec 9%.

48 Inbound Case Projection, July

49 Histogram & Gamma Fit. This distribution tells us the probability of a given matter closing within a certain amount of time. [Figure: Frequency and Gamma PDF (left axis, 0% to 9%), Cumulative % and Gamma CDF (right axis, 0% to 100%); x-axis: Months of Life]

50 Cases in Quarters 2011: Histogram & Gamma Fit. [Figure: Frequency and Gamma PDF (left axis, 0% to 25%), Cumulative % and Gamma CDF (right axis, 0% to 100%); x-axis: Months of Life]

51 Probability-Based Prediction Model: Gamma Distribution. Bounded on the left at zero. Right skewed. Excel does most of this for you! Probability Density Function: Excel GAMMA.DIST function. Cumulative Distribution Function: Excel GAMMA.DIST function. Formula for Gamma Mean: $\mu = \alpha\beta$. Formula for Gamma Variance: $\mathrm{Var}(x) = \alpha\beta^2$.
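
The mean and variance formulas can be inverted to get a quick gamma fit: dividing the variance by the mean gives β, and dividing the mean by β gives α. The sketch below uses that method-of-moments shortcut, which is an assumption and not necessarily how the presenters fitted their curve. The sample mean of 14 months and variance of 49 are invented, and the CDF is approximated numerically in place of Excel's `GAMMA.DIST(x, alpha, beta, TRUE)`.

```python
import math

def gamma_params_from_moments(mean, variance):
    """Solve mean = alpha * beta and variance = alpha * beta**2 for (alpha, beta)."""
    beta = variance / mean
    alpha = mean / beta
    return alpha, beta

def gamma_pdf(x, alpha, beta):
    """Gamma probability density with shape alpha and scale beta."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

def gamma_cdf(x, alpha, beta, steps=10_000):
    """Approximate P(X <= x) by integrating the density with the midpoint rule."""
    if x <= 0:
        return 0.0
    width = x / steps
    return sum(gamma_pdf((i + 0.5) * width, alpha, beta) for i in range(steps)) * width

# Invented case-life statistics: mean 14 months, variance 49.
alpha, beta = gamma_params_from_moments(mean=14, variance=49)
closed_by_14 = gamma_cdf(14, alpha, beta)
print(f"alpha={alpha}, beta={beta}")
print(f"share of cases closed within 14 months: {closed_by_14:.1%}")
```

Read this way, the CDF answers the case-study question directly: the share of matters expected to have closed by a given month of life.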

52 How the Predictions Work
Table columns: Actual (Month, Case Count, Cumulative %, %), Gamma Distribution (PDF, CDF, Cases Closing)
Actual %, Gamma PDF, Gamma CDF
0.96%, 0.254%, 0.08%
1.28%, 1.037%, 0.69%
1.60%, 2.123%, 2.25%
1.28%, 3.277%, 4.96%
3.19%, 4.332%, 8.77%
4.15%, 5.193%, 13.56%
6.39%, 5.817%, 19.08%
6.07%, 6.202%, 25.11%
7.67%, 6.367%, 31.41%
7.03%, 6.345%, 37.78%
5.75%, 6.172%, 44.05%
5.43%, 5.885%, 50.09%
6.07%, 5.518%, 55.79%
313 total cases in 2011

53 Estimating Effort (PERT). Based upon sample closed case population of 858 cases.

54 Boxplot of Monthly Hours. [Figure: boxplots of hours for the First Month, Second Month, Third Month, and Fourth Month; y-axis: Hours]

55 Estimating Effort (PERT). Based upon sample closed case population of 858 cases. Expected Value = (Optimistic + 4 × Most Likely + Pessimistic) / 6

56 Calculating Production Hours Needed

57 Monthly Hours Needed

58 Objective Function
Minimize $Z = \sum_{a=1}^{9} \sum_{y=1}^{7} \sum_{m=1}^{78} X_{am} C_{ay} H_m$
Decision variable: $X_{am}$ = binary decision variable; whether attorney $a$ was chosen to work in month $m$, for $a = 1, 2, \ldots, 9$
Parameters:
$m$ = a month between July 2014 and December 2020, $m = 1, 2, \ldots, 78$
$y$ = a year between 2014 and 2020, $y = 1, 2, \ldots, 7$
$C_{ay}$ = rate per attorney in a particular year
$H_m$ = hours available for an attorney to work during a particular month

59 Constraints
Total hours performed each month must meet the monthly production hour requirement: $\sum_{a=1}^{9} X_{am} E_a H_m \ge P_m$
A terminated attorney will not be rehired: $X_{a,m+1} - X_{am} \le 0$
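
A toy version of this staffing model can be solved by enumeration to show how the objective and the two constraints interact. Everything numeric below is invented: two attorneys, three months, made-up rates, efficiencies, and requirements. The sketch reads E as the attorney efficiency level and P as the monthly production hours needed, which is inferred from the parameter slides, and it holds each attorney's rate constant instead of applying a yearly rate. The function name `plan_staffing` is hypothetical, and this is not the OpenSolver model from the deck.

```python
from itertools import product

def plan_staffing(rate, efficiency, hours, required):
    """Return (cost, plan) for the cheapest feasible plan, or (None, None)."""
    n_attorneys, n_months = len(rate), len(hours)
    best_cost, best_plan = None, None
    # Each attorney works months 0..k-1 and is then released for good, so the
    # "no rehire" constraint is built into the search space.
    for last in product(range(n_months + 1), repeat=n_attorneys):
        x = [[1 if m < last[a] else 0 for m in range(n_months)]
             for a in range(n_attorneys)]
        # Effective hours delivered each month must meet that month's requirement.
        feasible = all(
            sum(x[a][m] * efficiency[a] * hours[m] for a in range(n_attorneys))
            >= required[m]
            for m in range(n_months)
        )
        if not feasible:
            continue
        # Objective: total billed cost = sum of X * rate * hours available.
        cost = sum(x[a][m] * rate[a] * hours[m]
                   for a in range(n_attorneys) for m in range(n_months))
        if best_cost is None or cost < best_cost:
            best_cost, best_plan = cost, x
    return best_cost, best_plan

# Invented inputs: hourly rates, efficiency levels, hours available, hours needed.
cost, plan = plan_staffing(
    rate=[300, 200],
    efficiency=[1.0, 0.6],
    hours=[160, 160, 160],
    required=[200, 150, 90],
)
print("minimum cost:", cost)
print("plan (rows = attorneys, columns = months):", plan)
```

In this toy case the cheaper but less efficient attorney is released after the first month, because the remaining workload fits within the other attorney's hours.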

60 Linear Programming Model (opensolver.org)

61 So what's the answer? We should be down to one half-time lawyer by 12/2017. We will have spent $5.8M to: finish the 206 open cases; finish 284 of the new cases; have 22 cases remaining (the tail). The remainder should complete within 6 to 12 months. The Heisenberg Uncertainty Principle Applies Here.

62 Process Improvement. Enables visibility into process improvement: by improving upon the number of hours worked by month against the historical baseline; by resolving / concluding matters faster than the baseline.

63 Questions. For a copy of this PPT or the Excel template, email cemerson@bryancave.com

65 Going Beyond the Basics: Correlation, Regression, Root Cause Analysis

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s).

The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). We will look the three common and useful measures of spread. The Range, the Inter Quartile Range (or IQR), and the Standard Deviation (which we usually denote by a lower case s). 1 Ameasure of the center

More information

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and

More information

Misleading Graphs. Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret

Misleading Graphs. Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret Misleading Graphs Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret 1 Pretty Bleak Picture Reported AIDS cases 2 But Wait..! 3 Turk Incorporated

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean

Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean Measure of Center Measures of Center The value at the center or middle of a data set 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) 1 2 Mean Notation The measure of center obtained by adding the values

More information

Numerical Descriptions of Data

Numerical Descriptions of Data Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

appstats5.notebook September 07, 2016 Chapter 5

appstats5.notebook September 07, 2016 Chapter 5 Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.

More information

Section3-2: Measures of Center

Section3-2: Measures of Center Chapter 3 Section3-: Measures of Center Notation Suppose we are making a series of observations, n of them, to be exact. Then we write x 1, x, x 3,K, x n as the values we observe. Thus n is the total number

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

Math 2200 Fall 2014, Exam 1 You may use any calculator. You may not use any cheat sheet.

Math 2200 Fall 2014, Exam 1 You may use any calculator. You may not use any cheat sheet. 1 Math 2200 Fall 2014, Exam 1 You may use any calculator. You may not use any cheat sheet. Warning to the Reader! If you are a student for whom this document is a historical artifact, be aware that the

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

STA 248 H1S Winter 2008 Assignment 1 Solutions

STA 248 H1S Winter 2008 Assignment 1 Solutions 1. (a) Measures of location: STA 248 H1S Winter 2008 Assignment 1 Solutions i. The mean, 100 1=1 x i/100, can be made arbitrarily large if one of the x i are made arbitrarily large since the sample size

More information

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Mean, Median and Mode Copyright 2016, 2012, 2008 Pearson Education, Inc.

More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I ll announce this tomorrow

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

2 Exploring Univariate Data

2 Exploring Univariate Data 2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting

More information

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows

More information

ก ก ก ก ก ก ก. ก (Food Safety Risk Assessment Workshop) 1 : Fundamental ( ก ( NAC 2010)) 2 3 : Excel and Statistics Simulation Software\

ก ก ก ก ก ก ก. ก (Food Safety Risk Assessment Workshop) 1 : Fundamental ( ก ( NAC 2010)) 2 3 : Excel and Statistics Simulation Software\ ก ก ก ก (Food Safety Risk Assessment Workshop) ก ก ก ก ก ก ก ก 5 1 : Fundamental ( ก 29-30.. 53 ( NAC 2010)) 2 3 : Excel and Statistics Simulation Software\ 1 4 2553 4 5 : Quantitative Risk Modeling Microbial

More information

Use of the Risk Driver Method in Monte Carlo Simulation of a Project Schedule

Use of the Risk Driver Method in Monte Carlo Simulation of a Project Schedule Use of the Risk Driver Method in Monte Carlo Simulation of a Project Schedule Presented to the 2013 ICEAA Professional Development & Training Workshop June 18-21, 2013 David T. Hulett, Ph.D. Hulett & Associates,

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering

More information

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations

More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

Order Making Fiscal Year 2018 Annual Adjustments to Transaction Fee Rates

Order Making Fiscal Year 2018 Annual Adjustments to Transaction Fee Rates This document is scheduled to be published in the Federal Register on 04/20/2018 and available online at https://federalregister.gov/d/2018-08339, and on FDsys.gov 8011-01p SECURITIES AND EXCHANGE COMMISSION

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

Center and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers.

Center and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers. Chapter 3 Section3-: Measures of Center Section 3-3: Measurers of Variation Section 3-4: Measures of Relative Standing Section 3-5: Exploratory Data Analysis Describing Distributions with Numbers The overall

More information

Describing Data: One Quantitative Variable

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive

More information

SOLUTIONS TO THE LAB 1 ASSIGNMENT

SOLUTIONS TO THE LAB 1 ASSIGNMENT SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73

More information

Security Analysis: Performance

Security Analysis: Performance Security Analysis: Performance Independent Variable: 1 Yr. Mean ROR: 8.72% STD: 16.76% Time Horizon: 2/1993-6/2003 Holding Period: 12 months Risk-free ROR: 1.53% Ticker Name Beta Alpha Correlation Sharpe

More information

Copyright 2005 Pearson Education, Inc. Slide 6-1

Copyright 2005 Pearson Education, Inc. Slide 6-1 Copyright 2005 Pearson Education, Inc. Slide 6-1 Chapter 6 Copyright 2005 Pearson Education, Inc. Measures of Center in a Distribution 6-A The mean is what we most commonly call the average value. It is

More information

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics You can t see this text! Introduction to Computational Finance and Financial Econometrics Descriptive Statistics Eric Zivot Summer 2015 Eric Zivot (Copyright 2015) Descriptive Statistics 1 / 28 Outline

More information

Lecture 3: Probability Distributions (cont d)

Lecture 3: Probability Distributions (cont d) EAS31116/B9036: Statistics in Earth & Atmospheric Sciences Lecture 3: Probability Distributions (cont d) Instructor: Prof. Johnny Luo www.sci.ccny.cuny.edu/~luo Dates Topic Reading (Based on the 2 nd Edition

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. The best way to reduce a set of

More information

Statistics I Chapter 2: Analysis of univariate data

Statistics I Chapter 2: Analysis of univariate data Statistics I Chapter 2: Analysis of univariate data Numerical summary Central tendency Location Spread Form mean quartiles range coeff. asymmetry median percentiles interquartile range coeff. kurtosis

More information

9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives

9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical

More information

Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach

Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach Qatar PMI Meeting February 19, 2014 David T. Hulett, Ph.D. Hulett & Associates, LLC 1 The Traditional 3-point Estimate of Activity

More information

Description of Data I

Description of Data I Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret

More information

Unit 2 Statistics of One Variable

Unit 2 Statistics of One Variable Unit 2 Statistics of One Variable Day 6 Summarizing Quantitative Data Summarizing Quantitative Data We have discussed how to display quantitative data in a histogram It is useful to be able to describe

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

Describing Uncertain Variables

Describing Uncertain Variables Describing Uncertain Variables L7 Uncertainty in Variables Uncertainty in concepts and models Uncertainty in variables Lack of precision Lack of knowledge Variability in space/time Describing Uncertainty

More information

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class

More information

Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach

Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach Integrated Cost Schedule Risk Analysis Using the Risk Driver Approach David T. Hulett, Ph.D. Hulett & Associates 24rd Annual International IPM Conference Bethesda, Maryland 29 31 October 2012 (C) 2012

More information

Contents. An Overview of Statistical Applications CHAPTER 1. Contents (ix) Preface... (vii)

Contents. An Overview of Statistical Applications CHAPTER 1. Contents (ix) Preface... (vii) Contents (ix) Contents Preface... (vii) CHAPTER 1 An Overview of Statistical Applications 1.1 Introduction... 1 1. Probability Functions and Statistics... 1..1 Discrete versus Continuous Functions... 1..

More information

We will also use this topic to help you see how the standard deviation might be useful for distributions which are normally distributed.

We will also use this topic to help you see how the standard deviation might be useful for distributions which are normally distributed. We will discuss the normal distribution in greater detail in our unit on probability. However, as it is often of use to use exploratory data analysis to determine if the sample seems reasonably normally

More information

Maximum Likelihood Estimates for Alpha and Beta With Zero SAIDI Days

Maximum Likelihood Estimates for Alpha and Beta With Zero SAIDI Days Maximum Likelihood Estimates for Alpha and Beta With Zero SAIDI Days 1. Introduction Richard D. Christie Department of Electrical Engineering Box 35500 University of Washington Seattle, WA 98195-500 christie@ee.washington.edu

More information

Descriptive Statistics

Petra Petrovics Descriptive Statistics 2nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs

Exploring Data and Graphics

Exploring Data and Graphics Rick White Department of Statistics, UBC Graduate Pathways to Success Graduate & Postdoctoral Studies November 13, 2013 Outline Summarizing Data Types of Data Visualizing Data

Continuous Distributions

Quantitative Methods 2013 Continuous Distributions 1 The most important probability distribution in statistics is the normal distribution. Carl Friedrich Gauss (1777-1855) Normal curve A normal distribution

Empirical Rule (P148)

Interpreting the Standard Deviation Numerical Descriptive Measures for Quantitative data III Dr. Tom Ilvento FREC 408 We can use the standard deviation to express the proportion of cases that might fall

Descriptive Analysis

Descriptive Analysis HERTANTO WAHYU SUBAGIO Univariate Analysis Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable

4. DESCRIPTIVE STATISTICS

4. DESCRIPTIVE STATISTICS Descriptive Statistics is a body of techniques for summarizing and presenting the essential information in a data set. Eg: Here are daily high temperatures for Jan 16, 2009 in

Lecture Week 4 Inspecting Data: Distributions

Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013-2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your

AP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE

AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,

Math 140 Introductory Statistics. First midterm September

Math 140 Introductory Statistics First midterm September 23 2010 Box Plots Graphical display of 5 number summary Q1, Q2 (median), Q3, max, min Outliers If a value is more than 1.5 times the IQR from the

Manager Comparison Report June 28, Report Created on: July 25, 2013

Manager Comparison Report June 28, 2013 Report Created on: July 25, 2013 Page 1 of 14 Performance Evaluation Manager Performance Growth of $1 Cumulative Performance & Monthly s

Chapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)

Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You're all set to drop

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of

Math146 - Chapter 3 Handouts. The Greek Alphabet. Source: Page 1 of 39

Source: www.mathwords.com The Greek Alphabet Page 1 of 39 Some Miscellaneous Tips on Calculations Examples: Round to the nearest thousandth 0.92431 0.75693 CAUTION! Do not truncate numbers! Example: 1

Fundamentals of Statistics

CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25

Handout 4 numerical descriptive measures part 2 Calculating Mean for Grouped Data Mean for population data: $\mu = \frac{\sum mf}{N}$ Mean for sample data: $\bar{x} = \frac{\sum mf}{n}$ where m is the midpoint and f is the frequency of a class. Example

Numerical Measurements

El-Shorouk Academy Acad. Year : 2013 / 2014 Higher Institute for Computer & Information Technology Term : Second Year : Second Department of Computer Science Statistics & Probabilities Section # 3 Numerical

Descriptive Statistics Bios 662

Descriptive Statistics Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-19 08:51 BIOS 662 1 Descriptive Statistics Descriptive Statistics Types of variables

Regression Analysis and Quantitative Trading Strategies. χtrading Butterfly Spread Strategy

Regression Analysis and Quantitative Trading Strategies χtrading Butterfly Spread Strategy Michael Beven June 3, 2016 University of Chicago Financial Mathematics 1 / 25 Overview 1 Strategy 2 Construction

DESCRIPTIVE STATISTICS II. Sorana D. Bolboacă

DESCRIPTIVE STATISTICS II Sorana D. Bolboacă OUTLINE Measures of centrality Measures of spread Measures of symmetry Measures of localization Mainly applied on quantitative variables 2 DESCRIPTIVE STATISTICS

AP Statistics Chapter 6 - Random Variables

AP Statistics Chapter 6 - Random 6.1 Discrete and Continuous Random Objective: Recognize and define discrete random variables, and construct a probability distribution table and a probability histogram

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar

Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar Today's Content Wrap-up from yesterday Frequency Distributions The Mean, Median and Mode Levels of Measurement and Measures of Central

Moments and Measures of Skewness and Kurtosis

Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values

Web Science & Technologies University of Koblenz Landau, Germany. Lecture Data Science. Statistics and Probabilities JProf. Dr.

Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics and Probabilities JProf. Dr. Claudia Wagner Data Science Open Position @GESIS Student Assistant Job in Data

Lecture Data Science

Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics Foundations JProf. Dr. Claudia Wagner Learning Goals How to describe sample data? What is mode/median/mean?

Review: Chebyshev's Rule. Measures of Dispersion II. Review: Empirical Rule. Review: Empirical Rule. Auto Batteries Example, p 59.

Review: Chebyshev's Rule Measures of Dispersion II Tom Ilvento STAT 200 Is based on a mathematical theorem for any data At least ¾ of the measurements will fall within ± 2 standard deviations from the

Numerical Descriptive Measures. Measures of Center: Mean and Median

Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where

Honors Statistics. 3. Discuss homework C2# Discuss standard scores and percentiles. Chapter 2 Section Review day 2016s Notes.

Honors Statistics Aug 23-8:26 PM 3. Discuss homework C2#11 4. Discuss standard scores and percentiles Aug 23-8:31 PM 1 Feb 8-7:44 AM Sep 6-2:27 PM 2 Sep 18-12:51 PM Chapter 2 Modeling Distributions of

DATA HANDLING Five-Number Summary

DATA HANDLING Five-Number Summary The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest

Summary of Information from Recapitulation Report Submittals (DR-489 series, DR-493, Central Assessment, Agricultural Schedule):

County: Martin Study Type: 2014 - In-Depth The department approved your preliminary assessment roll for 2014. Roll approval statistical summary reports and graphics for 2014 are attached for additional

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012

The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You're

ECON 214 Elements of Statistics for Economists

ECON 214 Elements of Statistics for Economists Session 3 Presentation of Data: Numerical Summary Measures Part 2 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh

Descriptive Statistics

Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

NOTES: Chapter 4 Describing Data

NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three

Mini-Lecture 3.1 Measures of Central Tendency

Mini-Lecture 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data 3. Explain what it means for a

MgtOp 215 TEST 1 (Golden) Spring 2016 Dr. Ahn. Read the following instructions very carefully before you start the test.

MgtOp 215 TEST 1 (Golden) Spring 2016 Dr. Ahn Name: ID: Section (Circle one): 4, 5, 6 Read the following instructions very carefully before you start the test. This test is closed book and notes; one summary

Some Characteristics of Data

Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key

Prepared By. Handaru Jati, Ph.D. Universitas Negeri Yogyakarta.

Prepared By Handaru Jati, Ph.D Universitas Negeri Yogyakarta handaru@uny.ac.id Chapter 7 Statistical Analysis with Excel Chapter Overview 7.1 Introduction 7.2 Understanding Data 7.2.1 Descriptive Statistics

Chapter 3 Descriptive Statistics: Numerical Measures Part A

Slides Prepared by JOHN S. LOUCKS St. Edward's University Slide 1 Chapter 3 Descriptive Statistics: Numerical Measures Part A Measures of Location Measures of Variability Slide Measures of Location Mean

Business Statistics 41000: Probability 3

Business Statistics 41000: Probability 3 Drew D. Creal University of Chicago, Booth School of Business February 7 and 8, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office: 404

Probability & Statistics Modular Learning Exercises

Probability & Statistics Modular Learning Exercises About The Actuarial Foundation The Actuarial Foundation, a 501(c)(3) nonprofit organization, develops, funds and executes education, scholarship and

Midterm Exam. b. What are the continuously compounded returns for the two stocks?

University of Washington Fall 004 Department of Economics Eric Zivot Economics 483 Midterm Exam This is a closed book and closed note exam. However, you are allowed one page of notes (double-sided). Answer

Numerical summary of data

Numerical summary of data Introduction to Statistics Measures of location: mode, median, mean, Measures of spread: range, interquartile range, standard deviation, Measures of form: skewness, kurtosis,

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data

Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017

Frequency Distribution Models 1- Probability Density Function (PDF)

Models 1- Probability Density Function (PDF) What is a PDF model? A mathematical equation that describes the frequency curve or probability distribution of a data set. Why modeling? It represents and summarizes

David Tenenbaum GEOG 090 UNC-CH Spring 2005

Simple Descriptive Statistics Review and Examples You will likely make use of all three measures of central tendency (mode, median, and mean), as well as some key measures of dispersion (standard deviation,

1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:

1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11

Advanced Budgeting Workshop. Contents are subject to change. For the latest updates visit

Advanced Budgeting Workshop Page 1 of 8 Why Attend 'Advanced Budgeting Workshop' is the second level course in budgeting after Meirc's 'Effective Budgeting and Cost ' course. It goes beyond the theory

Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19)

Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x

Financial Markets 11-1

Financial Markets Laurent Calvet calvet@hec.fr John Lewis john.lewis04@imperial.ac.uk Topic 11: Measuring Financial Risk HEC MBA Financial Markets 11-1 Risk There are many types of risk in financial transactions

Math Take Home Quiz on Chapter 2

Math 116 - Take Home Quiz on Chapter 2 Show the calculations that lead to the answer. Due date: Tuesday June 6th Name Time your class meets Provide an appropriate response. 1) A newspaper surveyed its