0. Key Statistical Measures of Data

Four principal features characterize a set of observations of a random variable: (i) the central tendency, or the value around which all other values are bunched; (ii) the spread of the sample data around the mean; (iii) the asymmetry, or skewness, of the spread of the data; and (iv) the peakedness of the data. These characteristics are expressed in terms of statistical properties that are estimated from the sample data.

0.. Measures of Central Tendency

In statistics various measures of central tendency are employed. Three important measures are the following.

(i) Arithmetic Mean: If $x_1, x_2, \dots, x_n$ represent a series of observations, the mean of this series is:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{0.5}$$

where $\bar{x}$ represents the sample mean; the mean of the population is generally denoted by $\mu$.

(ii) Mode: It is the value that occurs most frequently. It is the peak value of the PDF. A data set may have more than one peak.

(iii) Median: It is the middle value of the ranked observations of a data set. The median divides the distribution into two equal parts.

0.. Measure of Dispersion or Variation

Three statistical measures of variation of data are commonly used.

(i) Variance: It represents the scatter of the data about the mean. Variance is computed by:

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{0.}$$

A small value of variance implies that the values are bunched close to the mean.

(ii) Standard Deviation (SD): An estimate of the population standard deviation (s) is computed as the square root of the variance:

$$s = \left[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^{0.5} \tag{0.7}$$

When n < 30, the unbiased estimate of s is found by replacing n by n − 1 in the denominator. The Greek letter σ is used to denote the standard deviation of the population.

(iii) Coefficient of Variation (CV): It is a dimensionless parameter obtained by dividing the standard deviation by the mean:

$$C_v = \frac{s}{\bar{x}} \tag{0.8}$$

When the mean of the data is zero, $C_v$ is not defined. This coefficient is useful for comparing different populations. Given two samples of data, the one with the larger $C_v$ has more spread of the values around the mean.

Example 0.: Average annual flows (in cumec) at a river gauging site are given in the table below. Compute the mean, variance, standard deviation, and coefficient of variation of the flows.

Year 70 7 7 7 74 75 7 77 78 7 80
Flow (cumec) 5.5 45.4 48. 4.7 05. 0. 0. 4.4 7..8.0
Year 8 8 8 84 85 8 87 88 8 0
Flow (cumec) .4 40. 45...5 0.0 0..8 80.5. 4.

Solution: We have a total of values. The mean of the flows can be computed as

Mean = (5.5 + 45.4 + 48. +. 80.5 +. + 4.)/ = 7.8 cumec.

The variance can be computed by eq. (0.):

Variance s² = [(5.5 − 7.8)² + (45.4 − 7.8)² + (48. − 7.8)² + … + (. − 7.8)² + (4. − 7.8)²]/ = .4 cumec²

SD s = (.4)^0.5 = 8. cumec

CV = 8./7.8 = 0.47.

0.. Measures of Symmetry

Hydrologic data are usually not distributed symmetrically around the mean. If the data to the right of the mean are more spread out than those on the left then, by convention, the asymmetry is positive, and vice versa for negative asymmetry (see Fig. 0.4). If the data are symmetrically placed around the mean, the measure of symmetry is zero. The third moment of the data about the mean is used to indicate symmetry and is given by:

$$M_3 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3 \tag{0.}$$

It is easy to see that this moment is zero if the data are symmetrical. Otherwise, $M_3$ takes a positive or negative value. Because the third central moment has dimensions equal to the cube of the data, it is not useful for comparing different data sets. Being non-dimensional, the coefficient of skewness does not have this disadvantage and is preferred.

Coefficient of Skewness: A non-dimensional measure of the asymmetry of the distribution of the data is helpful when various data sets are to be compared, and the coefficient of skewness is one such measure. The coefficient of skewness ($C_s$) is given by:

$$C_s = \frac{n\sum_{i=1}^{n} (x_i - \bar{x})^3}{(n-1)(n-2)\,s^3} \tag{0.0}$$

Symmetrical frequency distributions have a very small or negligible skewness coefficient $C_s$, while asymmetrical frequency distributions have either positive or negative coefficients. A small value of $C_s$ indicates that the probability distribution may be approximated by the normal distribution, since $C_s = 0$ for that distribution. The symmetrical and skewed distributions are shown in Fig. 0.4.

[Figure: three panels of f(x) plotted against x. Negative skew: Mean, Median, Mode (left to right). Zero skew: Mean = Median = Mode. Positive skew: Mode, Median, Mean (left to right).]
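The measures defined so far (mean, variance, standard deviation, coefficient of variation, third moment, and coefficient of skewness) can be sketched in a few lines of Python. This is a minimal illustration, not the text's own procedure: the function name `summary_measures`, the `small_sample` flag, and the sample values are all hypothetical, and the data are invented for the demonstration rather than taken from the example above.

```python
import math


def summary_measures(x, small_sample=False):
    """Return mean, variance, SD, CV, third moment and skewness of x.

    small_sample=True replaces n by n - 1 in the variance denominator,
    as the text suggests for short records. All names are illustrative.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 values for the skewness coefficient")

    mean = sum(x) / n
    denom = n - 1 if small_sample else n
    variance = sum((xi - mean) ** 2 for xi in x) / denom
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("all values are identical; skewness is undefined")

    # The coefficient of variation is undefined when the mean is zero.
    cv = sd / mean if mean != 0 else None

    sum_cubes = sum((xi - mean) ** 3 for xi in x)
    m3 = sum_cubes / n                                # third central moment
    cs = n * sum_cubes / ((n - 1) * (n - 2) * sd ** 3)  # skewness coefficient

    return {"mean": mean, "variance": variance, "sd": sd,
            "cv": cv, "m3": m3, "cs": cs}


# Invented sample, chosen so the arithmetic is easy to check by hand.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = summary_measures(sample)
print(result)
```

For this invented sample the sketch gives a mean of 5, a variance of 4, a standard deviation of 2, a CV of 0.4, a third moment of 5.25 and a skewness coefficient of 1.0; the positive sign reflects the longer tail to the right of the mean.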

Fig. 0.4 Symmetrical and asymmetrical (+ve and −ve) skewed distributions.

0..4 Measures of Peakedness or Flatness

The measure used to denote the peakedness or flatness of the frequency distribution near its centre is known as the kurtosis coefficient. This coefficient is computed by:

$$C_k = \frac{n^2\sum_{i=1}^{n} (x_i - \bar{x})^4}{(n-1)(n-2)(n-3)\,s^4} \tag{0.}$$

The normal distribution has a kurtosis of 3. If a data set has a relatively greater concentration near the mean than the normal distribution, the kurtosis will be greater than 3. Conversely, if the data have a relatively smaller concentration near the mean than the normal distribution, the kurtosis will be less than 3.

Example 0.: Compute the coefficient of skewness and the coefficient of kurtosis of the data of Example 0..

Solution: The coefficient of skewness can be computed by eq. (0.0):

C_s = [/(*0*8.^3)] × [(5.5 − 7.8)^3 + (45.4 − 7.8)^3 + … + (. − 7.8)^3 + (4. − 7.8)^3] = .8

A positive value of C_s implies that the probability distribution of the data has a heavy tail to the right. Kurtosis can be computed by eq. ():

C_k = [*/(*0**8.^4)] × [(5.5 − 7.8)^4 + (45.4 − 7.8)^4 + … + (. − 7.8)^4 + (4. − 7.8)^4] = .45

Since the kurtosis is less than 3, the data values are less concentrated around the mean than in the normal distribution; that is, the peak of the distribution is flatter than that of the normal distribution.

0. Graphical Presentation of Data

Graphical presentation gives good insight into the behavior and variation of the data. To present the data graphically in the form of histograms, a frequency table is prepared. For this purpose the range of the data is divided into a number of intervals of convenient size, and the frequency of values occurring in each interval is entered alongside. The appearance of a frequency histogram depends upon the selection of the class interval. If the class intervals are very large, the table is compact but details may be lost. If the intervals are too small, the table may be too bulky. The following guidelines may be considered while choosing the class interval:

(a) Brooks and Carruthers rough guide: Number of classes ≤ 5 log (sample size) (0.)

(b) Charlier's rule of thumb: w = (maximum value − minimum value)/20 (0.)

where w is the size of the class interval. In general the number of classes varies between and 5.

To prepare the frequency table, the steps given below can be followed:

(i) Arrange the variable ($X_i$) in increasing or decreasing order of magnitude.
(ii) Decide the number of class intervals (NC) and the size of the class interval ΔX.
(iii) Divide the ordered observations $X_i$ into NC intervals.
(iv) Determine the absolute frequency $n_j$ as the number of observations that fall in the j-th class interval, j = 1, …, NC.
(v) Compute the relative frequencies of the various classes as $n_j/n$, j = 1, …, NC, where n is the number of observations.
(vi) Compute the cumulative relative frequencies $F_j$, j = 1, …, NC.
(vii) Plot the relative frequencies as well as the cumulative relative frequencies, with the group interval as abscissa and the relative or cumulative relative frequencies as ordinate.

Example 0.: The annual flow of Sabarmati River at Dharoi is plotted in Fig. 0.5 for the period 88-5. Plot the histogram and the cumulative histogram.

Fig. 0.5 Plot of the annual flow of the river.

Solution: After examining the data having 8 values, the class interval was chosen as 00 MCM. Table 0. shows the mid-values of the classes into which the data have been divided, the frequency of values in each class, and the cumulative frequencies. There are 7 classes.

Table 0. Mid-values and frequencies of various classes of example data.

Mid-value of class (MCM) 50 50 50 450 550 50 750 850 50 050 50 50
Frequency 5
Cumulative frequency 5 5 44 5 7 78 84 85 0
50 450 550 50 750 0 0 5 5 7 7 8

The cumulative histogram of the annual flow of Sabarmati River at Dharoi for the period 88-5 (8 years) is plotted in Fig. 0..

Fig. 0. Histogram of the annual river flows.

Fig. 0.7 Cumulative histogram of annual flows of the river.
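The frequency-table steps (i) to (vi) listed earlier can be sketched as follows. This is only an illustration under stated assumptions: the function name `frequency_table` is hypothetical, the class width is taken as the data range divided by the chosen number of classes, the largest value is counted in the last class, and the sample values are invented rather than taken from the Sabarmati record.

```python
def frequency_table(values, num_classes):
    """Build class mid-values, absolute, relative and cumulative
    relative frequencies for equal-width classes.

    Assumption: the classes span [min, max] of the data and the
    maximum value is counted in the last class.
    """
    ordered = sorted(values)                  # step (i): arrange in order
    n = len(ordered)
    lowest, highest = ordered[0], ordered[-1]
    if highest == lowest:
        raise ValueError("all values are identical; no class width")

    width = (highest - lowest) / num_classes  # step (ii): class size

    counts = [0] * num_classes                # steps (iii) and (iv)
    for v in ordered:
        j = min(int((v - lowest) / width), num_classes - 1)
        counts[j] += 1

    mid_values = [lowest + (j + 0.5) * width for j in range(num_classes)]
    relative = [c / n for c in counts]        # step (v)

    cumulative = []                           # step (vi)
    running = 0
    for c in counts:
        running += c
        cumulative.append(running / n)

    return {"mid_values": mid_values, "counts": counts,
            "relative": relative, "cumulative": cumulative}


# Invented data for illustration only.
flows = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
table = frequency_table(flows, num_classes=4)
for mid, c, rel, cum in zip(table["mid_values"], table["counts"],
                            table["relative"], table["cumulative"]):
    print(f"{mid:6.1f} {c:3d} {rel:6.2f} {cum:6.2f}")
```

Plotting the `relative` and `cumulative` columns against the class mid-values corresponds to step (vii); the plotting itself is left out so the sketch needs no external library.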