NOTES ON ESTIMATION AND CONFIDENCE INTERVALS

MICHAEL N. KATEHAKIS

1. Estimation

Estimation is the branch of statistics that deals with estimating the values of parameters of an underlying distribution based on observed/empirical data. The parameters describe the physical law that generates the observed values of the data. An estimator is a function of the observable sample data that is used to estimate an unknown population parameter; an estimate is the result of actually applying that function to a particular set of data. For example, to estimate the proportion p of a population of voters who will vote for a particular candidate, the proportion p is the unobservable parameter, and the estimator is p̂ = X/n, based on a random sample of n voters, where X is the number of sampled voters who favor the candidate.

Often, many estimators are possible for a given parameter, and some are better than others. The main criteria used to choose one estimator over others are unbiasedness, consistency, efficiency and robustness. These properties are illustrated in the examples below.

Bias, Mean Squared Error and Variance of an Estimator

For a point estimator θ̂ of a parameter θ:

The error of θ̂ is θ̂ - θ.

The bias of θ̂ is defined as B(θ̂) = E(θ̂) - θ. θ̂ is an unbiased estimator of θ if and only if B(θ̂) = 0 for all θ, or, equivalently, if and only if E(θ̂) = θ for all θ.

The mean squared error of θ̂ is defined as

    MSE(θ̂) = E[(θ̂ - θ)²].

Note that

    MSE(θ̂) = Var(θ̂) + (B(θ̂))²,

i.e. mean squared error = variance + square of bias.

A biased estimator is a statistic whose expectation differs from the value of the quantity being estimated.
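The decomposition MSE = variance + bias² is easy to check numerically. Below is a minimal Monte Carlo sketch, assuming Python with numpy is available; the shrunken sample mean µ̂ = n X̄ / (n + 1) is a made-up estimator, chosen only because it is deliberately biased.

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 20, 200_000

# Simulate many samples and apply the (biased) shrunken-mean estimator to each.
x = rng.normal(mu, sigma, size=(reps, n))
estimates = n * x.mean(axis=1) / (n + 1)

bias = estimates.mean() - mu              # B(theta_hat) = E(theta_hat) - theta
var = estimates.var()                     # Var(theta_hat)
mse = np.mean((estimates - mu) ** 2)      # E[(theta_hat - theta)^2]

print(f"bias^2 + var = {bias**2 + var:.5f}, mse = {mse:.5f}")  # the two agree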

The Standard Error of an estimator θ̂ is the square root of the variance of θ̂, i.e. the standard deviation of θ̂.

Remark. The term bias is used for two different concepts. A biased sample is a statistical sample in which some members of the population are more likely to be chosen for the sample than others. A biased estimator is one that for some reason on average over- or underestimates the quantity being estimated.

In the 1936 US presidential election polls, the Literary Digest held a poll that forecast that Alfred M. Landon would defeat Franklin Delano Roosevelt by 57% to 43%. George Gallup, using a much smaller sample (300,000 rather than 2,000,000), predicted that Roosevelt would win, and he was right. What went wrong with the Literary Digest poll? They had used lists of telephone and automobile owners to select their sample. In those days, these were luxuries, so their sample consisted mainly of middle- and upper-class citizens. These voted in their majority for Landon, but the lower classes voted for Roosevelt. Because the Digest's sample was biased towards wealthier citizens, their forecast for the election was incorrect, even though it did correctly predict the proportion of middle- and upper-class voters who voted for Landon!

Example 1. Consider a random sample X_1, ..., X_n of i.i.d. observations from the Normal distribution with mean µ and variance σ², where n ≥ 3. Two estimators for the mean µ are:

    X̄_n = (1/n) Σ_{i=1}^n X_i   and   X̄_2 = (1/2) Σ_{i=1}^2 X_i = (X_1 + X_2)/2.

We have:

(1) Both are unbiased: E[X̄_n] = µ and E[X̄_2] = µ.
(2) Only X̄_n is a consistent estimator: X̄_n → µ for large n.
(3) The efficiency of X̄_n, Var(X̄_n) = σ²/n, is better (smaller is better) than that of X̄_2, which is Var(X̄_2) = σ²/2.
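A quick simulation makes the contrast between X̄_n and X̄_2 concrete. This is a minimal sketch, assuming numpy and arbitrary parameter values: both estimators average out near µ, but only the variance of X̄_n shrinks as n grows.

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, reps = 10.0, 3.0, 20_000

for n in (5, 50, 500):
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar_n = x.mean(axis=1)          # X-bar_n: uses all n observations
    xbar_2 = x[:, :2].mean(axis=1)   # X-bar_2: uses only the first two
    print(n,
          round(xbar_n.mean(), 3), round(xbar_n.var(), 4),   # -> mu, sigma^2/n
          round(xbar_2.mean(), 3), round(xbar_2.var(), 4))   # -> mu, sigma^2/2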

Example 2. Consider a random sample X_1, ..., X_n of i.i.d. observations from the Normal distribution with mean µ and variance σ², where n ≥ 3. Two estimators for the variance σ² are:

    S² = (1/(n-1)) Σ_{i=1}^n (X_i - X̄)²   and   σ̂² = (1/n) Σ_{i=1}^n (X_i - X̄)².

Then σ̂² is a biased estimator of σ² with bias B(σ̂²) = E[σ̂²] - σ² = -σ²/n, while the bias of S² is B(S²) = 0, i.e., S² is unbiased. However, σ̂² has a smaller variance, hence better efficiency, than S². Indeed, we have:

    E[σ̂²] = ((n-1)/n) σ²,      Var(σ̂²) = 2(n-1) σ⁴ / n²,
    E[S²]  = σ²,               Var(S²)  = 2 σ⁴ / (n-1).

The above properties can be shown (Cochran's theorem in mathematical statistics) using the fact that (1/σ²) Σ_{i=1}^n (X_i - X̄)² follows the χ²_{n-1} distribution. For this distribution it is known that E[χ²_{n-1}] = n-1 and Var(χ²_{n-1}) = 2(n-1).

Remark. A direct proof of the unbiasedness of S² and the biasedness of σ̂² is as follows:

    E[ Σ_{i=1}^n (X_i - X̄)² ]
      = E[ Σ_{i=1}^n ((X_i - µ) - (X̄ - µ))² ]
      = Σ_{i=1}^n E[(X_i - µ)²] - 2 Σ_{i=1}^n E[(X_i - µ)(X̄ - µ)] + n E[(X̄ - µ)²]
      = n σ² - 2 Σ_{i=1}^n (1/n) Σ_{j=1}^n E[(X_i - µ)(X_j - µ)] + (1/n) Σ_{j=1}^n Σ_{k=1}^n E[(X_j - µ)(X_k - µ)]
      = n σ² - 2 σ² + σ²
      = (n-1) σ²,

where the last step uses the fact that E[(X_i - µ)(X_j - µ)] = σ² if i = j and 0 otherwise, by independence. Using the above we get the following:

    E[S²]  = (1/(n-1)) E[ Σ_{i=1}^n (X_i - X̄)² ] = (n-1)σ²/(n-1) = σ²,
    E[σ̂²] = (1/n) E[ Σ_{i=1}^n (X_i - X̄)² ] = ((n-1)/n) σ² ≠ σ².
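The bias of σ̂² and the unbiasedness of S² can also be checked by simulation. A minimal sketch, assuming numpy; the sample size and σ are arbitrary illustration values.

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.0, 2.0, 10, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)         # S^2: divisor n-1 (unbiased)
sig2_hat = x.var(axis=1, ddof=0)   # sigma-hat^2: divisor n (biased)

print("E[S^2]          ~", s2.mean(),       "  target sigma^2            =", sigma**2)
print("E[sigma-hat^2]  ~", sig2_hat.mean(), "  target (n-1)/n*sigma^2    =", (n - 1) / n * sigma**2)
print("Var[S^2]        ~", s2.var(),        "  target 2*sigma^4/(n-1)    =", 2 * sigma**4 / (n - 1))
print("Var[sigma-hat^2]~", sig2_hat.var(),  "  target 2*(n-1)*sigma^4/n^2 =", 2 * (n - 1) * sigma**4 / n**2)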

2. Confidence intervals - Interval estimation

In interval estimation we use sample data to calculate an interval of probable values for an unknown population parameter. The most prevalent form of interval estimation is the confidence interval. A confidence interval (CI) for a population parameter is an interval between two numbers with an associated probability p = 1 - α. Confidence intervals are computed from a random sample of an underlying population, in such a way that if the sampling were repeated numerous times and the confidence interval recalculated from each sample according to the same method, a proportion p of the confidence intervals would contain the population parameter in question.

2.1. Simple examples. Suppose X_1, ..., X_n is an independent sample from a normally distributed population with mean µ and variance σ². Let

    X̄ = (X_1 + ... + X_n)/n   and   S² = (1/(n-1)) Σ_{i=1}^n (X_i - X̄)².

Then

    Z = (X̄ - µ) / (σ/√n)   and   T = (X̄ - µ) / (S/√n)

have, respectively, a Normal distribution with mean 0 and variance 1, and a Student's t-distribution with n-1 degrees of freedom. Hence

    P[ -z_{α/2} ≤ Z ≤ z_{α/2} ] = 1 - α,              if σ is known, and
    P[ -t_{n-1,α/2} ≤ T ≤ t_{n-1,α/2} ] = 1 - α,      if σ is not known.

We can use the above to get confidence intervals for the unknown parameter µ as follows. The 1 - α confidence intervals for µ are:

    X̄ - z_{α/2} σ/√n ≤ µ ≤ X̄ + z_{α/2} σ/√n,            if σ is known,
    X̄ - t_{n-1,α/2} S/√n ≤ µ ≤ X̄ + t_{n-1,α/2} S/√n,    if σ is not known.
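Both intervals in a minimal numerical sketch, assuming numpy and scipy are available; the data and the "known" σ are made-up illustration values.

import numpy as np
from scipy import stats

x = np.array([9.2, 10.1, 10.8, 9.7, 10.4, 9.9, 10.6, 10.0])  # illustrative sample
n, alpha = len(x), 0.05
xbar, s = x.mean(), x.std(ddof=1)

# sigma known: z-interval (sigma = 0.5 is simply assumed here)
sigma = 0.5
z = stats.norm.ppf(1 - alpha / 2)
print("z-interval:", xbar - z * sigma / np.sqrt(n), xbar + z * sigma / np.sqrt(n))

# sigma unknown: t-interval based on S and the t_{n-1} quantile
t = stats.t.ppf(1 - alpha / 2, df=n - 1)
print("t-interval:", xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))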

2.2. The process of constructing confidence intervals consists of three steps.

1. Identify a test statistic TS that has a known distribution which does not depend on the parameter of interest; see Z and T above.
2. Compute the confidence region, i.e. the set A of sample values of the test statistic TS such that P(TS ∈ A) = 1 - α.
3. Rephrase the relation TS ∈ A into a relation of the form L ≤ parameter of interest ≤ U, where L and U are respectively the lower and the upper limits of the confidence interval.

In the simple examples above we have: if σ is known, A = [-z_{α/2}, z_{α/2}] and

    L = X̄ - z_{α/2} σ/√n,   U = X̄ + z_{α/2} σ/√n.

If σ is not known, A = [-t_{n-1,α/2}, t_{n-1,α/2}] and

    L = X̄ - t_{n-1,α/2} S/√n,   U = X̄ + t_{n-1,α/2} S/√n.

We next give a comprehensive list of test statistics.

Table 1. Single population means

  One-sample z-test:
      z = (X̄ - µ_0) / (σ/√n)
      Assumptions: Normal distribution, or n ≥ 30, and σ known.

  Two-sample z-test:
      z = ((X̄_1 - X̄_2) - (µ_1 - µ_2)) / √( σ_1²/n + σ_2²/m )
      Assumptions: Normal distributions and independent observations, σ_1 and σ_2 known.

  One-sample t-test:
      t = (X̄ - µ_0) / (S/√n),   df = n - 1
      Assumptions: Normal population, or n ≥ 30, and σ unknown.
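As an illustration of the one-sample rows of Table 1, the sketch below computes the one-sample z and t statistics for a made-up sample (numpy and scipy assumed; µ_0 and the "known" σ are arbitrary).

import numpy as np
from scipy import stats

x = np.array([12.1, 11.8, 12.6, 12.4, 11.9, 12.2, 12.7, 12.0])  # illustrative data
mu0, sigma_known, n = 12.0, 0.4, len(x)

z = (x.mean() - mu0) / (sigma_known / np.sqrt(n))    # one-sample z (sigma known)
t = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))  # one-sample t (sigma unknown)

print("z =", z, " two-sided p =", 2 * stats.norm.sf(abs(z)))
print("t =", t, " two-sided p =", 2 * stats.t.sf(abs(t), df=n - 1))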

Table 2. Variances

  One-sample χ² test:
      χ²_{n-1} = (n-1) S² / σ²
      Assumptions: Normal distribution, or n ≥ 30, and σ² known under H_0.

  F-test:
      F(n-1, m-1) = S_1² / S_2²
      Assumptions: Normal distributions and independent samples.

Table 3. Tests for proportions

  One-proportion z-test:
      z = (p̂ - p) / √( p(1 - p)/n )
      Assumptions: np ≥ 10 and n(1 - p) ≥ 10.

  Two-proportion z-test (equal variances):
      z = ((p̂_1 - p̂_2) - (p_1 - p_2)) / √( p̂(1 - p̂)(1/n + 1/m) ),   where p̂ = (X_1 + X_2)/(n + m)
      Assumptions: n p̂_1 ≥ 6, n(1 - p̂_1) ≥ 6, m p̂_2 ≥ 6 and m(1 - p̂_2) ≥ 6.

  Two-proportion z-test (unequal variances):
      z = ((p̂_1 - p̂_2) - (p_1 - p_2)) / √( p̂_1(1 - p̂_1)/n + p̂_2(1 - p̂_2)/m )
      Assumptions: n p̂_1 ≥ 6, n(1 - p̂_1) ≥ 6, m p̂_2 ≥ 6 and m(1 - p̂_2) ≥ 6.
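A minimal sketch of the pooled (equal-variance) two-proportion z-test from Table 3, assuming numpy and scipy; the counts are made up.

import numpy as np
from scipy import stats

x1, n = 46, 200    # successes and sample size in group 1 (illustrative)
x2, m = 30, 180    # successes and sample size in group 2 (illustrative)

p1_hat, p2_hat = x1 / n, x2 / m
p_pool = (x1 + x2) / (n + m)    # pooled proportion, used under H0: p1 = p2

z = (p1_hat - p2_hat) / np.sqrt(p_pool * (1 - p_pool) * (1 / n + 1 / m))
print("z =", z, " two-sided p =", 2 * stats.norm.sf(abs(z)))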

Table 4. Other common test statistics

  Paired t-test:
      t = (D̄ - d_0) / (S_D/√n),   df = n - 1,   where D_i = X_i - Y_i
      Assumptions: n ≥ 30 (or normal differences) and σ unknown.

  Two-sample pooled t-test:
      t = ((X̄_1 - X̄_2) - (µ_1 - µ_2)) / ( S_p √(1/n + 1/m) ),
      S_p² = ((n-1) S_1² + (m-1) S_2²) / (n + m - 2),   df = n + m - 2
      Assumptions: Normal populations, or n + m > 40, and σ_1 = σ_2, where σ_1, σ_2 are unknown.

  Two-sample unpooled t-test:
      t = ((X̄_1 - X̄_2) - (µ_1 - µ_2)) / √( S_1²/n + S_2²/m ),
      df = (n-1)(m-1) / ( (m-1) c² + (n-1)(1 - c)² ),   where c = (S_1²/n) / (S_1²/n + S_2²/m),
      or df = min{n, m}
      Assumptions: Normal populations, or n + m > 40, independent observations, and σ_1, σ_2 unknown and not assumed equal.
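The unpooled (Welch-type) two-sample t statistic and its approximate df, computed directly from the Table 4 formulas, in a minimal sketch; numpy and scipy assumed, data illustrative. The last line cross-checks against scipy's built-in unequal-variance t-test.

import numpy as np
from scipy import stats

x = np.array([5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.0])  # group 1 (illustrative)
y = np.array([4.2, 4.6, 4.1, 4.8, 4.4, 4.5])       # group 2 (illustrative)
n, m = len(x), len(y)
s1, s2 = x.var(ddof=1), y.var(ddof=1)

t = (x.mean() - y.mean()) / np.sqrt(s1 / n + s2 / m)

c = (s1 / n) / (s1 / n + s2 / m)
df = (n - 1) * (m - 1) / ((m - 1) * c**2 + (n - 1) * (1 - c) ** 2)

print("t =", t, " df =", df, " two-sided p =", 2 * stats.t.sf(abs(t), df))
print(stats.ttest_ind(x, y, equal_var=False))  # same t, essentially the same p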

3. Important Distributions

Student's t-distribution

Student's t distribution is the distribution of the random variable T_{n-1}, which is the best that we can do when we do not know σ, where

    T_{n-1} = (X̄ - µ) / (S/√n),   S² = (1/(n-1)) Σ_{i=1}^n (X_i - X̄)².

The formula for the probability density function of the T_n distribution is

    f(x) = Γ((n+1)/2) / ( Γ(n/2) √(nπ) ) · (1 + x²/n)^(-(n+1)/2),

where n is the shape parameter (the degrees of freedom) and Γ is the gamma function. The formula for the gamma function is

    Γ(a) = ∫_0^∞ t^(a-1) e^(-t) dt.

As n increases, the t-distribution approaches the standard normal distribution. The mean and variance of the t-distribution are

    E[T_n] = µ = 0   and   σ²(T_n) = n/(n-2)   (for n > 2).
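A small check of the density formula against a library implementation, and of the convergence to the standard normal; Python with scipy assumed, and x = 1.3 is an arbitrary evaluation point.

import math
from scipy import stats

def t_pdf(x, n):
    # Student's t density with n degrees of freedom, from the formula above.
    return (math.gamma((n + 1) / 2)
            / (math.gamma(n / 2) * math.sqrt(n * math.pi))
            * (1 + x**2 / n) ** (-(n + 1) / 2))

x = 1.3
for n in (2, 5, 30, 200):
    print(n, t_pdf(x, n), stats.t.pdf(x, n))  # formula vs scipy: they agree
print("normal:", stats.norm.pdf(x))           # the t_n values approach this as n grows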

The chi-square distribution

A statistic χ²_n that results when n independent variables Z_i with standard normal distributions are squared and summed has the χ² distribution with n degrees of freedom:

    χ²_n = Σ_{i=1}^n Z_i².

The formula for the probability density function of the chi-square distribution is

    f(x) = e^(-x/2) x^(n/2 - 1) / ( 2^(n/2) Γ(n/2) ),   where x ≥ 0,

where n is the shape parameter and Γ is the gamma function, defined above. In a testing context, the chi-square distribution is treated as a standardized distribution (i.e., no location or scale parameters). However, in a distributional modeling context (as with other probability distributions), the chi-square distribution itself can be transformed with a location parameter µ and a scale parameter σ.

Properties

(1) µ = E[χ²_n] = n,
(2) Median ≈ n - 2/3 for large n,
(3) Mode = n - 2 (for n ≥ 2),
(4) Standard deviation: σ(χ²_n) = √(2n),
(5) Coefficient of variation: c_v = σ/µ = √(2/n).

Since the chi-square distribution is typically used to develop hypothesis tests and confidence intervals and rarely for modeling applications, we omit any discussion of parameter estimation.

Comments. The chi-square distribution is used in many cases for the critical regions of hypothesis tests and in determining confidence intervals. Two common examples are the chi-square test for independence in an R×C contingency table and the chi-square test to determine whether the standard deviation of a population is equal to a pre-specified value. The cumulative distribution function is

    F(x) = γ(n/2, x/2) / Γ(n/2),   where x ≥ 0,

where Γ is the gamma function defined above and γ is the (lower) incomplete gamma function. The formula for the incomplete gamma function is

    γ(a, x) = ∫_0^x t^(a-1) e^(-t) dt.
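As one of the uses mentioned above, a 1 - α confidence interval for a population variance σ² follows from the fact that (n-1)S²/σ² has the χ²_{n-1} distribution. A minimal sketch, assuming numpy and scipy; the data are illustrative.

import numpy as np
from scipy import stats

x = np.array([14.2, 15.1, 13.8, 14.9, 15.4, 14.5, 13.9, 15.0])  # illustrative data
n, alpha = len(x), 0.05
s2 = x.var(ddof=1)

lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(f"{1 - alpha:.0%} CI for sigma^2: ({lower:.3f}, {upper:.3f})")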

F Distribution

A statistic F_{n,m} that is the ratio of two statistics with chi-square distributions with n and m degrees of freedom, respectively, where each chi-square statistic has first been divided by its degrees of freedom, i.e.,

    F_{n,m} = (χ²_n / n) / (χ²_m / m),

has the F distribution with n and m degrees of freedom, on the domain [0, ∞). The formula for the probability density function f_{n,m}(x) of F_{n,m} is given by

    f_{n,m}(x) = ( Γ((n+m)/2) n^(n/2) m^(m/2) / ( Γ(n/2) Γ(m/2) ) ) · x^(n/2 - 1) / (m + nx)^((n+m)/2),   where x ≥ 0,

and n and m are the shape parameters. In a testing context, the F distribution is also treated as a standardized distribution (i.e., no location or scale parameters).

Properties

(1) E(F_{n,m}) = m/(m - 2), for m > 2,
(2) Var(F_{n,m}) = 2 m² (n + m - 2) / ( n (m - 2)² (m - 4) ), for m > 4.

Department of Management Science and Information Systems, Rutgers Business School, Newark and New Brunswick, 180 University Avenue, Newark, NJ 07102-1895
E-mail address: mk@rci.rutgers.edu