point estimator a random variable (like P or X) whose values are used to estimate a population parameter


Estimation

We have noted that the polling problem, which attempts to estimate the proportion $p$ of Successes in some population, and the measurement problem, which attempts to estimate the mean value $\mu$ of some quantity within a population, are situations we handle by sampling data. To estimate a population proportion $p$, it makes sense to compute the sample proportion $\hat{p}$; but since it is wisest to choose random samples, the computation of this statistic is best regarded as a random variable $\hat{P}$. Similarly, to estimate the population mean $\mu$, it makes sense to compute the sample mean $\bar{x}$; but since it is wisest to choose random samples, the computation of this statistic is best regarded as a random variable $\bar{X}$.

point estimator: a random variable (like $\hat{P}$ or $\bar{X}$) whose values are used to estimate a population parameter

point estimate: any value of a point estimator (like $\hat{p}$ or $\bar{x}$) derived from a particular sample

A point estimator is said to be
- unbiased if its expected value is equal to the parameter being estimated;
- efficient if its standard error is less than that of other point estimators;
- consistent if, as the sample size gets larger, it more accurately estimates the parameter.
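To make unbiasedness and consistency concrete, here is a minimal simulation sketch using only the Python standard library. The population proportion `p_true = 0.3`, the sample sizes, and the helper name `simulate_sample_proportions` are illustrative assumptions, not values from the slides.

```python
import random
import statistics

def simulate_sample_proportions(p, n, reps, rng):
    """Draw `reps` random samples of size n; return each sample's proportion of Successes."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)   # fixed seed so the run is repeatable
p_true = 0.3             # hypothetical population proportion

for n in (25, 100, 400):
    p_hats = simulate_sample_proportions(p_true, n, reps=2000, rng=rng)
    # Unbiased: the average of the estimates stays close to p_true.
    # Consistent: the spread of the estimates shrinks as n grows.
    print(n, round(statistics.mean(p_hats), 3), round(statistics.stdev(p_hats), 3))
```

In a run like this, the average of the simulated estimates should stay near 0.3 for every sample size, while their standard deviation should fall by roughly half each time the sample size is quadrupled.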

interval estimator: a range of values which is used to estimate a population parameter

confidence interval: an interval estimator which contains the parameter being estimated with a certain level of confidence (i.e., with a high probability)

margin of error: half the width of a confidence interval; confidence intervals are generally constructed as centered on some point estimate, that is, they have the form

Point estimate ± Margin of error

Confidence intervals for the population mean

Recall that the sampling distribution of the sample mean $\bar{X}$ is (at least approximately) normally distributed provided that either we are sampling from a normal population of $X$ values, or the sample is large (on the order of 30). Under this assumption, the Empirical Rule tells us that (roughly) 95% of all sampled values of $\bar{X}$ will lie within two standard deviations ($SD(\bar{X}) = \sigma/\sqrt{n}$) of its expected value ($E(\bar{X}) = \mu$). That is,

$$P\left(\mu - 2\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 2\frac{\sigma}{\sqrt{n}}\right) \approx 0.95.$$

We can do a bit better than this by replacing this approximate probability with a more accurate value: by inverting the normal probability distribution function, we find that 95% of values of a standard normal random variable $Z$ fall within 1.96 standard deviations of the mean:

$$P(-1.96 \le Z \le 1.96) = 0.95.$$

Thus,

$$P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95.$$
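The step of inverting the normal distribution function can be checked with the standard library's `statistics.NormalDist`. The following is a small sketch; the function name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Value z such that P(-z <= Z <= z) equals `confidence` for standard normal Z."""
    alpha = 1 - confidence
    # The central region leaves alpha/2 in each tail, so invert the CDF at 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

std_normal = NormalDist()
# Empirical Rule: probability of falling within two standard deviations.
print(round(std_normal.cdf(2) - std_normal.cdf(-2), 4))   # 0.9545
# The multiplier that gives exactly 95%.
print(round(z_critical(0.95), 2))                          # 1.96
```

This shows why two standard deviations is only a rough figure: it captures slightly more than 95% of the distribution.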

Next, we observe that since 95% of all sampled values of $\bar{X}$ lie within 1.96 standard deviations of the mean $\mu$, it is certainly true that, for those same samples, the mean $\mu$ lies within 1.96 standard deviations of the sampled mean $\bar{X}$. That is,

$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95.$$

This seemingly innocuous conclusion has far-reaching consequences: it states that 95% of all sampled values of $\bar{X}$ have the property that the unknown value of $\mu$ lies in the confidence interval

$$\bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}}.$$

The choice of 95% in this statement is merely convenient; it is not a necessary feature of the confidence interval. We may replace it with any suitably large level of confidence.

level of significance ($\alpha$): the probability that a confidence interval will not contain the parameter it intends to estimate

confidence level ($100(1-\alpha)\%$, where $1-\alpha$ is the confidence coefficient): the percentage of sampled values of a point estimate that produce confidence intervals containing the parameter they intend to estimate
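The claim that 95% of samples yield an interval containing $\mu$ can be checked by simulation. The sketch below assumes a normal population with hypothetical values $\mu = 10$, $\sigma = 2$ and samples of size 20; the function name `coverage` is illustrative.

```python
import math
import random
from statistics import NormalDist

def coverage(mu, sigma, n, confidence, reps, rng):
    """Fraction of simulated samples whose interval x_bar +/- z*sigma/sqrt(n) contains mu."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and compute its mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

# Hypothetical population: mu = 10, sigma = 2; samples of size 20.
print(coverage(mu=10.0, sigma=2.0, n=20, confidence=0.95, reps=5000, rng=random.Random(42)))
```

The printed fraction should land close to 0.95. Each individual interval either contains $\mu$ or does not; the 95% describes the long-run behaviour of the procedure.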

Thus, the confidence interval above estimates $\mu$ at the $\alpha = 0.05$ significance level and has a 95% confidence level. These choices both lead to the normal critical value $z_{\alpha/2} = 1.96$ (the value that cuts off an upper tail probability of $\alpha/2 = 0.025$) that appears in the confidence interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$. More generally, at any significance level $\alpha$, we obtain the following:

confidence interval for the mean: assuming that we are sampling from a normally distributed population of values of the random variable $X$, or if the sample is large enough ($n \ge 30$), the sampled mean value $\bar{X} = \bar{x}$ leads to a confidence interval for $\mu$ at level of significance $\alpha$ given by

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$$
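A minimal sketch of this formula in Python follows. The function name `mean_ci_known_sigma` and the numbers in the example ($\bar{x} = 50$, $\sigma = 6$, $n = 36$) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def mean_ci_known_sigma(x_bar, sigma, n, alpha=0.05):
    """Confidence interval x_bar +/- z_{alpha/2} * sigma / sqrt(n) for a known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    margin = z * sigma / math.sqrt(n)         # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50 from n = 36 observations, population sigma = 6.
low, high = mean_ci_known_sigma(x_bar=50.0, sigma=6.0, n=36)
print(round(low, 2), round(high, 2))   # 48.04 51.96
```

Here $\sigma/\sqrt{n} = 1$, so the margin of error is just the critical value 1.96.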

Notice the three quantities that determine the margin of error:

$$\text{margin of error} = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$$

The margin of error in a confidence interval estimate
- is smaller for populations with smaller standard deviation $\sigma$;
- is smaller for samples of greater size $n$;
- is smaller when larger significance levels $\alpha$ are employed.

However, since $\sigma$ is a fixed property of the population, the investigator has no means to lower this parameter. Only sample size $n$ and significance level $\alpha$ can be manipulated.

A more serious practical concern is manifest here: the confidence interval formula given above presumes that the investigator knows the value of $\sigma$. If the parameter $\mu$ is unknown, it is highly unlikely that $\sigma$ will also be known! That is, the investigator may be forced to estimate $\sigma$ first in order to develop an estimate for $\mu$.
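The three effects on the margin of error can be seen numerically. This sketch uses hypothetical baseline values ($\sigma = 10$, $n = 25$, $\alpha = 0.05$) and an illustrative function name.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, alpha):
    """Margin of error z_{alpha/2} * sigma / sqrt(n)."""
    return NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)

base = margin_of_error(sigma=10.0, n=25, alpha=0.05)           # hypothetical baseline
smaller_sigma = margin_of_error(sigma=5.0, n=25, alpha=0.05)   # less variable population
larger_n = margin_of_error(sigma=10.0, n=100, alpha=0.05)      # four times the sample size
larger_alpha = margin_of_error(sigma=10.0, n=25, alpha=0.10)   # lower confidence level

for label, value in [("baseline", base), ("sigma halved", smaller_sigma),
                     ("n quadrupled", larger_n), ("alpha = 0.10", larger_alpha)]:
    print(f"{label:14s} {value:.3f}")
```

Because $n$ enters through a square root, the sample size must be quadrupled to cut the margin of error in half.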

Confidence intervals and the $t_{df}$ distribution

There is an obvious solution to this problem: estimate the standard deviation $SD(\bar{X}) = \sigma/\sqrt{n}$ of the sampling distribution with the standard error $SE(\bar{X}) = s/\sqrt{n}$. This replaces the unknown parameter $\sigma$ with the known sample statistic $s$. However, this estimation carries with it its own sampling variability, so when we replace $SD(\bar{X})$ with $SE(\bar{X})$ in the confidence interval formula, the underlying sampling distribution no longer behaves enough like a normal distribution. The additional variability introduced by estimating $\sigma$ with $s$ causes more variability in the standardized sampling distribution of $\bar{X}$, producing a symmetric bell-shaped distribution, but with thicker tails than the normal distribution.

The precise nature of this new sampling distribution was discovered by W. S. Gosset in the early 1900s, who published his findings under the pseudonym Student. Thus, it is known as the Student $t_{df}$ distribution, or the $t$ distribution with $df$ degrees of freedom. Degrees of freedom is an additional parameter that satisfies $df = n - 1$.

Properties of the $t_{df}$ distribution
- like the standard normal distribution, the $t$ distribution is symmetric, asymptotic, and bell-shaped, with mean equal to 0;
- unlike the standard normal distribution, the $t$ distribution has thicker tails and a less prominent central peak, so its standard deviation is somewhat larger than 1;
- the smaller the $df$, the thicker the tails;
- as the number of $df$ increases, the tails of the $t$ distribution thin out and its central prominence increases in density, approximating more closely the standard normal density function.

Whereas the sampling distribution of $\bar{X}$ with $\sigma$ known is normal, so that the standardized variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

follows the standard normal distribution, the sampling distribution of $\bar{X}$ with $\sigma$ unknown is standardized so that the variable

$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

follows the Student $t_{df}$ distribution with $df = n - 1$ degrees of freedom.
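The thicker tails can be observed by simulating the statistic $T$ directly. The sketch below assumes a standard normal population and compares the fraction of $|T|$ values beyond 1.96 for a small and a larger sample size; the function name and the sample sizes are illustrative.

```python
import math
import random

def tail_fraction_of_t(n, cutoff, reps, rng):
    """Fraction of simulated T = (x_bar - mu) / (s / sqrt(n)) values with |T| > cutoff."""
    mu, sigma = 0.0, 1.0   # assumed normal population
    extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sample standard deviation s, using the n - 1 divisor.
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t = (x_bar - mu) / (s / math.sqrt(n))
        if abs(t) > cutoff:
            extreme += 1
    return extreme / reps

rng = random.Random(7)
# For a standard normal Z, P(|Z| > 1.96) = 0.05. With s in place of sigma the tails are heavier.
print("n = 5 :", tail_fraction_of_t(5, 1.96, 10000, rng))
print("n = 50:", tail_fraction_of_t(50, 1.96, 10000, rng))
```

With $n = 5$ (so $df = 4$) noticeably more than 5% of the simulated values should exceed 1.96 in absolute value, while with $n = 50$ the fraction should be much closer to 5%.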

confidence interval for the mean ($\sigma$ unknown): assuming that we are sampling from a normally distributed population of values of the random variable $X$, or if the sample is large enough ($n \ge 30$), the sampled mean value $\bar{X} = \bar{x}$ leads to a confidence interval for $\mu$ at level of significance $\alpha$ given by

$$\bar{x} \pm t_{\alpha/2,\,df}\frac{s}{\sqrt{n}}.$$
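The Python standard library has no $t$ distribution, so the sketch below finds $t_{\alpha/2,\,df}$ numerically: it integrates the $t$ density with Simpson's rule and locates the critical value by bisection. This is one possible approach, not the method used in the slides (a printed $t$ table or a statistics package would serve equally well). All function names and the data values are hypothetical.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3   # symmetry: half the probability lies below 0

def t_critical(alpha, df):
    """Critical value t_{alpha/2, df}, found by bisection on the CDF."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_ci_unknown_sigma(sample, alpha=0.05):
    """Confidence interval x_bar +/- t_{alpha/2, n-1} * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
    margin = t_critical(alpha, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical measurements, n = 10, so df = 9.
data = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 4.8, 5.0, 5.1, 4.9]
print(round(t_critical(0.05, 9), 3))
print(mean_ci_unknown_sigma(data))
```

The computed critical value for $df = 9$ can be compared against a $t$ table (2.262); it is larger than the normal value 1.96, which widens the interval to account for estimating $\sigma$.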

Confidence intervals for the population proportion

Recall that the sampling distribution of the sample proportion $\hat{P}$ is, by the Central Limit Theorem, approximated well by a normal distribution. This approximation is valid provided that we are sampling randomly and that the sample is large enough so that it contains at least 5 expected successes (that is, $np \ge 5$) and at least 5 expected failures (that is, $n(1-p) \ge 5$). Under these assumptions, we conclude that $\hat{P}$ is approximately normally distributed, with expected value $E(\hat{P}) = p$ (the population proportion) and standard deviation given by the formula $SD(\hat{P}) = \sqrt{p(1-p)/n}$. It follows that

$$P\left(p - 1.96\sqrt{\frac{p(1-p)}{n}} \le \hat{P} \le p + 1.96\sqrt{\frac{p(1-p)}{n}}\right) = 0.95,$$

and more generally, that

$$P\left(p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \hat{P} \le p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) = 1 - \alpha,$$

which can be recast in the form

$$P\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) = 1 - \alpha.$$

Consequently, in an entirely analogous manner to how we developed a confidence interval for the mean in the case that $\sigma$ is known, we can produce a confidence interval for the proportion $p$ with the formula

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}.$$

The difficulty with this is that the margin of error in this last formula depends on knowing $p$; but as $p$ is what we are trying to estimate in the first place, we do not know its value! Instead, we replace the standard deviation $SD(\hat{P}) = \sqrt{p(1-p)/n}$ in this formula with the standard error $SE(\hat{P}) = \sqrt{\hat{p}(1-\hat{p})/n}$ to obtain the following:

confidence interval for the proportion $p$: assuming that we compute a value $\hat{p}$ of $\hat{P}$ from a sample of size $n$ randomly selected from a population with true proportion of Success equal to $p$, where the sample is large enough so that it contains at least 5 expected successes ($np \ge 5$) and at least 5 expected failures ($n(1-p) \ge 5$), then a confidence interval for $p$ at level of significance $\alpha$ is given by

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$
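A short sketch of the proportion interval follows. Since $p$ is unknown in practice, the size check here uses the observed counts of successes and failures in place of the expected counts; that substitution is an assumption of this sketch, as are the function name and the example counts.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Confidence interval p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    # Assumption: observed counts stand in for the expected counts np and n(1 - p).
    if successes < 5 or n - successes < 5:
        raise ValueError("need at least 5 successes and 5 failures for the normal approximation")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)   # z times the standard error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 120 Successes among 400 respondents, so p_hat = 0.30.
low, high = proportion_ci(successes=120, n=400)
print(round(low, 3), round(high, 3))
```

For these hypothetical counts the margin of error is about 4.5 percentage points on either side of the estimate 0.30.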

Choosing the sample size

We have observed that the size of the margin of error in a confidence interval estimate decreases as the size of the sample increases; this is because the margin of error equals $z_{\alpha/2}\,\sigma/\sqrt{n}$ in the case of estimating a mean (with $\sigma$ known), and $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ in the case of estimating a proportion. We can use these facts to help choose the sample size $n$ so as to achieve a desired margin of error $D$: simply set $D$ equal to the appropriate quantity above and solve the equation for $n$:

choosing $n$ when estimating $\mu$: for a desired margin of error $D$, the minimum sample size required to estimate $\mu$ with a confidence interval at significance level $\alpha$ is

$$n = \left(\frac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^2$$

(rounded up to a whole number), where $\hat{\sigma}$ represents some reasonable estimate of $\sigma$ (perhaps the value of $s$, or even the crude estimate [range/4], taken from a pilot sample)
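As a worked illustration of the sample size formula for a mean, the sketch below rounds the result up, since a fractional observation is impossible. The function name and the planning values ($\hat{\sigma} = 15$, $D = 2$) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma_hat, D, alpha=0.05):
    """Smallest n with z_{alpha/2} * sigma_hat / sqrt(n) <= D."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma_hat / D) ** 2)

# Hypothetical planning values: sigma estimated as 15, desired margin of error 2.
print(sample_size_for_mean(sigma_hat=15.0, D=2.0))   # 217

# Crude estimate of sigma from a pilot sample's range (here a range of 60 gives 60/4 = 15).
print(sample_size_for_mean(sigma_hat=60.0 / 4, D=2.0))   # 217
```

Halving the desired margin of error to $D = 1$ would roughly quadruple the required sample size.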

choosing $n$ when estimating $p$: for a desired margin of error $D$, the minimum sample size required to estimate $p$ with a confidence interval at significance level $\alpha$ is

$$n = \left(\frac{z_{\alpha/2}}{D}\right)^2 \hat{p}(1-\hat{p})$$

where $\hat{p}$ represents some reasonable estimate of $p$ (perhaps the value of $\hat{p}$ taken from a pilot sample, or even the conservative estimate $\hat{p} = 50\%$)
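The proportion version can be sketched the same way. The default `p_hat=0.5` reflects the conservative estimate mentioned above, since $\hat{p}(1-\hat{p})$ is largest at 0.5; the function name and the 3-point margin are illustrative assumptions.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(D, p_hat=0.5, alpha=0.05):
    """Smallest n with z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n) <= D."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z / D) ** 2 * p_hat * (1 - p_hat))

# Hypothetical poll design: margin of error of 3 percentage points at 95% confidence.
print(sample_size_for_proportion(D=0.03))              # conservative p_hat = 0.5 -> 1068
print(sample_size_for_proportion(D=0.03, p_hat=0.2))   # pilot estimate p_hat = 0.2 -> 683
```

A pilot estimate far from 50% lowers the required sample size, while the conservative choice guarantees the margin of error whatever the true proportion turns out to be.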