Chapter 8: Sampling distributions of estimators


Chapter 8 continued

Chapter 8: Sampling distributions of estimators. Sections:
8.1 Sampling distribution of a statistic
8.2 The Chi-square distributions
8.3 Joint Distribution of the sample mean and sample variance (skip: p. 476-478)
8.4 The t distributions (skip: derivation of the pdf, p. 483-484)
8.5 Confidence intervals
8.6 Bayesian Analysis of Samples from a Normal Distribution
8.7 Unbiased Estimators
8.8 Fisher Information

Review from Sections 8.1-8.4

Chi-square distribution: $\chi^2_m$ is the same as Gamma$(\alpha = m/2, \beta = 1/2)$.

The $t_m$ distribution: if $Y \sim \chi^2_m$ and $Z \sim N(0, 1)$ are independent, then $Z/\sqrt{Y/m} \sim t_m$.

Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$.

If $\mu$ is known but $\sigma$ is not: $n\hat{\sigma}_0^2/\sigma^2 \sim \chi^2_n$, where $\hat{\sigma}_0^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \mu)^2$.

If both $(\mu, \sigma)$ are unknown:
$S_n/\sigma^2 \sim \chi^2_{n-1}$, where $S_n = \sum_{i=1}^n (X_i - \bar{X}_n)^2$, and
$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}} \sim t_{n-1}$, where $\hat{\sigma} = \left[\frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{n-1}\right]^{1/2}$.
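As a quick check of the last fact, here is a minimal Monte Carlo sketch (not from the slides; it assumes NumPy and SciPy, and the sample size, parameters, and seed are illustrative) comparing empirical quantiles of $\sqrt{n}(\bar{X}_n - \mu)/\hat{\sigma}$ against $t_{n-1}$ quantiles:

```python
# Monte Carlo check: sqrt(n)*(Xbar - mu)/sigma_hat should follow t with n-1 df.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, mu, sigma, reps = 20, 8.0, 2.0, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
sigma_hat = x.std(axis=1, ddof=1)          # divisor n-1, matching sigma-hat above
u = np.sqrt(n) * (xbar - mu) / sigma_hat

# Compare empirical quantiles with t_{n-1} quantiles
for q in (0.025, 0.5, 0.975):
    print(q, np.quantile(u, q), stats.t.ppf(q, df=n - 1))
```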

8.5 Confidence intervals

Confidence Interval: a frequentist tool. Say we want to estimate $\theta$, or in general $g(\theta)$. We also want to know how good that estimate is.

Def: Confidence Interval (CI). Let $X_1, \dots, X_n$ be a random sample from $f(x \mid \theta)$, where $\theta$ is unknown (but not random). Let $g(\theta)$ be a real-valued function and let $A$ and $B$ be statistics such that
$P(A < g(\theta) < B) \geq \gamma$ for all $\theta$.
The random interval $(A, B)$ is called a $100\gamma\%$ confidence interval for $g(\theta)$. If equality holds for all $\theta$, the CI is exact.

After the random variables $X_1, \dots, X_n$ have been observed and the values of $A = a$ and $B = b$ have been computed, the interval $(a, b)$ is called the observed confidence interval.

8.5 Confidence intervals

Confidence Interval: Mean of a Normal Distribution

Last time we saw the following example. Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Let
$\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \quad \text{and} \quad \hat{\sigma} = \left(\frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{n-1}\right)^{1/2}.$
Then we know that
$U = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}}$
has the $t_{n-1}$ distribution. We can therefore calculate $\gamma = P(-c < U < c)$. Turning this around, we get
$\gamma = P\left(\bar{X}_n - c\,\frac{\hat{\sigma}}{\sqrt{n}} < \mu < \bar{X}_n + c\,\frac{\hat{\sigma}}{\sqrt{n}}\right).$

8.5 Confidence intervals

Confidence Interval: Mean of a Normal Distribution

Let $T_m(x)$ denote the cdf of the $t_m$ distribution. Given $\gamma$ we can find $c$ so that $P(-c < U < c) = \gamma$:
$\gamma = P(-c < U < c) = 2T_{n-1}(c) - 1,$
since the $t$ distribution is symmetric around 0. Solving for $c$ we get
$c = T^{-1}_{n-1}\left(\frac{\gamma + 1}{2}\right),$
where $T^{-1}_{n-1}$ is the quantile function for the $t_{n-1}$ distribution. So a $100\gamma\%$ confidence interval for $\mu$ is
$\left(\bar{X}_n - T^{-1}_{n-1}\left(\frac{\gamma+1}{2}\right)\frac{\hat{\sigma}}{\sqrt{n}},\; \bar{X}_n + T^{-1}_{n-1}\left(\frac{\gamma+1}{2}\right)\frac{\hat{\sigma}}{\sqrt{n}}\right).$

8.5 Confidence intervals

Example: Hotdogs (Exercise 8.5.7 in the book)

Data on calorie content in 20 different beef hot dogs from Consumer Reports (June 1986 issue):
186, 181, 176, 149, 184, 190, 158, 139, 175, 148, 152, 111, 141, 153, 190, 157, 131, 149, 135, 132

Assume that these numbers are observed values from a random sample of twenty independent $N(\mu, \sigma^2)$ random variables, where $\mu$ and $\sigma^2$ are unknown. The observed sample mean and $\hat{\sigma}$ are $\bar{X}_n = 156.85$ and $\hat{\sigma} = 22.64201$. Find a 95% confidence interval for $\mu$.
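As a numerical check, here is a minimal sketch (assuming NumPy and SciPy; not part of the original slides) that computes this interval from the data. It reproduces the observed interval quoted on the next slide.

```python
# Compute the observed 95% t confidence interval for mu from the hot dog data.
import numpy as np
from scipy import stats

calories = np.array([186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
                     152, 111, 141, 153, 190, 157, 131, 149, 135, 132])
n = len(calories)
xbar = calories.mean()                      # 156.85
sigma_hat = calories.std(ddof=1)            # 22.64201 (divisor n-1)

gamma = 0.95
c = stats.t.ppf((gamma + 1) / 2, df=n - 1)  # T^{-1}_{n-1}((gamma+1)/2)
half_width = c * sigma_hat / np.sqrt(n)
print(xbar - half_width, xbar + half_width)  # approx (146.25, 167.45)
```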

8.5 Confidence intervals

Interpretation of a confidence interval. Confidence intervals are a frequentist tool. We know that
$P\left(\bar{X}_n - T^{-1}_{n-1}\left(\frac{\gamma+1}{2}\right)\frac{\hat{\sigma}}{\sqrt{n}} < \mu < \bar{X}_n + T^{-1}_{n-1}\left(\frac{\gamma+1}{2}\right)\frac{\hat{\sigma}}{\sqrt{n}}\right) = \gamma.$
After observing the data we have a realized value of the random interval. For example: $(146.25, 167.45)$ is an observed 95% confidence interval for $\mu$. That does NOT mean that $P(146.25 < \mu < 167.45) = 0.95$. For this statement to make sense we need Bayesian thinking and Bayesian methods.

8.5 Confidence intervals

Interpretation of a confidence interval. Confidence intervals are a frequentist tool. One way of thinking of this: repeated samples.
- Take a random sample of size $n$ from $N(\mu, \sigma^2)$ and calculate the 95% confidence interval.
- Take another random sample (of the same size $n$) and do the same calculations.
- Repeat. Many times.
Since there is a 95% chance that each random interval covers the value of $\mu$, we expect 95% of the intervals to cover the actual value of $\mu$.
Problem: We never take more than one sample!

8.5 Confidence intervals

Properties of a confidence interval: simulation study.
- I simulated $n = 20$ random variables from $N(8, 2^2)$ and calculated the 95% CI.
- I repeated that 100 times.
- 4 of the 100 intervals do not cover $\mu = 8$ (the red intervals in the slide's plot).
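A minimal sketch reproducing this experiment (assuming NumPy and SciPy; the seed is illustrative, so the exact number of misses will vary around 5):

```python
# Repeat the slide's experiment: 100 samples of n=20 from N(8, 2^2),
# count how many 95% t intervals fail to cover mu = 8.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, mu, sigma, gamma, reps = 20, 8.0, 2.0, 0.95, 100
c = stats.t.ppf((gamma + 1) / 2, df=n - 1)

misses = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    half = c * x.std(ddof=1) / np.sqrt(n)
    if not (x.mean() - half < mu < x.mean() + half):
        misses += 1
print(misses, "of", reps, "intervals miss mu")   # around 5 on average
```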

8.5 Confidence intervals

Non-symmetric confidence intervals: mean of the normal distribution. More generally we want to find $P(c_1 < U < c_2) = \gamma$.

Symmetric confidence interval: equal probability on either side:
$P(U \leq c_1) = P(U \geq c_2) = \frac{1 - \gamma}{2}.$
Since the distribution of $U$ is symmetric around 0, the shortest possible confidence interval for $\mu$ is the symmetric confidence interval.

One-sided confidence interval: all the extra probability is on one side. That is, either $c_1 = -\infty$ or $c_2 = \infty$.

8.5 Confidence intervals

One-sided Confidence Interval. Def: Lower bound. Let $A$ be a statistic so that $P(A < g(\theta)) \geq \gamma$ for all $\theta$. The random interval $(A, \infty)$ is a one-sided $100\gamma\%$ confidence interval for $g(\theta)$, and $A$ is a $100\gamma\%$ lower confidence limit for $g(\theta)$.

8.5 Confidence intervals

One-sided Confidence Interval. Def: Upper bound. Let $B$ be a statistic so that $P(g(\theta) < B) \geq \gamma$ for all $\theta$. The random interval $(-\infty, B)$ is a one-sided $100\gamma\%$ confidence interval for $g(\theta)$, and $B$ is a $100\gamma\%$ upper confidence limit for $g(\theta)$.

8.5 Confidence intervals

One-sided Confidence Interval: Mean of a normal. Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$, both $\mu$ and $\sigma^2$ unknown.
- Find the one-sided $100\gamma\%$ confidence intervals for $\mu$.
- Find the observed 95% upper confidence limit for $\mu$ for the hotdog example (a numerical sketch follows below).
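A minimal numerical sketch for the second item (assuming SciPy; not part of the original slides): the upper limit replaces the two-sided quantile with a one-sided one, $\bar{X}_n + T^{-1}_{n-1}(\gamma)\,\hat{\sigma}/\sqrt{n}$.

```python
# Observed 95% upper confidence limit for mu, hot dog data (one-sided t interval).
import numpy as np
from scipy import stats

n, xbar, sigma_hat, gamma = 20, 156.85, 22.64201, 0.95
upper = xbar + stats.t.ppf(gamma, df=n - 1) * sigma_hat / np.sqrt(n)
print(upper)   # roughly 165.6; (-inf, upper) covers mu with 95% confidence
```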

8.5 Confidence intervals

Confidence intervals for other distributions. Def: Pivotal quantity. Let $X = (X_1, \dots, X_n)$ be a random sample from a distribution that depends on a parameter $\theta$. Let $V(X, \theta)$ be a random variable whose distribution is the same for all $\theta$. Then $V$ is called a pivotal quantity.

To use this we need to be able to invert the pivotal relationship: find a function $r(v, x)$ so that $r(V(X, \theta), X) = g(\theta)$. If the function $r$ is increasing in $v$ for every $x$, $V$ has a continuous distribution with cdf $F(v)$, and $\gamma_2 - \gamma_1 = \gamma$, then
$A = r\left(F^{-1}(\gamma_1), X\right) \quad \text{and} \quad B = r\left(F^{-1}(\gamma_2), X\right)$
are the endpoints of an exact $100\gamma\%$ confidence interval (Theorem 8.5.3).

8.5 Confidence intervals

Confidence intervals using pivotal quantities.

Example: the rate parameter $\theta$ of the exponential distribution. $X_1, \dots, X_n$ i.i.d. Expo$(\theta)$.
- Find the $100\gamma\%$ upper confidence limit for $\theta$.
- Find a symmetric $100\gamma\%$ confidence interval for $\theta$.

Example: variance of the normal distribution. $X_1, \dots, X_n$ i.i.d. $N(\mu, \sigma^2)$, both unknown.
- Find a symmetric $100\gamma\%$ confidence interval for $\sigma^2$.
- Find the observed symmetric $100\gamma\%$ confidence interval for $\sigma^2$ for the hotdog example (see the sketch below).
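For the normal-variance case, $S_n/\sigma^2 \sim \chi^2_{n-1}$ is the pivot, so the symmetric interval is $\left(S_n/\chi^2_{n-1,(1+\gamma)/2},\; S_n/\chi^2_{n-1,(1-\gamma)/2}\right)$. A minimal sketch for the hot dog data (assuming SciPy; not part of the original slides):

```python
# Observed symmetric 95% CI for sigma^2 via the chi-square pivot S_n / sigma^2.
import numpy as np
from scipy import stats

calories = np.array([186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
                     152, 111, 141, 153, 190, 157, 131, 149, 135, 132])
n = len(calories)
S_n = ((calories - calories.mean()) ** 2).sum()   # sum of squared deviations
gamma = 0.95

lower = S_n / stats.chi2.ppf((1 + gamma) / 2, df=n - 1)
upper = S_n / stats.chi2.ppf((1 - gamma) / 2, df=n - 1)
print(lower, upper)   # roughly (296, 1094) for sigma^2
```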

8.5 Confidence intervals

Problems with the interpretation of a confidence interval. Example 8.5.11 is an interesting example. Say $X_1, X_2$ are i.i.d. Uniform$(\theta - 0.5, \theta + 0.5)$. Let $Y_1 = \min(X_1, X_2)$ and $Y_2 = \max(X_1, X_2)$. Then $(Y_1, Y_2)$ is a 50% confidence interval for $\theta$.

However: if we observe $Y_1$ and $Y_2$ that are more than 0.5 apart, that is $y_2 - y_1 > 0.5$, then we know with certainty that $(y_1, y_2)$ contains $\theta$! Yet we only assign 50% confidence to that interval, which ignores information we have. (A simulation sketch follows below.)
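A minimal simulation (illustrative, assuming NumPy) confirming both claims: overall coverage is 1/2, yet coverage is 100% among samples with $y_2 - y_1 > 0.5$.

```python
# Uniform(theta-0.5, theta+0.5) example: (Y1, Y2) covers theta iff the two draws
# fall on opposite sides of theta, which happens with probability 1/2.
import numpy as np

rng = np.random.default_rng(2)
theta, reps = 3.0, 200_000
x = rng.uniform(theta - 0.5, theta + 0.5, size=(reps, 2))
y1, y2 = x.min(axis=1), x.max(axis=1)

covers = (y1 < theta) & (theta < y2)
wide = (y2 - y1) > 0.5
print(covers.mean())          # about 0.5 overall
print(covers[wide].mean())    # exactly 1.0 among the wide intervals
```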


8.7 Unbiased Estimators

Suppose that we use an estimator $\delta(X)$ to estimate the parameter $g(\theta)$. Properties of an estimator (so far): consistency and sufficiency. Another property of an estimator: unbiasedness.

Def: Unbiased Estimator / Bias. An estimator $\delta(X)$ is an unbiased estimator of $g(\theta)$ if $E(\delta(X)) = g(\theta)$ for all $\theta$. Otherwise it is called a biased estimator. The bias is defined as $E(\delta(X)) - g(\theta)$.

8.7 Unbiased Estimators

Examples. $X_1, \dots, X_n$ i.i.d. $N(\mu, \sigma^2)$: $\bar{X}_n$ is an unbiased estimator of $\mu$ since $E(\bar{X}_n) = \mu$ for all $\mu$.

Unbiased estimators of the mean and variance of any distribution: let $X_1, \dots, X_n$ be a random sample from $f(x \mid \theta)$. The mean and variance of the distribution (if they exist) are functions of $\theta$.
- $\bar{X}_n$ is an unbiased estimator of the mean $E(X_1)$.
- Theorem 8.7.1: if the variance is finite, then $\hat{\sigma}_1^2$ is an unbiased estimator of Var$(X)$, where
$\hat{\sigma}_1^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2.$

Note: this means that the MLE of $\sigma^2$ in $N(\mu, \sigma^2)$ is a biased estimator. (A simulation sketch follows below.)
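A minimal simulation (assuming NumPy; the exponential choice is illustrative, any distribution with finite variance works) showing that the $n-1$ divisor is unbiased while the $n$ divisor systematically underestimates:

```python
# Compare E[variance estimates]: divisor n-1 (unbiased) vs divisor n (biased low).
import numpy as np

rng = np.random.default_rng(3)
n, reps = 10, 200_000
x = rng.exponential(scale=1.0, size=(reps, n))   # true variance = 1

s2_unbiased = x.var(axis=1, ddof=1)   # divides by n-1
s2_biased = x.var(axis=1, ddof=0)     # divides by n
print(s2_unbiased.mean())             # close to 1.0
print(s2_biased.mean())               # close to (n-1)/n = 0.9
```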

8.7 Unbiased Estimators

Mean squared error (MSE). Is unbiasedness good enough? It is useless if the estimator has high variance, so we look for unbiased estimators with the lowest variance.

Mean squared error: $E\left((\delta(X) - g(\theta))^2\right)$. We want estimators with small MSE.

Corollary 8.7.1: Let $\delta(X)$ be an estimator with finite variance. Then
$\text{MSE}(\delta(X)) = \text{Var}(\delta(X)) + \text{bias}(\delta(X))^2,$
so the MSE of an unbiased estimator is equal to its variance. Searching for an unbiased estimator with small variance is therefore equivalent to searching for an unbiased estimator with small MSE.
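The decomposition is a one-line computation the slide skips; writing $m(\theta) = E(\delta(X))$, a sketch:

```latex
\begin{align*}
E\big[(\delta(X) - g(\theta))^2\big]
  &= E\big[(\delta(X) - m(\theta))^2\big]
   + 2\big(m(\theta) - g(\theta)\big)\,\underbrace{E\big[\delta(X) - m(\theta)\big]}_{=\,0}
   + \big(m(\theta) - g(\theta)\big)^2 \\
  &= \operatorname{Var}(\delta(X)) + \operatorname{bias}(\delta(X))^2 ,
\end{align*}
```

since the cross term vanishes.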

8.7 Unbiased Estimators

Example: Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$ (both $\mu$ and $\sigma^2$ unknown). Consider two estimators of $\sigma^2$:
- $\delta_1 = S_n/n$ (the MLE of $\sigma^2$)
- $\delta_2 = \hat{\sigma}_1^2$ (unbiased)

Find the MSE of each estimator. Which estimator has smaller MSE? Which estimator do you prefer? (A simulation sketch follows below.)
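A minimal simulation comparing the two (assuming NumPy; the parameter values are illustrative). Despite its bias, the MLE comes out with the smaller MSE here:

```python
# MSE of sigma^2 estimators in N(mu, sigma^2): MLE S_n/n vs unbiased S_n/(n-1).
import numpy as np

rng = np.random.default_rng(4)
n, mu, sigma2, reps = 20, 0.0, 4.0, 200_000
x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))

d1 = x.var(axis=1, ddof=0)   # MLE, divisor n
d2 = x.var(axis=1, ddof=1)   # unbiased, divisor n-1
print(((d1 - sigma2) ** 2).mean())   # smaller MSE despite the bias
print(((d2 - sigma2) ** 2).mean())
```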

8.7 Unbiased Estimators

Why unbiased? It sounds good: who wants to be biased? However, the variance and the MSE are better measures of the quality of an estimator, and in many cases there exist biased estimators with smaller MSE.

8.8 Fisher Information

Let the pdf of $X$ be $f(x \mid \theta)$. The Fisher information $I(\theta)$ in the random variable $X$ is defined as
$I(\theta) = E\left\{\left[\frac{d \log f(X \mid \theta)}{d\theta}\right]^2\right\}.$
Under mild conditions, we have (Theorem 8.8.1)
$I(\theta) = \text{Var}\left[\frac{d \log f(X \mid \theta)}{d\theta}\right] = -E\left[\frac{d^2 \log f(X \mid \theta)}{d\theta^2}\right].$
For a random sample $X_1, \dots, X_n$, the Fisher information $I_n(\theta)$ satisfies $I_n(\theta) = nI(\theta)$.
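As a quick illustration (a standard computation, not from the slides): for $X \sim$ Bernoulli$(\theta)$,

```latex
\frac{d \log f(x \mid \theta)}{d\theta}
  = \frac{x}{\theta} - \frac{1-x}{1-\theta}
  = \frac{x - \theta}{\theta(1-\theta)},
\qquad
I(\theta) = \operatorname{Var}\!\left[\frac{X - \theta}{\theta(1-\theta)}\right]
  = \frac{\theta(1-\theta)}{\theta^2(1-\theta)^2}
  = \frac{1}{\theta(1-\theta)} .
```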

8.8 Fisher Information

Cramér-Rao Inequality. Let $X_1, \dots, X_n$ be a random sample from a distribution for which the pdf is $f(x \mid \theta)$. For any statistic $T$, let $m(\theta) = E(T)$. Then, under mild conditions, we have
$\text{Var}(T) \geq \frac{[m'(\theta)]^2}{nI(\theta)}.$
(Corollary 8.8.1) If $T$ is an unbiased estimator of $\theta$, then
$\text{Var}(T) \geq \frac{1}{nI(\theta)}.$
An estimator is called an efficient estimator of its expectation if it achieves the lower bound in the Cramér-Rao inequality.

Example: $X_1, \dots, X_n$ is a random sample from Poisson$(\theta)$. Show that the MLE is an efficient estimator of $\theta$. (A worked sketch follows below.)
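A sketch of the computation (standard, not verbatim from the slides): the MLE is $\bar{X}_n$, and its variance meets the bound exactly.

```latex
\log f(x \mid \theta) = -\theta + x\log\theta - \log x! ,
\qquad
\frac{d^2}{d\theta^2}\log f(x \mid \theta) = -\frac{x}{\theta^2},
\]
\[
I(\theta) = -E\!\left[-\frac{X}{\theta^2}\right] = \frac{\theta}{\theta^2} = \frac{1}{\theta},
\qquad
\operatorname{Var}(\hat{\theta}_n) = \operatorname{Var}(\bar{X}_n) = \frac{\theta}{n} = \frac{1}{nI(\theta)} ,
```

so $\bar{X}_n$ attains the Cramér-Rao lower bound and is efficient.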

8.8 Fisher Information

Asymptotic distribution of the MLE (Theorem 8.8.5): Let $\hat{\theta}_n$ be the MLE of $\theta$. Then, under mild conditions, we have
$[nI(\theta)]^{1/2}(\hat{\theta}_n - \theta) \xrightarrow{d} N(0, 1).$
The MLE is asymptotically efficient.
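A minimal check (assuming NumPy; the Poisson setting ties back to the example above, and the parameters are illustrative) that the standardized MLE looks standard normal for moderate $n$:

```python
# Standardize the Poisson MLE with sqrt(n I(theta)) = sqrt(n / theta) and
# check that its mean and standard deviation match N(0, 1).
import numpy as np

rng = np.random.default_rng(5)
theta, n, reps = 3.0, 200, 100_000
x = rng.poisson(theta, size=(reps, n))
mle = x.mean(axis=1)                       # MLE of theta is the sample mean
z = np.sqrt(n / theta) * (mle - theta)     # [n I(theta)]^{1/2} (theta_hat - theta)
print(z.mean(), z.std())                   # approximately 0 and 1
```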


8.6 Bayesian Analysis of Samples from a Normal Distribution

Bayesian alternative to confidence intervals. Bayesian inference is based on the posterior distribution. Reporting a whole distribution may not be what you (or your client) want.
- Point estimates: Bayesian estimators minimize the expected loss.
- Interval estimates: simply use quantiles of the posterior distribution. For example, we can find constants $c_1$ and $c_2$ so that
$P(c_1 < \theta < c_2 \mid X = x) \geq \gamma.$
The interval $(c_1, c_2)$ is called a $100\gamma\%$ credible interval for $\theta$.

Note: the interpretation is very different from the interpretation of confidence intervals.

8.6 Bayesian Analysis of Samples from a Normal Distribution

Example: the normal distribution. Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$. In Section 7.3 we saw: if $\sigma^2$ is known, the normal distribution is a conjugate prior for $\mu$.

Theorem 7.3.3: If the prior is $\mu \sim N(\mu_0, \nu_0^2)$, the posterior of $\mu$ is also normal, with mean and variance
$\mu_1 = \frac{\sigma^2\mu_0 + n\nu_0^2\,\bar{x}_n}{\sigma^2 + n\nu_0^2} \quad \text{and} \quad \nu_1^2 = \frac{\sigma^2\nu_0^2}{\sigma^2 + n\nu_0^2}.$
We can obtain credible intervals for $\mu$ from this $N(\mu_1, \nu_1^2)$ posterior distribution. (A sketch follows below.)
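A minimal sketch of the posterior update and an equal-tailed credible interval (assuming SciPy; the prior parameters and the "known" $\sigma^2$ below are illustrative choices, not from the slides):

```python
# Conjugate normal update for mu with sigma^2 known, plus a 95% credible interval
# from the N(mu1, nu1^2) posterior.
import numpy as np
from scipy import stats

def posterior_mu(xbar, n, sigma2, mu0, nu0_sq):
    """Return (mu1, nu1_sq) from Theorem 7.3.3."""
    denom = sigma2 + n * nu0_sq
    mu1 = (sigma2 * mu0 + n * nu0_sq * xbar) / denom
    nu1_sq = sigma2 * nu0_sq / denom
    return mu1, nu1_sq

# Illustrative numbers: prior N(150, 10^2), sigma^2 treated as known, hot dog mean.
mu1, nu1_sq = posterior_mu(xbar=156.85, n=20, sigma2=22.64**2,
                           mu0=150.0, nu0_sq=100.0)
lo, hi = stats.norm.ppf([0.025, 0.975], loc=mu1, scale=np.sqrt(nu1_sq))
print(mu1, (lo, hi))   # P(lo < mu < hi | data) = 0.95, a statement a CI cannot make
```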

8.6 Bayesian Analysis of Samples from a Normal Distribution

Example: the normal distribution. What if both $\mu$ and $\sigma^2$ are unknown?
- Use the joint distribution of $\mu$ and $\sigma^2$ as the prior.
- Conjugate priors are available: the Normal-Inverse Gamma distribution.
- To give credible intervals for $\mu$ and $\sigma^2$ individually we need the marginal posterior distributions.

8.6 Bayesian Analysis of Samples from a Normal Distribution

END OF CHAPTER 8