Chapter 7: Point Estimation and Sampling Distributions


Chapter 7: Point Estimation and Sampling Distributions
Seungchul Baek, Department of Statistics, University of South Carolina
STAT 509: Statistics for Engineers

Motivation
In Chapter 3, we studied discrete probability distributions, including the Bernoulli, binomial, geometric, negative binomial, hypergeometric, and Poisson. In Chapter 4, we studied continuous probability distributions, including the exponential, Weibull, and normal. In Chapters 3 and 4, we always assumed that we knew the parameters of the distribution. For example, if we know the mean µ and variance σ² of a normally distributed random variable, we can calculate any probability involving it.

Motivation (cont'd)
For example, suppose we know that the height of an 18-year-old US male follows N(µ = 176.4, σ² = 9), in centimeters. Let Y = the height of one 18-year-old US male. We can calculate P(Y > 180) ≈ 0.115 in R via 1 - pnorm(180, 176.4, 3) (note that the third argument of pnorm is the standard deviation σ = 3, not the variance). However, in reality we typically do NOT know the population mean µ and population variance σ². What should we do? Statistical inference deals with making (probabilistic) statements about a population of individuals based on information contained in a sample taken from the population.
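The slides use R; as a sketch, the same upper-tail probability can be computed with only the Python standard library, using the identity 1 − Φ(z) = erfc(z/√2)/2 for the standard normal CDF:

```python
import math

# P(Y > 180) for Y ~ N(mu = 176.4, sigma^2 = 9), via the standard normal CDF
mu, sigma = 176.4, 3.0
z = (180 - mu) / sigma                 # standardize: z = 1.2
p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability 1 - Phi(z)
print(round(p, 3))                     # 0.115
```

This matches the R call 1 - pnorm(180, 176.4, 3) because both parameterize the normal by its standard deviation.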

Terminology: population/sample
A population refers to the entire group of individuals (e.g., people, parts, batteries) about which we would like to make a statement (e.g., height probability, median weight, defective proportion, mean lifetime).
Problem: the entire population generally cannot be measured.
Solution: we observe a sample of individuals from the population and use it to make a decision, i.e., to perform statistical inference.
We denote a random sample of observations by Y_1, Y_2, ..., Y_n, where n is the sample size. We write y_1, y_2, ..., y_n for one realization of the sample.

Terminology: random sample
The random variables Y_1, ..., Y_n are a random sample of size n if (i) the Y_i's are independent random variables and (ii) every Y_i has the same probability distribution. We then say the Y_i's are i.i.d. (independent and identically distributed).

Example
BATTERY DATA: Consider the following random sample of n = 50 battery lifetimes y_1, y_2, ..., y_50 (measured in hours):

4285 2066 2584 1009  318 1429  981 1402 1137  414
 564  604   14 4152  737  852 1560 1786  520  396
1278  209  349  478 3032 1461  701 1406  261   83
 205  602 3770  726 3894 2662  497   35 2778 1379
3920 1379   99  510  582  308 3367   99  373  454

A histogram of the battery lifetime data (figure not reproduced).

Cont'd on battery lifetime data
The (empirical) distribution of the battery lifetimes is skewed to the right. Which continuous probability distribution displays the same type of pattern that we see in the histogram? An exponential(λ) model seems reasonable here (based on the histogram shape). What is λ? In this example, λ is called a (population) parameter, and it is generally unknown. It describes the theoretical distribution used to model the entire population of battery lifetimes.
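The slides leave "What is λ?" open; as a preview of the estimation ideas that follow (a sketch, not part of the slides), assume the rate parameterization with E(Y) = 1/λ. Then matching the population mean to the sample mean gives the point estimate λ̂ = 1/ȳ, computed here from the 50 lifetimes above:

```python
# Estimate lambda for an exponential(lambda) model with mean 1/lambda,
# using the n = 50 battery lifetimes (in hours) from the slides.
lifetimes = [
    4285, 2066, 2584, 1009,  318, 1429,  981, 1402, 1137,  414,
     564,  604,   14, 4152,  737,  852, 1560, 1786,  520,  396,
    1278,  209,  349,  478, 3032, 1461,  701, 1406,  261,   83,
     205,  602, 3770,  726, 3894, 2662,  497,   35, 2778, 1379,
    3920, 1379,   99,  510,  582,  308, 3367,   99,  373,  454,
]
ybar = sum(lifetimes) / len(lifetimes)  # sample mean: 1274.14 hours
lam_hat = 1 / ybar                      # point estimate of lambda (per hour)
print(ybar, round(lam_hat, 6))          # 1274.14 0.000785
```

For the exponential distribution this mean-matching estimate coincides with the maximum likelihood estimate, though the slides do not develop that here.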

Terminology: parameter
A parameter is a numerical summary that describes a population. In general, population parameters are unknown. Some very common examples:
- µ: population mean
- σ²: population variance
- σ: population standard deviation
- p: population proportion
Connection: all of the probability distributions we discussed in previous chapters are indexed by population parameters.

Terminology: statistic
A statistic is a numerical summary that can be calculated from a sample. Suppose Y_1, Y_2, ..., Y_n is a random sample from a population. Some very common examples:
- sample mean: Ȳ = (1/n) Σ_{i=1}^n Y_i
- sample variance: s² = (1/(n−1)) Σ_{i=1}^n (Y_i − Ȳ)²
- sample standard deviation: s = √(s²)
- sample proportion: p̂ = (1/n) Σ_{i=1}^n Y_i, when the Y_i's are binary (0/1).

Back to battery lifetime data
With the battery lifetime data (a random sample of n = 50 lifetimes):
ȳ = 1274.14 hours, s² = 1,505,156 hours², s ≈ 1226.85 hours
R code:
> mean(battery)  ## sample mean
[1] 1274.14
> var(battery)   ## sample variance
[1] 1505156
> sd(battery)    ## sample standard deviation
[1] 1226.848

Parameters and Statistics Cont'd
SUMMARY: The table below succinctly summarizes the differences between a sample and a population (a statistic and a parameter):

                Statistics        Parameters
describes       a sample          a population
known?          always known      usually unknown
varies?         random            fixed
examples        Ȳ, s², s          µ, σ², σ

Statistical Inference
Statistical inference deals with making (probabilistic) statements about a population of individuals based on information contained in a sample taken from the population. We do this by:
- estimating unknown population parameters with sample statistics;
- quantifying the uncertainty (variability) that arises in the estimation process.

Point estimators and sampling distributions
Let θ denote a population parameter. A point estimator θ̂ is a statistic that is used to estimate the population parameter θ. Common examples of point estimators:
- θ̂ = Ȳ, a point estimator for θ = µ
- θ̂ = s², a point estimator for θ = σ²
- θ̂ = s, a point estimator for θ = σ
Remark: since θ̂ is a statistic, its value will vary from sample to sample. Why? Because a statistic is a function of random variables.

Terminology: sampling distribution
The probability distribution of a statistic is called its sampling distribution. A sampling distribution describes mathematically how the statistic would vary in repeated sampling. What makes an estimator good? And good in what sense?

Evaluate an estimator
Accuracy: We say that θ̂ is an unbiased estimator of θ if and only if E(θ̂) = θ.
Note: if the estimator is not unbiased, the difference E(θ̂) − θ is called the bias of the estimator θ̂.
RESULT: When Y_1, ..., Y_n is a random sample, E(Ȳ) = µ and E(s²) = σ².
Precision: Suppose that θ̂_1 and θ̂_2 are both unbiased estimators of θ. We prefer the estimator with the smaller variance, since it is more likely to produce an estimate close to the true value θ.
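A quick simulation sketch (not from the slides) illustrating the unbiasedness results: across many repeated samples, the average of Ȳ lands near µ and the average of s² (with the n − 1 divisor) lands near σ², while dividing by n instead systematically underestimates σ²:

```python
import random

random.seed(509)
mu, sigma, n, reps = 10.0, 2.0, 5, 20000

ybars, s2s, s2ns = [], [], []
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    ybars.append(ybar)
    s2s.append(ss / (n - 1))  # unbiased sample variance
    s2ns.append(ss / n)       # biased alternative (divides by n)

print(sum(ybars) / reps)  # close to mu = 10
print(sum(s2s) / reps)    # close to sigma^2 = 4
print(sum(s2ns) / reps)   # close to (n-1)/n * sigma^2 = 3.2
```

The n-divisor version has expectation (n − 1)σ²/n, which is why the n − 1 divisor is used in the definition of s².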

Evaluate an estimator: cont'd
SUMMARY: We desire point estimators θ̂ that are unbiased (accurate) and have small variance (precise).
TERMINOLOGY: The standard error of a point estimator θ̂ is se(θ̂) = √(var(θ̂)).
Note: the smaller se(θ̂) is, the more precise θ̂ is.

Evaluate an estimator: cont'd
Which estimator is better? Why? (Figure comparing the sampling distributions of two estimators; not reproduced.)
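The slide's figure is not reproduced; as an illustrative stand-in (an assumed example, not necessarily the slides' own), here is a classic comparison of two unbiased estimators of µ for normal data, the sample mean and the sample median. Both center on µ, but the sample mean has the smaller standard error, so it is the more precise estimator:

```python
import random
import statistics

random.seed(42)
mu, sigma, n, reps = 0.0, 1.0, 15, 5000

means, medians = [], []
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(y) / n)              # sample mean of one sample
    medians.append(statistics.median(y))  # sample median of one sample

# Both estimators center on mu = 0, but the mean has the smaller spread.
print(statistics.mean(means), statistics.stdev(means))
print(statistics.mean(medians), statistics.stdev(medians))
```

For normal populations the median's variance is roughly πσ²/(2n), about 57% larger than the mean's σ²/n, which the simulated standard deviations reflect.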

Central Limit Theorem
THE MOST IMPORTANT THEOREM IN STATISTICS!
Central Limit Theorem: Suppose that Y_1, Y_2, ..., Y_n is a random sample from a population distribution with mean µ and variance σ². When the sample size n is large,
Ȳ ~ AN(µ, σ²/n),
where AN is read "asymptotically normal". The result holds in a limiting sense, i.e., as n → ∞.

Simulation Study of CLT Cont'd (simulation figures not reproduced)
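The slides' simulation figures are not reproduced here; as a minimal sketch of such a study (assumed setup: exponential lifetimes echoing the battery example, so µ = σ = 1274.14), the distribution of Ȳ over repeated samples of size n = 50 behaves like AN(µ, σ²/n):

```python
import random
import statistics

random.seed(1)
rate = 1 / 1274.14   # exponential rate, so mu = sigma = 1274.14 hours
mu = sigma = 1 / rate
n, reps = 50, 10000

# Draw `reps` samples of size n and record each sample mean.
ybars = [statistics.mean(random.expovariate(rate) for _ in range(n))
         for _ in range(reps)]

# CLT predictions: E(Ybar) = mu, SD(Ybar) = sigma / sqrt(n) ~= 180.2
print(statistics.mean(ybars))   # near 1274.14
print(statistics.stdev(ybars))  # near 180.2
```

Even though the exponential population is strongly right-skewed, a histogram of these 10,000 sample means would look approximately normal, which is exactly what the CLT asserts for large n.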