Section The Sampling Distribution of a Sample Mean

Section 5.2 - The Sampling Distribution of a Sample Mean
Statistics 104, Autumn 2004
Copyright © 2004 by Mark E. Irwin

The Sampling Distribution of a Sample Mean

Example: Quality control check of light bulbs

Sample $n$ light bulbs and look at the average failure time $\bar{X}$. Take another sample, and another, and so on.

What is the mean of the sampling distribution of $\bar{X}$? Its variance? Its standard deviation?

What is the probability that $\bar{X}$ falls on a given side of $\mu_x$, or within $c$ of it, $P[\mu_x - c \le \bar{X} \le \mu_x + c]$?

We will consider the situation where the observations are independent (at least approximately). In the case of finite populations, we want $N \gg n$. Under this assumption, we can use the same idea as we did to get the mean and variance of a binomial distribution.

$$\bar{X} = \frac{1}{n}(X_1 + X_2 + \dots + X_n)$$

$$\mu_{\bar{x}} = \frac{1}{n}\underbrace{(\mu_x + \mu_x + \dots + \mu_x)}_{n \text{ times}} = \mu_x$$

$$\sigma^2_{\bar{x}} = \frac{1}{n^2}\underbrace{(\sigma_x^2 + \sigma_x^2 + \dots + \sigma_x^2)}_{n \text{ times}} = \frac{\sigma_x^2}{n}$$

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}$$

So the sampling distribution of $\bar{X}$ is centered at the same place as the observations, but is less spread out, as we should expect based on the law school example.

The smaller spread also agrees with the law of large numbers, which says $\bar{X} \to \mu_x$ as $n$ increases.
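The two formulas above can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes an Exponential population with rate 1 (so the population mean and standard deviation are both 1), and the function names, seed, and number of repetitions are arbitrary choices.

```python
import math
import random
import statistics


def sample_means(n, reps=5000, seed=0):
    """Return `reps` sample means, each computed from n Exponential(rate=1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def compare_with_theory(n, mu_x=1.0, sigma_x=1.0):
    """Simulated mean and SD of the sample mean next to mu_x and sigma_x / sqrt(n).

    The defaults mu_x = sigma_x = 1 match the assumed Exponential(1) population.
    """
    means = sample_means(n)
    return {
        "sim_mean": statistics.fmean(means),
        "sim_sd": statistics.pstdev(means),
        "theory_mean": mu_x,
        "theory_sd": sigma_x / math.sqrt(n),
    }


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, compare_with_theory(n))
```

The simulated mean should stay near the population mean for every n, while the simulated standard deviation should shrink roughly like one over the square root of n.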

Note that the sampling distribution of $\hat{p}$ is just a special case: $\hat{p}$ is just an average of $n$ 0's or 1's. The formulas have the same form:

$$\mu_{\hat{p}} = p$$

$$\sigma^2_{\hat{p}} = \frac{p(1-p)}{n}$$

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
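To see the "average of 0's and 1's" view concretely, here is a small sketch. The values n = 40 and p = 0.3, the seed, and the function names are illustrative assumptions, not taken from the slides.

```python
import math
import random
import statistics


def phat_samples(n, p, reps=5000, seed=1):
    """Simulate `reps` sample proportions, each the mean of n 0/1 outcomes."""
    rng = random.Random(seed)
    return [
        sum(1 if rng.random() < p else 0 for _ in range(n)) / n
        for _ in range(reps)
    ]


def phat_theory(n, p):
    """Mean and SD of p-hat from the sample-mean formulas.

    A 0/1 population has mean p and variance p * (1 - p).
    """
    return p, math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    n, p = 40, 0.3  # hypothetical values
    sims = phat_samples(n, p)
    print("simulated:", statistics.fmean(sims), statistics.pstdev(sims))
    print("theory:   ", *phat_theory(n, p))
```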

Let's suppose that the lifetimes of the light bulbs have a gamma distribution with $\mu_x = 2$ years and $\sigma_x = 1$ year. Sample $n$ bulbs and calculate the sample average $\bar{X}$. Then

$$\mu_{\bar{x}} = \mu_x = 2, \qquad \sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{1}{\sqrt{n}}$$

Let's see how the sampling distribution changes as $n$ increases along the sequence $n = 2, 10, 50, 100$.

We will examine this two ways: (1) the exact sampling distribution (which happens to be also a gamma distribution with the appropriate mean and standard deviation); (2) a Monte Carlo experiment with 10,000 samples of $\bar{X}$ for each $n$.

In the figures, the blue line is the exact sampling distribution.
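A rough sketch of the Monte Carlo experiment follows. It relies on two standard facts rather than anything stated in the slides: a gamma distribution with mean 2 and standard deviation 1 has shape 4 and scale 0.5, and the mean of n independent draws from it is again gamma, with shape 4n and scale 0.5/n. The seed and function names are arbitrary.

```python
import math
import random
import statistics

MU, SIGMA = 2.0, 1.0
SHAPE = (MU / SIGMA) ** 2   # 4.0, since mean = shape * scale
SCALE = SIGMA ** 2 / MU     # 0.5, since variance = shape * scale**2


def mean_distribution_params(n):
    """Shape and scale of the gamma distribution of the mean of n lifetimes."""
    return n * SHAPE, SCALE / n


def monte_carlo_means(n, reps=10000, seed=2004):
    """Simulate `reps` sample means of n gamma-distributed lifetimes."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gammavariate(SHAPE, SCALE) for _ in range(n))
        for _ in range(reps)
    ]


if __name__ == "__main__":
    for n in (2, 10, 50, 100):
        shape, scale = mean_distribution_params(n)
        means = monte_carlo_means(n)
        print(
            f"n={n:3d} exact mean={shape * scale:.3f} "
            f"exact sd={math.sqrt(shape) * scale:.3f} "
            f"MC mean={statistics.fmean(means):.3f} "
            f"MC sd={statistics.pstdev(means):.3f}"
        )
```

Plotting a histogram of `monte_carlo_means(n)` against the gamma density with the parameters from `mean_distribution_params(n)` would reproduce the kind of comparison the slides describe.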

[Figure: Population distribution. Two panels: the population density and a histogram of 10000 samples. Horizontal axis: Lifetime.]

[Figure: Sampling distribution, n = 2. Two panels: the exact sampling distribution and a histogram of 10000 samples. Horizontal axis: Lifetime.]

[Figure: Sampling distribution, n = 10. Two panels: the exact sampling distribution and a histogram of 10000 samples. Horizontal axis: Lifetime.]

[Figure: Sampling distribution, n = 50. Two panels: the exact sampling distribution and a histogram of 10000 samples. Horizontal axis: Lifetime.]

[Figure: Sampling distribution, n = 100. Two panels: the exact sampling distribution and a histogram of 10000 samples. Horizontal axis: Lifetime.]

As the sample size $n$ increases, the sampling distribution of $\bar{X}$ approaches a normal distribution.

[Figure: Sampling distribution for n = 2, 10, 50, and 100, each panel comparing the true density with the normal approximation. Horizontal axis: Mean Lifetime.]

Central Limit Theorem

Assume that $X_1, X_2, \dots, X_n$ are independent and identically distributed with mean $\mu_x$ and standard deviation $\sigma_x$. Then for large sample sizes $n$, the distribution of $\bar{X}$ is approximately $N(\mu_x, \sigma_x/\sqrt{n})$.

Note that the normal approximation for the distribution of $\hat{p}$ is just a special case of the Central Limit Theorem (CLT).

What is a large sample size?

As we saw with the binomial distribution, how well the normal approximation for $\hat{p}$ works depends on $p$, which influences the skewness of the population distribution. The same idea holds for the CLT in general. If the observations are normal, $\bar{X}$ has precisely a normal distribution for any $n$, since, as we've discussed before, sums of normals are normal. However, the farther the population density is from a normal shape, the bigger $n$ needs to be for the approximation to do a good job.

[Figure: Sampling distributions of the mean of Exponential samples for n = 1, 2, 5, 10, 50, and 100.]

[Figure: Sampling distributions of the mean of Gamma(5,1) samples for n = 1, 2, 5, 10, 50, and 100.]

The sampling distributions based on observations from the Gamma(5,1) distribution ($\mu_x = 5$, $\sigma_x = \sqrt{5}$) look more normal than the sampling distributions based on observations from the Exponential distribution ($\mu_x = 1$, $\sigma_x = 1$).

For every sample size, the distribution of $\bar{X}$ is more normal for the Gamma distribution than for the Exponential distribution.
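One way to quantify "more normal" is skewness, which is 0 for a normal distribution. That choice of measure is an assumption of this sketch, not something the slides use. It relies on the standard results that a gamma distribution with shape k has skewness 2/sqrt(k), that the Exponential distribution is the gamma with shape 1, and that the mean of n independent draws has skewness reduced by a factor of sqrt(n).

```python
import math


def skewness_of_mean(shape, n):
    """Skewness of the mean of n independent gamma draws with the given shape.

    A single draw has skewness 2 / sqrt(shape); averaging n of them
    divides that by sqrt(n). The scale parameter does not affect skewness.
    """
    return 2.0 / math.sqrt(n * shape)


if __name__ == "__main__":
    print(" n   Exponential   Gamma(5,1)")
    for n in (1, 2, 5, 10, 50, 100):
        print(f"{n:3d}   {skewness_of_mean(1, n):10.3f}   {skewness_of_mean(5, n):10.3f}")
```

Under this measure the Gamma(5,1) column is smaller than the Exponential column at every n, and the mean of 5 exponential draws is exactly as skewed as a single Gamma(5,1) draw.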

[Figure: two panels labeled "Smallest n" and "Largest n".]

The Central Limit Theorem allows us to make approximate probability statements about $\bar{X}$ using a normal distribution, even though $\bar{X}$ is not normally distributed.

So for the light bulb example with $n = 50$, $P[\bar{X} \le 1.9]$ can be approximated by the normal distribution:

$$P[\bar{X} \le 1.9] = P\left[\frac{\bar{X} - 2}{1/\sqrt{50}} \le \frac{1.9 - 2}{1/\sqrt{50}}\right] = P[Z \le -0.707] \approx 0.2398$$

The true probability is 0.2433.

[Figure: Two panels for n = 50. Normal approximation, Prob = 0.2398 (approximate); exact distribution, Prob = 0.2433. Horizontal axis: Mean Lifetime.]
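Both probabilities in this example can be recomputed with the standard library alone. The sketch below assumes the population is gamma with shape 4 and scale 0.5 (the parameters implied by a mean of 2 and a standard deviation of 1), so that the mean of 50 lifetimes is gamma with shape 200 and scale 0.01. For an integer shape, the gamma CDF can be written as one minus a Poisson sum, which is evaluated on the log scale to avoid overflow. The function names are illustrative.

```python
import math
from statistics import NormalDist


def normal_approx(c, mu=2.0, sigma=1.0, n=50):
    """CLT approximation to P[sample mean <= c]."""
    return NormalDist(mu, sigma / math.sqrt(n)).cdf(c)


def erlang_cdf(x, shape, scale):
    """P[G <= x] for a gamma variable G with integer shape.

    Uses P[G <= x] = 1 - sum_{i < shape} exp(-lam) * lam**i / i!, lam = x / scale.
    """
    lam = x / scale
    upper_tail = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(shape)
    )
    return 1.0 - upper_tail


def exact_prob(c, n=50):
    """Exact P[sample mean <= c], assuming Gamma(shape 4, scale 0.5) lifetimes."""
    return erlang_cdf(c, 4 * n, 0.5 / n)


if __name__ == "__main__":
    print("normal approximation:", round(normal_approx(1.9), 4))
    print("exact gamma value:   ", round(exact_prob(1.9), 4))
```

The two printed values should land close to the 0.2398 and 0.2433 quoted in the example.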

The CLT can also be used to make statements about sums of independently and identically distributed random variables. Let

$$S = X_1 + X_2 + \dots + X_n = n\bar{X}$$

Then

$$\mu_S = n\mu_{\bar{x}} = n\mu_x$$

$$\sigma^2_S = n^2\sigma^2_{\bar{x}} = n^2\,\frac{\sigma_x^2}{n} = n\sigma_x^2$$

$$\sigma_S = \sigma_x\sqrt{n}$$

So $S$ is approximately $N(n\mu_x, \sigma_x\sqrt{n})$ distributed.
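As a small worked example (the total-lifetime question is hypothetical, not from the slides), take the light bulb setting with a population mean of 2, a standard deviation of 1, and n = 50. Asking whether the total lifetime is at most 95 years is the same event as asking whether the sample mean is at most 1.9, so the two normal approximations should agree.

```python
import math
from statistics import NormalDist


def sum_distribution(mu_x, sigma_x, n):
    """Approximate normal distribution of the sum of n iid observations."""
    return NormalDist(n * mu_x, sigma_x * math.sqrt(n))


def mean_distribution(mu_x, sigma_x, n):
    """Approximate normal distribution of the mean of n iid observations."""
    return NormalDist(mu_x, sigma_x / math.sqrt(n))


if __name__ == "__main__":
    total = sum_distribution(2.0, 1.0, 50)
    average = mean_distribution(2.0, 1.0, 50)
    print("mean and SD of the sum:", total.mean, total.stdev)
    print("P[S <= 95]      ~", total.cdf(95.0))
    print("P[mean <= 1.9]  ~", average.cdf(1.9))
```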

Relaxing assumptions for the CLT

The assumptions for the CLT can be relaxed to allow for some dependency and some differences between distributions. This is why much data is approximately normally distributed.

The more general versions of the theorem say that when an effect is the sum of a large number of roughly equally weighted terms, the effect should be approximately normally distributed.

For example, people's heights are influenced by a (potentially) large number of genes and by various environmental effects. Histograms of adult men's and women's heights are both well described by normal densities.

Another consequence of this is that sample means based on simple random samples with fairly large sampling fractions are also approximately normally distributed.

Sampling With and Without Replacement

Simple Random Sampling (SRS) is sometimes referred to as sampling without replacement. Once a member of the population is sampled, it can't be sampled again. As discussed before, the without-replacement action of the sampling introduces dependency into the observations.

Another possible sampling scheme is sampling with replacement. In this case, when a member of a population is sampled, it is returned to the population and could be sampled again. This occurs if your sampling scheme is similar to repeated rolling of a die.

There is no dependency between observations in this case, since at each step the members of the population that could be sampled are the same. This situation is also equivalent to drawing from an infinite population.

When SRS is used, the variance of the sampling distribution needs to be adjusted for the dependency induced by the sampling. The correction is based on the Finite Population Correction (FPC)

$$f = \frac{N - n}{N}$$

which is the fraction of the population which is not sampled. Then the variance and standard deviation of $\bar{X}$ are

$$\sigma^2_{\bar{x}} = \frac{\sigma_x^2}{n}\, f, \qquad \sigma_{\bar{x}} = \sigma_x\sqrt{\frac{f}{n}}$$

So when a bigger fraction of the population is sampled (so $f$ is smaller), you get a smaller spread in the sampling distribution.

However, when $n$ is small relative to $N$, this correction has little effect. For example, if 10% of the population is sampled, so $f = 0.9$, the standard deviation of the sampling distribution is about 95% ($\sqrt{0.9} \approx 0.949$) of the standard deviation under with-replacement sampling. If a 1% sample is taken, the correction factor on the standard deviation is 0.995.

Except when fairly large sampling fractions occur, the FPC is usually not used.
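The correction factors quoted above, and the effect of the two sampling schemes, can be checked with a short sketch. The population here (1000 standard normal values), the sample size, the seed, and the function names are all hypothetical choices for illustration. The formula follows the form f = (N - n)/N given in this section.

```python
import math
import random
import statistics


def fpc(N, n):
    """Finite population correction: the fraction of the population not sampled."""
    return (N - n) / N


def sd_of_mean_srs(sigma_x, N, n):
    """SD of the sample mean under simple random sampling (without replacement)."""
    return sigma_x * math.sqrt(fpc(N, n) / n)


def simulated_sd_ratio(N=1000, n=100, reps=4000, seed=7):
    """Simulated SD of the mean without replacement, divided by the SD with replacement."""
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(N)]  # hypothetical population
    without = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]
    with_repl = [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]
    return statistics.pstdev(without) / statistics.pstdev(with_repl)


if __name__ == "__main__":
    print("10% sample, factor on the SD:", math.sqrt(fpc(1000, 100)))
    print(" 1% sample, factor on the SD:", math.sqrt(fpc(1000, 10)))
    print("simulated ratio for a 10% sample:", simulated_sd_ratio())
```

With a 10% sampling fraction the simulated ratio should sit near the square root of 0.9, up to Monte Carlo noise.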