STAT 515 -- Chapter 6: Sampling Distributions

Definition: Parameter = a number that characterizes a population (example: population mean $\mu$); it's typically unknown. Statistic = a number that characterizes a sample (example: sample mean $\bar{X}$); we can calculate it from our sample data.

We use the sample mean $\bar{X}$ to estimate the population mean $\mu$. Suppose we take a sample and calculate $\bar{X}$. Will $\bar{X}$ equal $\mu$? Will $\bar{X}$ be close to $\mu$? Suppose we take another sample and get another $\bar{X}$. Will it be the same as the first $\bar{X}$? Will it be close to the first $\bar{X}$?

What if we took many repeated samples (of the same size) from the same population, and each time, calculated the sample mean? What would that set of $\bar{X}$ values look like?

The sampling distribution of a statistic is the distribution of values of the statistic in all possible samples (of the same size) from the same population.
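The repeated-sampling idea can be tried out directly. The sketch below is only an illustration: the population of 1,000 uniform values, the sample size of 25, the 2,000 repetitions, and the helper name `sample_means` are all assumptions made for the example, not part of the course material.

```python
import random
import statistics

def sample_means(population, n, reps, seed=0):
    """Draw `reps` random samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

# Hypothetical population of 1,000 values; its mean plays the role of mu.
pop_rng = random.Random(1)
population = [pop_rng.uniform(0, 10) for _ in range(1000)]
mu = statistics.fmean(population)

# Each entry of `means` is one sample mean; together they approximate
# the sampling distribution of the sample mean for n = 25.
means = sample_means(population, n=25, reps=2000)
print("population mean:", round(mu, 3))
print("first five sample means:", [round(m, 3) for m in means[:5]])
print("average of sample means:", round(statistics.fmean(means), 3))
```

Individual sample means differ from one another and from the population mean, but their average should land close to it.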

Consider the sampling distribution of the sample mean $\bar{X}$ when we take samples of size n from a population with mean $\mu$ and variance $\sigma^2$.

Picture:

The sampling distribution of $\bar{X}$ has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Notation:

Point Estimator: A statistic which is a single number meant to estimate a parameter. It would be nice if the average value of the estimator (over repeated sampling) equaled the target parameter. An estimator is called unbiased if the mean of its sampling distribution is equal to the parameter being estimated.
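The mean and standard deviation of the sampling distribution of the sample mean can be derived in a few lines. This is a sketch under the assumption, not spelled out in the notes, that the observations $X_1, \dots, X_n$ are independent draws from the same population; it also shows that the sample mean is an unbiased estimator of the population mean.

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
            = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}, \\
\operatorname{SD}(\bar{X}) &= \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The second line uses independence: the variance of a sum of independent random variables is the sum of their variances.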

Examples:

Another nice property of an estimator: we want the spread of its sampling distribution to be as small as possible. The standard deviation of a statistic's sampling distribution is called the standard error of the statistic. The standard error of the sample mean $\bar{X}$ is $\sigma/\sqrt{n}$.

Note: As the sample size gets larger, the spread of the sampling distribution gets smaller. When the sample size is large, the sample mean varies less across samples.

Evaluating an estimator: (1) Is it unbiased? (2) Does it have a small standard error?
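To see how quickly the standard error of the sample mean shrinks, here is a small sketch. The population standard deviation of 10 and the sample sizes are hypothetical values chosen for illustration, and `standard_error` is a helper name introduced here.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for n in (4, 25, 100, 400):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")
```

Because n sits under a square root, the sample size has to be quadrupled to cut the standard error in half.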

Central Limit Theorem

We have determined the center and the spread of the sampling distribution of $\bar{X}$. What is the shape of its sampling distribution?

Case I: If the distribution of the original data is normal, the sampling distribution of $\bar{X}$ is normal. (This is true no matter what the sample size is.)

Case II: Central Limit Theorem: If we take a random sample (of size n) from any population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{X}$ is approximately normal, if the sample size is large.

How large does n have to be? Our rule of thumb: If $n \ge 30$, we can apply the CLT result.

Pictures:

The larger n gets, the closer the sampling distribution looks to a normal distribution.
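A quick simulation can stand in for the pictures. The sketch below assumes an exponential population (strongly right-skewed) purely as an illustration, and uses the skewness of the simulated sample means as a rough measure of how far their distribution is from the symmetric normal shape. The sample sizes, the number of repetitions, and the helper names are choices made for this example.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def exponential_sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

rng = random.Random(42)
skew_by_n = {n: skewness(exponential_sample_means(n, 5000, rng)) for n in (2, 30)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}  skewness of sample means = {skew:.2f}")
```

The skewness for n = 30 should come out much closer to zero than for n = 2, which is the CLT at work on a non-normal population.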

Why is the CLT important? Because when $\bar{X}$ is (approximately) normally distributed, we can answer probability questions about the sample mean.

Standardizing values of $\bar{X}$: If $\bar{X}$ is normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, then

$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

has a standard normal distribution.

Example: Suppose we're studying the failure time (at high stress) of a certain engine part. The failure times have a mean of 1.4 hours and a standard deviation of 0.9 hours. If our sample size is 40 engine parts, then what is the sampling distribution of the sample mean?

What is the probability that the sample mean will be greater than 1.5?

Example: Suppose lawyers' salaries have a mean of $90,000 and a standard deviation of $30,000 (highly skewed). Given a sample of lawyers, can we find the probability the sample mean is less than $100,000 if n = 5? If n = 30?
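Both examples can be worked with the normal approximation for the sample mean. The sketch below uses Python's `statistics.NormalDist`; the helper name `sample_mean_distribution` is introduced here for illustration. For the lawyers with n = 5, the population is highly skewed and the sample is small, so the CLT rule of thumb is not met and no normal-based probability is computed.

```python
import math
from statistics import NormalDist

def sample_mean_distribution(mu, sigma, n):
    """Approximate sampling distribution of the sample mean: N(mu, sigma/sqrt(n))."""
    return NormalDist(mu, sigma / math.sqrt(n))

# Engine parts: mu = 1.4 hours, sigma = 0.9 hours, n = 40 (n >= 30, so the CLT applies).
engine = sample_mean_distribution(1.4, 0.9, 40)
z = (1.5 - engine.mean) / engine.stdev
p_engine = 1 - engine.cdf(1.5)
print(f"standard error = {engine.stdev:.4f}, z = {z:.2f}, "
      f"P(mean > 1.5) = {p_engine:.3f}")

# Lawyers: mu = 90,000, sigma = 30,000, n = 30 (large enough for the rule of thumb).
lawyers = sample_mean_distribution(90_000, 30_000, 30)
p_lawyers = lawyers.cdf(100_000)
print(f"P(mean < 100,000) with n = 30: {p_lawyers:.3f}")
```

The engine-part probability comes out near 0.24 and the lawyer probability near 0.97.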

Other Sampling Distributions

In practice, the population standard deviation $\sigma$ is typically unknown. We estimate $\sigma$ with s. But the quantity

$\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$

no longer has a standard normal distribution. Its sampling distribution is as follows:

If the data come from a normal population, then the statistic

$T = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$

has a t-distribution ("Student's t") with $n - 1$ degrees of freedom (the parameter of the t-distribution).

The t-distribution resembles the standard normal (symmetric, mound-shaped, centered at zero) but it is more spread out. The fewer the degrees of freedom, the more spread out the t-distribution is. As the d.f. increase, the t-distribution gets closer to the standard normal.

Picture:
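The extra spread of the t-distribution shows up clearly in a simulation. The sketch below assumes a hypothetical normal population (mean 50, standard deviation 8) and a small sample size of 5, computes T for many samples, and counts how often |T| exceeds 1.96. For a standard normal variable that would happen about 5% of the time.

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """T = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))

rng = random.Random(7)
mu, sigma, n, reps = 50.0, 8.0, 5, 20000  # hypothetical normal population, small n
t_values = [t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
            for _ in range(reps)]

# Fraction of simulated T values falling beyond +/- 1.96.
tail_fraction = sum(abs(t) > 1.96 for t in t_values) / reps
print(f"fraction of |T| > 1.96 with {n - 1} d.f.: {tail_fraction:.3f}")
```

With only 4 degrees of freedom the fraction should be well above 0.05, reflecting the heavier tails.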

Table III gives values of the t-distribution with specific areas to the right of these values.

Verify:
In the t-distribution with 3 d.f., the area to the right of 3.182 is .025. (Notation: For 3 d.f., $t_{.025} = 3.182$.)
In t with 14 d.f., the area to the right of 1.761 is .05.
In t with 25 d.f., the area to the right of -3.450 is .999.
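Table lookups like these can be checked numerically without a printed table. The sketch below integrates the t density with Simpson's rule. The cutoffs 3.182, 1.761 and -3.450 are standard t-table entries supplied here for illustration, and the function names are introduced for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def t_area_to_right(x, df):
    """Area under the t density to the right of x, using symmetry about zero."""
    central = simpson(lambda u: t_pdf(u, df), 0.0, abs(x))
    return 0.5 - central if x >= 0 else 0.5 + central

for x, df in [(3.182, 3), (1.761, 14), (-3.450, 25)]:
    print(f"d.f. = {df:2d}, area to the right of {x:6.3f}: {t_area_to_right(x, df):.4f}")
```

A negative cutoff gives an area above one half, which is why an area of .999 corresponds to a value far in the left tail.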

The $\chi^2$ (Chi-square) Distribution

Suppose our sample (of size n) comes from a normal population with mean $\mu$ and standard deviation $\sigma$. Then

$\dfrac{(n-1)s^2}{\sigma^2}$

has a $\chi^2$ distribution with $n - 1$ degrees of freedom.

The $\chi^2$ distribution takes on positive values. It is skewed to the right. It is less skewed for higher degrees of freedom. The mean of a $\chi^2$ distribution with $n - 1$ degrees of freedom is $n - 1$ and the variance is $2(n - 1)$.

Fact: If we add the squares of n independent standard normal r.v.'s, the resulting sum has a $\chi^2_n$ distribution.

Note that $\dfrac{(n-1)s^2}{\sigma^2} = \sum_{i=1}^{n} \left(\dfrac{X_i - \bar{X}}{\sigma}\right)^2$. We sacrifice one d.f. by estimating $\mu$ with $\bar{X}$, so it is $\chi^2_{n-1}$.
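The stated mean and variance can be checked by simulation. The sketch below assumes a hypothetical normal population (mean 100, standard deviation 15) and samples of size 6, so the scaled sample variance should behave like a chi-square variable with 5 degrees of freedom (mean 5, variance 10).

```python
import random
import statistics

def scaled_sample_variance(sample, sigma):
    """(n - 1) s^2 / sigma^2 for one sample."""
    return (len(sample) - 1) * statistics.variance(sample) / sigma ** 2

rng = random.Random(11)
mu, sigma, n, reps = 100.0, 15.0, 6, 20000  # hypothetical normal population
values = [scaled_sample_variance([rng.gauss(mu, sigma) for _ in range(n)], sigma)
          for _ in range(reps)]

sim_mean = statistics.fmean(values)
sim_var = statistics.variance(values)
print(f"simulated mean = {sim_mean:.2f} (theory: {n - 1})")
print(f"simulated variance = {sim_var:.2f} (theory: {2 * (n - 1)})")
```

Every simulated value is positive, and a histogram of `values` would show the right skew described above.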

Table IV gives values of a $\chi^2$ r.v. with specific areas to the right of those values.

Examples:
For $\chi^2$ with 6 d.f., the area to the right of 2.204 is .90.
For $\chi^2$ with 6 d.f., the area to the right of 12.592 is .05.
For $\chi^2$ with 80 d.f., the area to the right of 96.578 is .10.
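When the degrees of freedom are even, the right-tail area of a chi-square variable has a simple closed form, which allows an exact check of table entries. The sketch below covers only that even-d.f. case. The cutoffs 2.204, 12.592 and 96.578 are standard chi-square table entries supplied here for illustration, and `chi2_area_to_right` is a name introduced for this example.

```python
import math

def chi2_area_to_right(x, df):
    """P(chi-square with even `df` > x) = exp(-x/2) * sum over j < df/2 of (x/2)^j / j!."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    term = math.exp(-half)   # j = 0 term
    total = term
    for j in range(1, df // 2):
        term *= half / j     # now equals exp(-x/2) * (x/2)^j / j!
        total += term
    return total

for x, df in [(2.204, 6), (12.592, 6), (96.578, 80)]:
    print(f"d.f. = {df:2d}, area to the right of {x:7.3f}: {chi2_area_to_right(x, df):.4f}")
```

For odd degrees of freedom a numerical integration or a statistics library would be needed instead.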

The F Distribution

The quantity

$\dfrac{\chi^2_{n_1-1}/(n_1-1)}{\chi^2_{n_2-1}/(n_2-1)}$,

where the two $\chi^2$ r.v.'s are independent, has an F-distribution with $n_1 - 1$ numerator degrees of freedom and $n_2 - 1$ denominator degrees of freedom.

So, if we have independent samples (of sizes $n_1$ and $n_2$) from two normal populations, note:

$\dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$

has an F-distribution with $(n_1 - 1, n_2 - 1)$ d.f.
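The variance-ratio result can be illustrated by simulation. The sketch below assumes two hypothetical normal populations with standard deviations 2 and 5 and sample sizes 4 and 10, so the scaled ratio should follow an F-distribution with (3, 9) d.f. The cutoff 2.81 is the standard F-table value with area 0.10 to its right for those degrees of freedom.

```python
import random
import statistics

rng = random.Random(3)
n1, sigma1 = 4, 2.0    # hypothetical population 1
n2, sigma2 = 10, 5.0   # hypothetical population 2
reps = 20000

def variance_ratio(rng):
    """(s1^2 / sigma1^2) / (s2^2 / sigma2^2) for one pair of independent samples."""
    s1_sq = statistics.variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
    s2_sq = statistics.variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
    return (s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2)

ratios = [variance_ratio(rng) for _ in range(reps)]
fraction_above = sum(r > 2.81 for r in ratios) / reps
print(f"fraction of ratios above 2.81: {fraction_above:.3f}")
```

The simulated fraction should be close to 0.10, up to simulation error.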

Table V gives values of an F r.v. with area .10 to the right.
Table VI gives values of an F r.v. with area .05 to the right.
Table VII gives values of an F r.v. with area .025 to the right.
Table VIII gives values of an F r.v. with area .01 to the right.

Verify:
For F with (3, 9) d.f., 2.81 has area 0.10 to the right.
For F with (15, 13) d.f., 3.82 has area 0.01 to the right.

These sampling distributions will be important in many inferential procedures we will learn.