Elementary Statistics Lecture 5

Elementary Statistics Lecture 5: Sampling Distributions
Chong Ma, Department of Statistics, University of South Carolina
STAT 201 Elementary Statistics

Outline
1. Introduction
2. Sampling Distribution of Sample Statistic
3. Examples

Recall
Parameter: A numerical summary of the population, such as the population proportion p for a categorical variable. A parameter is fixed but usually unknown.
Statistic: A numerical summary of a sample taken from the population, such as the sample mean, the sample proportion, or the sample median.
Sampling distribution: The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take.

Summary of distribution terminology
Population distribution: The distribution from which we take the sample.
Data distribution: The distribution of the data obtained from the sample. The larger the sample, the more closely the data distribution resembles the population distribution.
Sampling distribution: The distribution of a statistic, such as a sample proportion or a sample mean.

Central Limit Theorem (CLT)
Given certain conditions, the arithmetic mean of a sufficiently large number of independent random variables, each with a well-defined (finite) expected value \(\mu\) and finite variance \(\sigma^2\), will be approximately normally distributed, regardless of the underlying distribution. Mathematically, it can be stated as follows.
CLT: Suppose \(\{X_1, X_2, \ldots, X_n\}\) is a sequence of i.i.d. random variables with \(E[X_i] = \mu\) and \(\mathrm{Var}(X_i) = \sigma^2 < \infty\). Then, as \(n\) approaches infinity, the random variable
\[ \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \]
converges in distribution to the standard normal distribution \(N(0, 1)\). In other words,
\[ \bar{X}_n \overset{\text{approx}}{\sim} N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right), \]
where the second argument of \(N(\cdot, \cdot)\) is the standard deviation.
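The statement above can be checked with a small simulation. The sketch below is illustrative only: the Exponential(1) population (for which \(\mu = 1\) and \(\sigma = 1\)), the sample size 50, the 5000 repetitions, and the function name `standardized_sample_means` are all assumptions chosen for the demonstration and are not part of the lecture.

```python
import random
from statistics import NormalDist


def standardized_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an Exponential(1) population
    (assumed here: mu = 1, sigma = 1) and return the standardized
    sample means sqrt(n) * (xbar - mu) / sigma."""
    mu, sigma = 1.0, 1.0
    values = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        values.append((n ** 0.5) * (xbar - mu) / sigma)
    return values


rng = random.Random(0)  # fixed seed so the run is repeatable
z = standardized_sample_means(n=50, reps=5000, rng=rng)

# If the CLT approximation is good, about 95% of the standardized
# means should fall within the central 95% range of N(0, 1).
cut = NormalDist().inv_cdf(0.975)  # roughly 1.96
inside = sum(abs(v) <= cut for v in z) / len(z)
print(f"share within +/-{cut:.2f}: {inside:.3f}")
```

Even though the exponential population is strongly right-skewed, the printed share should land close to 0.95, which is what the theorem leads one to expect.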

Sampling distribution of the sample proportion \(\hat{p}\)
For a random sample of size \(n\) from a population with proportion \(p\) of outcomes in a particular category, the sampling distribution of the sample proportion in that category is approximately normal:
\[ \hat{p} \overset{\text{approx}}{\sim} N\!\left(p, \sqrt{\frac{p(1-p)}{n}}\right) \]
In practice, the approximation is adequate when \(np \ge 15\) and \(n(1-p) \ge 15\).
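As a rough sketch of how the rule above might be applied, the helper below returns the mean and standard deviation of \(\hat{p}\) and reports whether the two conditions hold. The function name `proportion_sampling_dist` and the `threshold` parameter are hypothetical names introduced for this illustration.

```python
from math import sqrt


def proportion_sampling_dist(p, n, threshold=15):
    """Return (mean, sd, normal_ok) for the sample proportion.

    normal_ok is True when both n*p and n*(1-p) reach the threshold
    (15 in the lecture's rule of thumb)."""
    mean = p
    sd = sqrt(p * (1 - p) / n)
    normal_ok = n * p >= threshold and n * (1 - p) >= threshold
    return mean, sd, normal_ok


# Conditions met: n*p = 15 and n*(1-p) = 15.
print(proportion_sampling_dist(0.5, 30))
# Conditions not met: n*p = 4, so the normal shape is doubtful.
print(proportion_sampling_dist(0.04, 100))
```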

Sampling distribution of the sample mean \(\bar{x}_n\)
For a random sample of size \(n\) from a population having mean \(\mu\) and standard deviation \(\sigma\), the sampling distribution of the sample mean \(\bar{x}_n\) approaches a normal distribution as \(n\) increases:
\[ \bar{x}_n \overset{\text{approx}}{\sim} N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]
In practice, the approximation is adequate when \(n \ge 30\).
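A short numerical illustration of the \(\sigma/\sqrt{n}\) term may help: quadrupling the sample size halves the standard deviation of the sample mean. The population standard deviation of 10 and the function name `sample_mean_sd` are made-up values for this sketch, not figures from the lecture.

```python
from math import sqrt


def sample_mean_sd(sigma, n):
    """Standard deviation of the sample mean for sample size n."""
    return sigma / sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    print(n, sample_mean_sd(sigma, n))
```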

Sampling distribution
Figure 1: Five population distributions and the corresponding sampling distributions of \(\bar{x}_n\). Regardless of the shape of the population distribution, the sampling distribution becomes more bell-shaped as the sample size \(n\) increases.

Defective Chips
A supplier of electronic chips for tablets claims that only 4% of his chips are defective. A manufacturer tests 500 randomly selected chips from a large shipment from the supplier for potential defects.
(a) Find the mean and the standard deviation for the distribution of the sample proportion of defective chips in the sample of 500.
(b) Is it reasonable to assume a normal shape for the sampling distribution? Explain.
(c) The manufacturer will return the entire shipment if he finds more than 5% of the 500 sampled chips to be defective. Find the probability that the shipment will be returned.

Defective Chips (a)
(a) Find the mean and the standard deviation for the distribution of the sample proportion of defective chips in the sample of 500.
Solution: The population proportion of defective chips claimed by the supplier is \(p = 0.04\), and the sample size is \(n = 500\). Therefore
\[ \text{mean} = p = 0.04, \qquad \text{standard deviation} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.04 \times 0.96}{500}} = 0.0088. \]

Defective Chips (b)
(b) Is it reasonable to assume a normal shape for the sampling distribution? Explain.
Solution: Yes. Since \(np = 500 \times 0.04 = 20 \ge 15\) and \(n(1-p) = 500 \times 0.96 = 480 \ge 15\), the central limit theorem indicates that the sampling distribution of the sample proportion of defective chips is approximately normal.

Defective Chips (c)
(c) The manufacturer will return the entire shipment if he finds more than 5% of the 500 sampled chips to be defective. Find the probability that the shipment will be returned.
Solution: Note that
\[ \hat{p} \overset{\text{approx}}{\sim} N\!\left(p, \sqrt{\frac{p(1-p)}{n}}\right) = N(0.04,\ 0.0088). \]
Then
\[ P(\hat{p} > 0.05) = P\!\left(Z > \frac{0.05 - 0.04}{0.0088}\right) = P(Z > 1.14) = 0.127. \]
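The calculation in part (c) can be reproduced without a normal table. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are chosen for the illustration.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.04, 500                 # claimed defect rate and sample size
sd = sqrt(p * (1 - p) / n)       # standard deviation of the sample proportion

# Normal approximation to the sampling distribution of the sample proportion.
phat = NormalDist(mu=p, sigma=sd)

# Probability that more than 5% of the sampled chips are defective.
prob_return = 1 - phat.cdf(0.05)
print(round(sd, 4), round(prob_return, 3))
```

The result should agree with the table-based value of about 0.127, up to the rounding of the z-score.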

Average income
A large corporation employs 27,251 individuals. The average income in 2008 for all employees was $74,550 with a standard deviation of $19,872. You are interested in comparing the incomes of today's employees with those of 2008. A random sample of 100 employees of the corporation yields \(\bar{x} = \$75{,}207\) and \(s = \$18{,}901\).
(a) Describe the center and variability of the population distribution. What shape does it probably have?
(b) Describe the center and variability of the data distribution. What shape does it probably have?
(c) Describe the center and variability of the sampling distribution of the sample mean for n = 100. What shape does it have?
(d) Explain why it would not be unusual to observe an individual who earns more than $100,000, but it would be highly unusual to observe a sample mean income of more than $100,000 for a random sample of 100 people.

Average income (a)
(a) Describe the center and variability of the population distribution. What shape does it probably have?
Solution: The mean and standard deviation of the population are
\[ \mu = 74{,}550, \qquad \sigma = 19{,}872. \]
The population distribution of employees' incomes is probably highly right-skewed.

Average income (b)
(b) Describe the center and variability of the data distribution. What shape does it probably have?
Solution: The mean and standard deviation of the data distribution are
\[ \bar{x} = 75{,}207, \qquad s = 18{,}901. \]
Because the data distribution resembles the population distribution, the data distribution is probably right-skewed as well.

Average income (c)
(c) Describe the center and variability of the sampling distribution of the sample mean for n = 100. What shape does it have?
Solution: The mean and standard deviation of the sampling distribution of the sample mean are
\[ \mu_{\bar{x}_n} = \mu = 74{,}550, \qquad \sigma_{\bar{x}_n} = \frac{\sigma}{\sqrt{n}} = \frac{19{,}872}{\sqrt{100}} \approx 1{,}987. \]
The central limit theorem indicates that the sampling distribution of the sample mean of employees' incomes for \(n = 100\) is approximately normal, since \(n = 100 \ge 30\).

Average income (d)
(d) Explain why it would not be unusual to observe an individual who earns more than $100,000, but it would be highly unusual to observe a sample mean income of more than $100,000 for a random sample of 100 people.
Solution: Treating individual incomes as roughly normal for the purpose of a rough calculation,
\[ X \sim N(\mu, \sigma) = N(74{,}550,\ 19{,}872), \qquad \bar{X}_n \overset{\text{approx}}{\sim} N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = N(74{,}550,\ 1{,}987). \]
Then
\[ P(X > 100{,}000) = P\!\left(Z > \frac{100{,}000 - 74{,}550}{19{,}872}\right) = P(Z > 1.28) = 0.10, \]
\[ P(\bar{X}_n > 100{,}000) = P\!\left(Z > \frac{100{,}000 - 74{,}550}{1{,}987}\right) = P(Z > 12.8) \approx 0. \]
About one employee in ten earns more than $100,000, whereas a sample mean that large would lie 12.8 standard deviations above \(\mu\).
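The contrast in part (d) can be sketched numerically as follows. As in the solution, the individual incomes are treated as normal purely for a rough calculation; this is an assumption, since the population is described earlier as right-skewed. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 74_550, 19_872, 100

# Assumed normal model for a single income (a rough approximation only).
individual = NormalDist(mu, sigma)
# Approximate sampling distribution of the mean of n = 100 incomes.
sample_mean = NormalDist(mu, sigma / sqrt(n))

p_individual = 1 - individual.cdf(100_000)
p_sample_mean = 1 - sample_mean.cdf(100_000)
print(round(p_individual, 3), p_sample_mean)
```

The first probability comes out near 0.10, while the second is so small that it is indistinguishable from zero in double-precision arithmetic.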

Coin-toss distribution
For a single toss of a balanced coin, let x = 1 for a head and x = 0 for a tail. Say a coin is flipped 30 times. Let Y denote the number of heads occurring in the 30 flips.
(a) Find the sampling distribution of the sample proportion of heads.
(b) Find the probability of observing more than 10 heads in the 30 flips of a balanced coin.

Coin-toss distribution (a)
(a) Find the sampling distribution of the sample proportion of heads.
Solution: With \(p = 0.5\) and \(n = 30\),
\[ \hat{p} \overset{\text{approx}}{\sim} N\!\left(p, \sqrt{\frac{p(1-p)}{n}}\right) = N(0.5,\ 0.09). \]
The CLT indicates that the sampling distribution of \(\hat{p}\) is approximately normal, since \(np = 15 \ge 15\) and \(n(1-p) = 15 \ge 15\).

Coin-toss distribution (b)
(b) Find the probability of observing more than 10 heads in the 30 flips of a balanced coin.
Solution: Note that \(Y \sim \text{Binomial}(n, p) = \text{Binomial}(30, 0.5)\), so
\[ P(Y > 10) = 1 - P(Y \le 10) = 1 - \{P(Y = 0) + P(Y = 1) + \cdots + P(Y = 10)\} = 0.95. \]
It is tedious to compute this probability by hand with the binomial distribution. An easier way is to use the sampling distribution of the sample proportion \(\hat{p}\).

Coin-toss distribution (b), normal approximation
(b) Find the probability of observing more than 10 heads in the 30 flips of a balanced coin.
Solution: The question is equivalent to finding the probability that the sample proportion exceeds \(10/30 \approx 0.333\). Since
\[ \hat{p} \overset{\text{approx}}{\sim} N\!\left(p, \sqrt{\frac{p(1-p)}{n}}\right) = N(0.5,\ 0.0913), \]
we have
\[ P(\hat{p} > 0.333) = P\!\left(Z > \frac{0.333 - 0.5}{0.0913}\right) = P(Z > -1.83) = 0.966. \]
With the coarser rounding \(10/30 \approx 0.3\) and a standard deviation of 0.09, the z-score becomes \(-2.22\) and the probability 0.987, which is noticeably further from the exact binomial value of 0.95.
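The exact binomial sum that is tedious by hand is short in code, which also makes it easy to compare against the normal approximation. The sketch below uses only the Python standard library. The continuity-corrected variant (cutoff 10.5/30) is an addition that does not appear in the lecture; it is included only to show how much of the gap it closes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 30, 0.5

# Exact: P(Y > 10) = 1 - sum of binomial probabilities for k = 0..10.
exact = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(11))

# Normal approximation for the sample proportion.
sd = sqrt(p * (1 - p) / n)
std_normal = NormalDist()
approx_plain = 1 - std_normal.cdf((10 / n - p) / sd)

# Same approximation with a continuity correction (not in the lecture).
approx_cc = 1 - std_normal.cdf((10.5 / n - p) / sd)

print(round(exact, 4), round(approx_plain, 4), round(approx_cc, 4))
```

The plain approximation overshoots the exact value somewhat, and the continuity-corrected version lands much closer to it, which is typical when n is only moderately large.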