STA 320, Fall 2013. Thursday, Dec 5. Sampling Distribution

Review
We cannot tell what will happen in any given individual sample (just as we cannot predict a single coin flip in advance).
We CAN tell a lot about the pattern of variation among many samples (just as we can predict that if you flip a fair coin many times, you will get about 50% heads and 50% tails).
In our doctor example, we found that the pattern of variation of the sample proportions, called the sampling distribution, followed a normal distribution.
http://www.amstat.org/publications/jse/v6n3/applets/clt.html

Sampling Distributions for Proportions
Suppose we have a population of size N consisting of M successes and N-M failures. We sample a group of n people at random. Suppose further that
  n/N is small (rule of thumb: less than 5%),
  n is not small (rule of thumb: n>25),
  M/N=p is not too close to 0 or 1 (rule of thumb: 0.05<p<0.95).
Then the sampling distribution of the sample proportion is approximately normal with mean M/N=p (the population proportion) and standard deviation sqrt(p(1-p)/n).
Why this is true is beyond the scope of this course, but it follows from a beautiful mathematical theorem: the Central Limit Theorem.
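The claim above can be checked by simulation. The following is a minimal sketch, not part of the original slides: the function name `simulate_sample_proportions` and the numbers N=10000, M=3500, n=300 are illustrative assumptions chosen to satisfy the three rules of thumb. It draws many random samples without replacement and compares the mean and standard deviation of the resulting sample proportions with p and sqrt(p(1-p)/n).

```python
import math
import random
import statistics

def simulate_sample_proportions(N, M, n, reps, seed=0):
    """Draw `reps` random samples of size n (without replacement) from a
    population of N people containing M successes; return each sample's
    proportion of successes."""
    rng = random.Random(seed)
    population = [1] * M + [0] * (N - M)
    return [sum(rng.sample(population, n)) / n for _ in range(reps)]

# Illustrative (assumed) numbers: n/N = 3% < 5%, n > 25, 0.05 < p < 0.95.
N, M, n = 10_000, 3_500, 300
p = M / N
p_hats = simulate_sample_proportions(N, M, n, reps=2_000)

theoretical_sd = math.sqrt(p * (1 - p) / n)
print("mean of p-hats:", round(statistics.mean(p_hats), 4), "vs p =", p)
print("SD of p-hats:  ", round(statistics.stdev(p_hats), 4),
      "vs sqrt(p(1-p)/n) =", round(theoretical_sd, 4))
```

The simulated values should land close to, but not exactly on, the theoretical ones; the small remaining gap in the standard deviation reflects sampling without replacement from a finite population.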

In Practice
Unfortunately, we typically only get to draw one sample. How do you know if you got one of the samples that fall in the middle 95% (closer to the true proportion) as opposed to the outer 5% (farther from the true proportion)?
Answer: really, you don't. But it's more likely you're in the 95% group than the 5% group.
Want to be more sure? Split the samples into a 99% group and a 1% group instead; then the odds are even more in your favor.

What Matters, What Doesn't
The center of the sampling distribution is the true proportion p. In other words, p-hat is centered around p.
The sample size appears in the standard deviation sqrt(p(1-p)/n). The bigger the sample size, the smaller the standard deviation of p-hat, and so the closer p-hat tends to be to p.
The population size does NOT matter. As long as you are sampling fewer than 1 in 20 people, it does not matter whether it is 1 of every 2000 or 1 of every 2 million.

Population Size N=10000, 35% Successes: Comparing n=300 to n=100
N=10000 in population, n=300 in sample: N(0.35, sqrt(0.35*0.65/300)=0.0275)
N=10000 in population, n=100 in sample: N(0.35, sqrt(0.35*0.65/100)=0.0477)

Sample Size n=300, 35% Successes: Comparing N=10000 to N=100000
N=10,000 in population, n=300 in sample: N(0.35, sqrt(0.35*0.65/300)=0.0275)
N=100,000 in population, n=300 in sample: N(0.35, sqrt(0.35*0.65/300)=0.0275)
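The two comparisons above come down to one formula in which n appears and N does not. As a small sketch (the helper name `standard_error_of_proportion` is an assumption for illustration), the standard deviation of p-hat can be computed directly:

```python
import math

def standard_error_of_proportion(p, n):
    """Standard deviation of the sample proportion, sqrt(p(1-p)/n).
    The population size N does not enter the formula."""
    return math.sqrt(p * (1 - p) / n)

# Same p = 0.35 as on the slides, with the two sample sizes compared there.
for n in (100, 300):
    print("n =", n, "standard error =",
          round(standard_error_of_proportion(0.35, n), 4))
```

Tripling the sample size from 100 to 300 shrinks the standard error by a factor of sqrt(3), while changing N from 10,000 to 100,000 changes nothing as long as n/N stays small.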

Summary: Sampling Distribution
Population with proportion p of successes.
If you repeatedly take random samples and calculate the sample proportion each time, the distribution of the sample proportions follows a pattern.
This pattern is called the sampling distribution of p-hat.

Properties of the Sampling Distribution
Expected value of the p-hats: p.
Standard deviation of the p-hats: sqrt(p(1-p)/n), also called the standard error of p-hat.
Central Limit Theorem: As the sample size increases, the distribution of the p-hats gets closer and closer to the normal.

Sampling Distribution of Means
Population with mean mu and standard deviation sigma.
If you repeatedly take random samples and calculate the sample mean each time, the distribution of the sample means follows a pattern.
This pattern is the sampling distribution of X-bar.

Properties of the Sampling Distribution
Expected value of the X-bars: µ.
Standard deviation of the X-bars: sigma/sqrt(n), also called the standard error of X-bar.
For N/n<20, use a finite population correction factor for the standard deviation: (sigma/sqrt(n)) * sqrt((N-n)/(N-1)).
Central Limit Theorem: As the sample size increases, the distribution of the X-bars gets closer and closer to a normal curve.

Summary: Sampling Distribution
We cannot tell what will happen in any given individual sample.
We CAN tell a lot about the pattern of variation among many samples.
[Figure: graph of sample proportions for all possible samples when selecting 500 people from a population with 25000 successes and 75000 failures, overlaid with a perfect normal curve.]

Summary: Population, Sample, and Sampling Distribution
Population
  Total set of all subjects of interest
  Can be described by (unknown) parameters
  Want to make inference about its parameters
Sample
  Data that we observe
  We describe it, using descriptive statistics
  For large n, the sample resembles the population
Sampling Distribution
  Probability distribution of a statistic (for example, sample mean, sample proportion)
  Used to determine the probability that a statistic falls within a certain distance of the population parameter
  For large n, the sampling distribution (of sample mean, sample proportion) looks more and more like a normal distribution

Summary: Central Limit Theorem
The most important theorem in statistics.
For random sampling, as the sample size n grows, the sampling distribution of the sample mean (and of the sample proportion p-hat) approaches a normal distribution.
Amazing: This is the case even if the population distribution is discrete or highly skewed.
The Central Limit Theorem can be proved mathematically (STA 524).
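The "even if highly skewed" part can be illustrated with a short simulation. This is only a sketch under assumptions not found in the slides: the population is taken to be exponential with mean 1 (a strongly right-skewed distribution), and the helper names `sample_means` and `skewness` are made up for the illustration. Skewness near 0 indicates a symmetric, bell-like shape.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Means of `reps` random samples of size n from an exponential
    population with mean 1 (assumed here as a skewed example)."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

def skewness(values):
    """Average cubed z-score; roughly 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

means_n1 = sample_means(n=1, reps=5_000)    # same shape as the population
means_n50 = sample_means(n=50, reps=5_000)  # means of 50 observations

print("skewness, n=1: ", round(skewness(means_n1), 2))
print("skewness, n=50:", round(skewness(means_n50), 2))
print("mean of sample means, n=50:", round(statistics.fmean(means_n50), 3))
```

With n=1 the sample means are just individual observations and stay heavily skewed; with n=50 the skewness drops sharply while the center stays at the population mean.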

Central Limit Theorem
Usually, the sampling distribution of X-bar is approximately normal for sample sizes of at least n=25 (rule of thumb).
In addition, we know that the parameters of the sampling distribution are mean=mu and standard error=sigma/sqrt(n).
For example: if the sample size is at least n=25, then with 95% probability, the sample mean falls between mu - 1.96*sigma/sqrt(n) and mu + 1.96*sigma/sqrt(n).

Calculating z-scores
1. z-score for an individual observation: z = (Y - mu)/sigma. You need to know Y, mu, and sigma to calculate z.
2. z-score for a sample mean: z = (Y-bar - mu)/(sigma/sqrt(n)). You need to know Y-bar, mu, sigma, and n to calculate z.
3. z-score for a sample proportion: z = (p-hat - p)/sqrt(p(1-p)/n). You need to know p-hat, p, and n to calculate z.
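The three kinds of z-score listed above differ only in which standard deviation goes in the denominator. A minimal sketch follows; the function names are assumptions for illustration, and the input numbers are taken from the examples and quiz later in these slides.

```python
import math

def z_individual(y, mu, sigma):
    """z-score of a single observation."""
    return (y - mu) / sigma

def z_sample_mean(y_bar, mu, sigma, n):
    """z-score of a sample mean; the denominator is the standard error."""
    return (y_bar - mu) / (sigma / math.sqrt(n))

def z_sample_proportion(p_hat, p, n):
    """z-score of a sample proportion."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

# Blood pressure of 140 when mu=114.8 and sigma=13.1 (Quiz I).
print(round(z_individual(140, 114.8, 13.1), 2))
# Mean weight 5.97 oz from 36 cans, mu=6.05, sigma=0.18 (Example 5).
print(round(z_sample_mean(5.97, 6.05, 0.18, 36), 2))
# Sample proportion 0.11 from n=50 when p=0.12 (Example 7).
print(round(z_sample_proportion(0.11, 0.12, 50), 2))
```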

Population Parameters and Sample Statistics

Population parameter                                          Value     Sample statistic used to estimate
p   proportion of population with a certain characteristic    Unknown   p-hat
µ   mean value of a population variable                       Unknown   x-bar

The value of a population parameter is a fixed number; it is NOT random, but its value is not known.
The value of a sample statistic is calculated from sample data.
The value of a sample statistic will vary from sample to sample (sampling distributions).

More Example: shape of the population distribution not known (shown graphically)


More Example 2
The probability distribution of 6-month incomes of account executives has mean $20,000 and standard deviation $5,000.
a) A single executive's income is $20,000. Can it be said that this executive's income exceeds 50% of all account executive incomes?
ANSWER: No. We cannot find P(X<$20,000) because no information is given about the shape of the distribution of X; we do not know the median of 6-month incomes.


More Example 2 (cont.)
b) n=64 account executives are randomly selected. What is the probability that the sample mean exceeds $20,500?
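The worked answer to part b) did not survive in this transcript. One way to approach it, sketched here under the assumption that n=64 is large enough for the Central Limit Theorem to apply (the variable names are illustrative), is to treat the sample mean as approximately normal with mean 20,000 and standard error 5,000/sqrt(64):

```python
import math
from statistics import NormalDist

mu, sigma, n = 20_000, 5_000, 64

# Standard error of the sample mean: sigma / sqrt(n) = 5000 / 8 = 625.
standard_error = sigma / math.sqrt(n)

# z-score of 20,500 on the sampling distribution of the mean.
z = (20_500 - mu) / standard_error

# P(X-bar > 20,500), using the normal approximation from the CLT.
prob_exceeds = 1 - NormalDist().cdf(z)

print("standard error:", standard_error)
print("z:", z)
print("P(X-bar > 20500) is about", round(prob_exceeds, 4))
```

Unlike part a), this calculation does not need the shape of the income distribution, because it concerns the mean of 64 incomes, not a single income.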


More Example 3 A sample of size n=16 is drawn from a normally distributed population with mean E(x)=20 and SD(x)=8.


More Example 3 (cont.) c. Do we need the Central Limit Theorem to solve part a or part b? NO. We are given that the population is normal, so the sampling distribution of the mean will also be normal for any sample size n. The CLT is not needed.


More Example 4 Battery life X~N(20, 10). Guarantee: avg. battery life in a case of 24 exceeds 16 hrs. Find the probability that a randomly selected case meets the guarantee.
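The solution to this example is not in the transcript, and the notation N(20, 10) does not say whether 10 is the standard deviation or the variance. The sketch below is therefore only illustrative: it computes the probability under both readings, and the function name `prob_case_meets_guarantee` is an assumption. Because battery life is stated to be normal, the mean of 24 batteries is exactly normal and the CLT is not needed.

```python
import math
from statistics import NormalDist

def prob_case_meets_guarantee(mu, sigma, n, threshold):
    """P(mean life of n batteries > threshold) when individual battery
    life is normal with mean mu and standard deviation sigma."""
    standard_error = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, standard_error).cdf(threshold)

# Reading 1 (assumption): the 10 in N(20, 10) is the standard deviation.
p_if_sd = prob_case_meets_guarantee(20, 10, 24, 16)

# Reading 2 (assumption): the 10 in N(20, 10) is the variance.
p_if_var = prob_case_meets_guarantee(20, math.sqrt(10), 24, 16)

print("if SD = 10:       ", round(p_if_sd, 4))
print("if variance = 10: ", round(p_if_var, 6))
```

Under either reading the guarantee is met with high probability, since 16 hours lies well below the mean of 20 relative to the standard error of a 24-battery average.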

More Example 5
Cans of salmon are supposed to have a net weight of 6 oz. The canner says that the net weight is a random variable with mean µ=6.05 oz. and standard deviation σ=.18 oz. Suppose you take a random sample of 36 cans and calculate the sample mean weight to be 5.97 oz. Find the probability that the mean weight of a sample of 36 cans is less than or equal to 5.97 oz.

Population X: amount of salmon in a can; E(X)=6.05 oz, SD(X)=.18 oz
X-bar sampling dist: E(X-bar)=6.05, SD(X-bar)=.18/6=.03
By the CLT, the X-bar sampling dist is approx. normal
P(X-bar ≤ 5.97) = P(z ≤ [5.97-6.05]/.03) = P(z ≤ -.08/.03) = P(z ≤ -2.67) = .0038
How could you use this answer?

Suppose you work for a consumer watchdog group. If you sampled the weights of 36 cans and obtained a sample mean x-bar ≤ 5.97 oz., what would you think?
Since P(x-bar ≤ 5.97) = .0038, either
  you observed a rare event (recall: 5.97 oz is 2.67 standard deviations below the mean) and the mean fill E(X) is in fact 6.05 oz. (the value claimed by the canner), or
  the true mean fill is less than 6.05 oz. (the canner is "lying").

More Example 6
X: weekly income. E(X)=600, SD(X)=100
n=25; X-bar sampling dist: E(X-bar)=600, SD(X-bar)=100/5=20
P(X-bar ≤ 550) = P(z ≤ [550-600]/20) = P(z ≤ -50/20) = P(z ≤ -2.50) = .0062
Suspicious of claim that average is $600; evidence is that average income is less.
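Examples 5 and 6 follow the same recipe: divide sigma by sqrt(n) to get the standard error, then read a lower-tail probability off the normal curve. As a sketch (the helper name `prob_mean_at_most` is an assumption), both table lookups can be reproduced in a few lines:

```python
import math
from statistics import NormalDist

def prob_mean_at_most(value, mu, sigma, n):
    """P(X-bar <= value), treating the sample mean as normal with
    mean mu and standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / math.sqrt(n)).cdf(value)

# Example 5: salmon cans, mu=6.05 oz, sigma=0.18 oz, n=36.
p_salmon = prob_mean_at_most(5.97, 6.05, 0.18, 36)

# Example 6: weekly income, mu=600, sigma=100, n=25.
p_income = prob_mean_at_most(550, 600, 100, 25)

print("Example 5:", round(p_salmon, 4))
print("Example 6:", round(p_income, 4))
```

Small differences in the last digit relative to a printed z-table can occur because the table rounds z to two decimals before the lookup.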


More Example 7
12% of students at UK are left-handed. What is the probability that in a sample of 50 students, the sample proportion that are left-handed is less than 11%?
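The worked solution is missing from the transcript. A minimal sketch using the normal approximation for a sample proportion is given below; it assumes the rules of thumb hold (n=50 is above 25 and p=0.12 lies between 0.05 and 0.95), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.12, 50

# Standard error of p-hat: sqrt(p(1-p)/n).
standard_error = math.sqrt(p * (1 - p) / n)

# z-score of a sample proportion of 0.11.
z = (0.11 - p) / standard_error

# P(p-hat < 0.11) under the normal approximation.
prob_below = NormalDist().cdf(z)

print("standard error:", round(standard_error, 4))
print("z:", round(z, 2))
print("P(p-hat < 0.11) is about", round(prob_below, 3))
```

Because 11% is only about a fifth of a standard error below 12%, the probability is not far from one half.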

Quiz I
For women aged 18-24, systolic blood pressures are normally distributed with mean 114.8 [mm Hg] and standard deviation 13.1 [mm Hg]. Hypertension is commonly defined as a value above 140.
If a woman between 18 and 24 is randomly selected, find the probability that her systolic blood pressure is above 140.
For a sample of 4 women, find the probability that their mean systolic blood pressure is above 140.
Note that for this problem, we don't actually need the Central Limit Theorem: because the variable blood pressure itself has a normal distribution, the sample mean is normal for any sample size.
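One possible way to set up the two calculations is sketched below; it is not an official answer key, and the variable names are assumptions. The only difference between the two parts is the standard deviation used: sigma for one woman, sigma/sqrt(4) for the mean of four.

```python
import math
from statistics import NormalDist

mu, sigma = 114.8, 13.1

# One randomly selected woman: use the population distribution directly.
p_single = 1 - NormalDist(mu, sigma).cdf(140)

# Mean of 4 women: normal with standard error sigma / sqrt(4) = 6.55.
p_mean_of_4 = 1 - NormalDist(mu, sigma / math.sqrt(4)).cdf(140)

print("P(X > 140) is about", round(p_single, 4))
print("P(X-bar > 140), n=4, is about", round(p_mean_of_4, 6))
```

The second probability is far smaller than the first, because an average of four readings varies much less than a single reading.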

Quiz II
Analysts think that the length of time people work at a job has a mean of 6.1 years and a standard deviation of 4.3 years.
Do you expect this distribution to be left-skewed or right-skewed or symmetric? Why?
Can you calculate the probability that a randomly chosen person spends less than 5 years on his/her job?
What is the probability that 100 people selected at random spend an average of less than 5 years on their job?
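For the last question, a sketch of one possible approach is shown here. It assumes that n=100 is large enough for the Central Limit Theorem to make the sample mean approximately normal, whatever the shape of the job-length distribution; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 6.1, 4.3, 100

# Standard error of the mean of 100 job lengths: 4.3 / 10 = 0.43.
standard_error = sigma / math.sqrt(n)

# P(X-bar < 5) using the CLT normal approximation.
prob_mean_below_5 = NormalDist(mu, standard_error).cdf(5)

print("standard error:", round(standard_error, 2))
print("P(X-bar < 5) is about", round(prob_mean_below_5, 4))
```

The same calculation is not available for a single person, because the CLT says nothing about the distribution of individual job lengths.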

Review: Multiple Choice Question
The Central Limit Theorem implies that
1. All variables have approximately bell-shaped sample distributions if a random sample contains at least 30 observations
2. Population distributions are normal whenever the population size is large
3. For large random samples, the sampling distribution of X-bar is approximately normal, regardless of the shape of the population distribution
4. The sampling distribution looks more like the population distribution as the sample size increases
5. All of the above