Central Limit Theorem

Central Limit Theorem: Lots of Samples

Homework
Read Sec 6-5. Discussion Question, pg 329. Do Ex 6-5: 8-15.

Objective
Use the Central Limit Theorem to solve problems involving sample means.

Sample Means
If we were trying to find the mean GPA for students at CHHS, we could randomly sample 50 students and find their mean GPA. This would be an estimate of the mean GPA for the population of CHHS. We can take several more samples of 50 students to try to improve our estimate. What happens when we look at the means of these multiple samples? The means of many samples become data values in a distribution of statistics (summaries).

Population
In a large population of size N, sampling with replacement gives $N^n$ different ordered samples of size n (without replacement there would be $\binom{N}{n}$ distinct samples). We can then look at a sampling distribution of the sample means. The means of these samples will usually differ from $\mu$, even though each sample mean ($\bar{X}$) is an estimate of $\mu$. The difference between $\bar{X}$ and $\mu$ is due to sampling error, or sampling variability.

Sample of Sample Means
Of all the possible samples, let us take several: not just a few, but certainly not all of the possible samples. We would now have a new distribution, a distribution of sample means, called a sampling distribution. This distribution of sample means has its own mean and standard deviation.

µ and σ
We would expect the mean of all possible sample means to equal the population mean: $\mu_{\bar{X}} = \mu$.
The standard deviation of the sample means would NOT be equal to the population standard deviation. The sample means tend to be much more alike (closer together) than the individual observations, because averaging smooths out extreme values. The reduced variation results in a smaller standard deviation: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, where n = size of samples.
Take it on faith; we are not going to develop the formula for $\sigma_{\bar{X}}$.
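For readers curious where the formula for the standard deviation of the sample means comes from, here is a brief sketch. It is not part of the original slides, and it assumes the n observations $X_1, \dots, X_n$ are drawn independently (as they are when sampling with replacement), each with variance $\sigma^2$:

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\,\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad\text{so}\qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
```

The second equality is where independence is used: the variance of a sum equals the sum of the variances only when the observations are uncorrelated.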

Example
Suppose we roll a die 6 times with results 1, 2, 3, 4, 5, 6. Unlikely, but let us accept that as the result for the moment. Using your calculator, find the mean and standard deviation of our population: $\mu = 3.5$ and $\sigma \approx 1.7078$. Record these values for future comparison.
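The same calculation can be checked without a calculator. The sketch below uses Python's standard `statistics` module; the variable names are illustrative, and `pstdev` is chosen on the assumption that the six results are treated as the entire population (so the divisor is N, not N - 1).

```python
from statistics import mean, pstdev

# The six die results, treated as the whole population.
population = [1, 2, 3, 4, 5, 6]

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation (divides by N)

print(f"mu = {mu}, sigma = {sigma:.4f}")
```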

Samples
Now we take all the samples of size 2 from our population.

Sample  Mean    Sample  Mean    Sample  Mean
1, 1    1       1, 2    1.5     1, 3    2
1, 4    2.5     1, 5    3       1, 6    3.5
2, 1    1.5     2, 2    2       2, 3    2.5
2, 4    3       2, 5    3.5     2, 6    4
3, 1    2       3, 2    2.5     3, 3    3
3, 4    3.5     3, 5    4       3, 6    4.5
4, 1    2.5     4, 2    3       4, 3    3.5
4, 4    4       4, 5    4.5     4, 6    5
5, 1    3       5, 2    3.5     5, 3    4
5, 4    4.5     5, 5    5       5, 6    5.5
6, 1    3.5     6, 2    4       6, 3    4.5
6, 4    5       6, 5    5.5     6, 6    6

Distribution of Sample Means
Condensing the table gives a frequency distribution of sample means.

Mean  f
1     1
1.5   2
2     3
2.5   4
3     5
3.5   6
4     5
4.5   4
5     3
5.5   2
6     1

This sampling distribution is a new distribution of the means from all of the samples of size 2. On your calculator, do a histogram of the distribution.
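The list of samples and the frequency distribution of their means can also be generated programmatically. This is a minimal sketch, assuming the samples are ordered pairs drawn with replacement (which is what the 36 samples listed for this example correspond to); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

population = [1, 2, 3, 4, 5, 6]

# Every ordered sample of size 2 drawn with replacement: 6 * 6 = 36 samples.
samples = list(product(population, repeat=2))

# Mean of each sample, then a count of how often each mean occurs.
sample_means = [(a + b) / 2 for a, b in samples]
frequency = Counter(sample_means)

print("Mean  f")
for value in sorted(frequency):
    print(f"{value:<5} {frequency[value]}")
```

Changing `repeat=2` to a larger sample size shows how quickly the number of possible samples grows.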

Distribution of Sample Means
The distribution of sample means is unimodal and symmetric, or approximately normally distributed.
[Figure: histogram of the sample means, horizontal axis from 1 to 6 in steps of 0.5, frequency axis from 0 to 6, peaking at a mean of 3.5.]

Mean and Standard Deviation
On your calculator, find the mean and standard deviation of the distribution of sample means: $\mu_{\bar{X}} = 3.5$ and $\sigma_{\bar{X}} \approx 1.2076$.
This is consistent with the values from our original population, $\mu = 3.5$ and $\sigma \approx 1.7078$:
$\sigma_{\bar{X}} = \sigma/\sqrt{n} = 1.7078/\sqrt{2} \approx 1.2076$
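As a check on the calculator work, the sketch below recomputes both sides of the relationship for the die example: the standard deviation of the 36 sample means, and the population standard deviation divided by the square root of the sample size. It uses only the standard library, and the variable names are illustrative.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]
n = 2  # sample size

# Means of all 36 ordered samples of size 2 (with replacement).
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu, sigma = mean(population), pstdev(population)
mu_xbar, sigma_xbar = mean(sample_means), pstdev(sample_means)

print(f"mean of sample means = {mu_xbar}, population mean = {mu}")
print(f"sd of sample means   = {sigma_xbar:.4f}")
print(f"sigma / sqrt(n)      = {sigma / sqrt(n):.4f}")
```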

Central Limit Theorem
As the size n of random samples drawn (with replacement) from a population with mean $\mu$ and standard deviation $\sigma$ increases, the shape of the distribution of sample means approaches a normal distribution with:
$\mu_{\bar{X}} = \mu$
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$, where n = size of samples
Since the distribution of sample means is approximately normal, the central limit theorem allows us to ask questions about sample means, just like the questions asked about raw data, including using z-scores.

Limitations
If the original data from which samples were taken are normally distributed, then the distribution of sample means will be normally distributed for any sample size.
If the original data from which samples were taken are not normally distributed, then the samples must be large enough (n ≥ 30) for the distribution of means to be approximately normal.
This is a remarkable and incredibly useful fact. Even if the original population is not normally distributed, the sampling distribution will be sufficiently normal if the sample size is large enough. The more the original observations deviate from unimodal and symmetric, the larger the samples need to be. As a rule of thumb, n > 30 should suffice for all but the most extreme cases.
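A small simulation can make this concrete. The sketch below is an illustration, not part of the original slides: it assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), draws 5000 samples at each of two sample sizes, and compares the spread and skewness of the resulting sample means. The helper names `skewness` and `simulate_sample_means` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score: 0 for a perfectly symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, repetitions, rng):
    """Means of `repetitions` random samples of size n from the skewed population."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(repetitions)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {n: simulate_sample_means(n, 5000, rng) for n in (2, 30)}

for n, means in results.items():
    print(f"n = {n:2d}: sd of means = {pstdev(means):.3f} "
          f"(sigma/sqrt(n) = {1 / sqrt(n):.3f}), skewness = {skewness(means):.2f}")
```

With n = 30 the skewness of the sample means should be much closer to 0 than with n = 2, and the standard deviation of the means should sit close to σ/√n.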

Z-scores
We can now use z-scores to draw inferences about sample means. For example: if 20 people are chosen at random from a population with mean 100 and standard deviation 15, what is the probability that the mean of the sample will be greater than 110?
Note that this is a different question from: if one person is chosen at random from a population with mean 100 and standard deviation 15, what is the probability that this individual will score greater than 110?
Which question has the greater probability?

Sample
We calculate z-scores as always, only keeping in mind the new standard deviation, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$:
$z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{110 - 100}{15/\sqrt{20}} \approx 2.9814$
P(z > 2.98) = Normalcdf(2.98, 9, 0, 1) = .0014
P($\bar{X}$ > 110) = Normalcdf(110, 10^99, 100, 15/√20) = .0014
[Figure: normal curve of sample means centered at 100, with tick marks at 90, 93.3, 96.7, 100, 103.4, 106.7, 110.1 (-3σ to 3σ); the area to the right of 110 is .0014.]
Note what the change in σ has done to the curve.
The probability that a sample of 20 people will have a mean score greater than 110 is .0014. Not very likely.
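The calculator steps could be reproduced in Python roughly as follows. This is a sketch rather than a definitive implementation: `normalcdf` is a hypothetical helper written here to mimic the calculator's Normalcdf(lower, upper, mean, sd), built on the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper (hypothetical helper)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

mu, sigma, n = 100, 15, 20
standard_error = sigma / sqrt(n)     # standard deviation of the sample means

z = (110 - mu) / standard_error
p_sample_mean = normalcdf(110, 1e99, mu, standard_error)

print(f"z = {z:.4f}")
print(f"P(sample mean > 110) = {p_sample_mean:.4f}")
```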

Individual
Now we calculate the probability of an individual having a score greater than 110:
$z = \dfrac{X - \mu}{\sigma} = \dfrac{110 - 100}{15} \approx .6667$
P(z > .6667) = Normalcdf(.6667, 9, 0, 1) = .2525
P(x > 110) = Normalcdf(110, 10^99, 100, 15) = .2525
[Figure: normal curve of individual scores centered at 100, with tick marks at 55, 70, 85, 100, 115, 130, 145 (-3σ to 3σ); the area to the right of 110 is .2525.]
The probability that an individual will have a score greater than 110 is .2525. Much more likely.
When calculating probabilities, first determine whether you are calculating for an individual observation or for a sample summary statistic.
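To see how the answer depends on whether the question is about an individual or a sample mean, the sketch below evaluates the same probability for several sample sizes, with n = 1 standing for a single individual. It assumes the population itself is normally distributed (so the sample means are normal even for small n); the function name `prob_mean_above` and the intermediate size n = 5 are illustrative choices, not from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 15  # population mean and standard deviation

def prob_mean_above(threshold, n):
    """P(mean of n observations > threshold); n = 1 is a single individual."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

for n in (1, 5, 20):
    print(f"n = {n:2d}: P(mean > 110) = {prob_mean_above(110, n):.4f}")
```

The probability shrinks as n grows because the distribution of sample means becomes narrower around 100.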

Example
Assume the mean systolic blood pressure of an adult is 120 mm Hg with a standard deviation of 5.6 mm Hg. If blood pressure is normally distributed, find the probability that an individual selected at random would have a systolic pressure between 120 and 121.8 mm Hg.
$z = \dfrac{X - \mu}{\sigma} = \dfrac{121.8 - 120}{5.6} \approx .3214$
P(120 ≤ x ≤ 121.8) = Normalcdf(120, 121.8, 120, 5.6) = .1261
P(0 ≤ z ≤ .3214) = Normalcdf(0, .3214, 0, 1) = .1261
[Figure: normal curve of individual pressures centered at 120, with tick marks at 103.2, 108.8, 114.4, 120, 125.6, 131.2, 136.8 (-3σ to 3σ); the area between 120 and 121.8 is .1261.]
The probability that an individual would have a systolic blood pressure between 120 and 121.8 mm Hg is .1261.

Example
Assume the mean systolic blood pressure of an adult is 120 mm Hg with a standard deviation of 5.6 mm Hg. If blood pressure is normally distributed, find the probability that a sample of 30 people would have a mean systolic pressure between 120 and 121.8 mm Hg.
$\sigma_{\bar{X}} = \dfrac{5.6}{\sqrt{30}} \approx 1.0224$
$z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{121.8 - 120}{5.6/\sqrt{30}} \approx 1.7605$
P(120 ≤ $\bar{X}$ ≤ 121.8) = Normalcdf(120, 121.8, 120, 5.6/√30) = .4608
P(0 ≤ z ≤ 1.7605) = Normalcdf(0, 1.7605, 0, 1) = .4608
[Figure: normal curve of sample means centered at 120, with tick marks at 117, 118, 119, 120, 121, 122, 123 (-3σ to 3σ); the area between 120 and 121.8 is .4608.]
The probability that a sample of 30 people would have a mean systolic pressure between 120 and 121.8 mm Hg is .4608.
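The two blood pressure calculations differ only in the standard deviation used, which the following sketch makes explicit. It relies on the standard library's `statistics.NormalDist`; the helper `prob_between` and the variable names are illustrative assumptions, not part of the original example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 120, 5.6      # population mean and standard deviation (mm Hg)
low, high = 120, 121.8    # interval of interest (mm Hg)

def prob_between(dist, a, b):
    """Probability that a value from `dist` falls between a and b."""
    return dist.cdf(b) - dist.cdf(a)

individual = NormalDist(mu, sigma)                 # one person
mean_of_30 = NormalDist(mu, sigma / sqrt(30))      # mean of a sample of 30

p_individual = prob_between(individual, low, high)
p_sample = prob_between(mean_of_30, low, high)

print(f"individual:      {p_individual:.4f}")
print(f"sample of 30:    {p_sample:.4f}")
```

The interval is the same in both cases, but it spans about 0.32 standard deviations for an individual and about 1.76 standard deviations for the mean of 30 people, which is why the second probability is so much larger.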