MLLunsford 1. Activity: Central Limit Theorem Theory and Computations


Concepts: The Central Limit Theorem; computations using the Central Limit Theorem.

Prerequisites: The student should be familiar with the ideas of the Central Limit Theorem; expected value; statistics such as the sample mean and sample variance; and using the normal distribution to find probabilities.

Recap: In our last activity (Sampling Distributions and Introduction to the Central Limit Theorem) we used simulation to examine the sampling distribution of the sample mean statistic $\bar{X}$. First we saw that the sample mean $\bar{X}$ is a random variable. Our investigation of the empirical probability distribution (aka sampling distribution) of $\bar{X}$, obtained by taking many samples of the same size, $n$, from the same population, resulted in the following observations about the sampling distribution of $\bar{X}$.

Population Parameters: mean = $\mu$, standard deviation = $\sigma$
Sample Statistics: mean = $\bar{x}$, standard deviation = $s$

Observations about the sampling distribution of $\bar{X}$:
- shape: Bell shaped (i.e. normal shaped) distribution for large enough sample sizes, $n$.
- center: Distribution of $\bar{X}$ centered at the population mean $\mu$.
- spread: Spread of $\bar{X}$ depends on the sample size, $n$. Spread decreases as $n$ increases (in fact the spread is $\sigma/\sqrt{n}$).

Conclusion: The distribution of the sample mean, $\bar{X}$, will be centered at the population mean and shaped like a normal distribution if $n$ is large or the population is normal to begin with.

Our simulation results for the sampling distribution of the sample mean statistic illustrated the Central Limit Theorem. This theorem says the following about the sampling distribution of the sample mean $\bar{X}$:
- The mean of the sampling distribution of $\bar{X}$ equals the population mean $\mu$, regardless of the sample size or the population distribution.
- The standard deviation of the sampling distribution of $\bar{X}$ equals the population standard deviation $\sigma$ divided by the square root of the sample size, regardless of the population distribution.
- The shape of the sampling distribution of $\bar{X}$ is approximately normal for large sample sizes, regardless of the population distribution, and it is normal for any sample size when the population distribution is normal.

In this activity sheet, we are going to see why parts of the Central Limit Theorem are true and learn how to use this theorem in computations. First let's recall what we mean by a random sample from a distribution (population): If $X_i$, $i = 1, \dots, n$ are $n$ independent observations from the same distribution (population), then $X_i$, $i = 1, \dots, n$ is a random sample of size $n$ from that common distribution (population).
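As a rough illustration of the three statements of the Central Limit Theorem above, the sketch below repeats the kind of simulation done in the previous activity. The exponential population (mean 1, standard deviation 1), the sample sizes, the 5000 repetitions, and the helper names are all assumptions chosen for illustration; none of them come from the activity itself.

```python
import random
import statistics


def simulate_sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from a random sample of size n."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]


def draw_exponential(rng):
    # Hypothetical skewed population: exponential with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


results = {}
for n in (5, 25, 100):
    means = simulate_sample_means(draw_exponential, n, reps=5000)
    # Store (center, spread) of the simulated sampling distribution of the sample mean.
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n={n}: center={results[n][0]:.3f}, "
          f"spread={results[n][1]:.3f}, sigma/sqrt(n)={1 / n ** 0.5:.3f}")
```

For each sample size the simulated center should sit near the population mean 1, and the simulated spread should sit near $\sigma/\sqrt{n}$, shrinking as $n$ grows. A histogram of `means` would show the shape becoming more bell shaped as $n$ increases.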

Examples of Random Samples:

Coin Flipping: Suppose we flip a fair coin 10 times and let the random variable $X$ denote the number of heads. Then $X$ is b(10, 0.5) with E($X$) = ____ and Var($X$) = ____. Now suppose we run this experiment 20 times and observe the value of $X$ on each run of the experiment, letting $X_i$ denote the number of heads on the ith run of the experiment. Then $X_i$, $i = 1, \dots, 20$ is a random sample of size 20 from the b(10, 0.5) distribution. Note that the sample mean of this random sample is given by $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where $n = 20$.

Polling: Suppose we randomly select 1000 Americans and ask them if they approve of the job the President is doing. Let $X_i = 1$ if the ith American selected approves, zero otherwise. Then $X_i$, $i = 1, \dots, 1000$ is a random sample of size 1000 from the Bernoulli distribution where the parameter $p$ is the proportion of all Americans that approve. What is the expected value of this Bernoulli distribution? What is the standard deviation of this Bernoulli distribution? How is the sample mean of this random sample defined? What does the sample mean of the sample represent? If we let the random variable $Y$ be the number of successes (i.e. the number who approve) out of the 1000 sampled, then how is $Y$ distributed? (Hint: Think Bernoulli trial!)

Penny Ages: In part (c) of the Penny Ages scenario of the Sampling Distributions and Introduction to the Central Limit Theorem activity, we repeatedly (i.e. 500 times) got random samples of size n=5 from a population with mean ____ and standard deviation ____. For each of these random samples we computed the sample mean.

Professor Lectures Overtime: In part (h) of the Professor Lectures Overtime scenario of the Sampling Distributions and Introduction to the Central Limit Theorem activity, we repeatedly got random samples of size ____ from a population with a ____ distribution with distribution mean ____ and distribution standard deviation ____. Again, for each of these random samples we computed the sample mean statistic.
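The coin flipping and polling examples above can be mimicked with a few lines of Python. This is only a sketch: the number of runs (20), the approval proportion `p=0.6`, the seed, and the function names are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1)


def heads_in_ten_flips(rng):
    """One observation from b(10, 0.5): the number of heads in 10 fair flips."""
    return sum(rng.random() < 0.5 for _ in range(10))


# Coin flipping: a random sample of (assumed) size 20 from b(10, 0.5).
runs = 20
coin_sample = [heads_in_ten_flips(rng) for _ in range(runs)]
coin_sample_mean = statistics.fmean(coin_sample)


def poll(rng, p, size=1000):
    """A random sample of Bernoulli(p) responses: 1 = approves, 0 = does not."""
    return [1 if rng.random() < p else 0 for _ in range(size)]


# Polling: p = 0.6 is a hypothetical approval proportion.
responses = poll(rng, p=0.6)
poll_sample_mean = statistics.fmean(responses)

print("coin sample:", coin_sample)
print("coin sample mean:", coin_sample_mean)
print("poll sample mean:", poll_sample_mean)
```

Because each polling response is 0 or 1, the sample mean of `responses` is just the number of approvals divided by 1000.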
Theory: Now, to see why the first two bullets of the Central Limit Theorem are true, let's recall some results for expected value. Let $X$ be a random variable and let $a$ and $b$ be constants. Then we have the following rules (proven on the bottom of page 15 of your text by using Theorem 3.2-1 on page 11 of your text). Note: Make sure you can reproduce these rules if you are given Theorem 3.2-1.

Rules for Expected Value: $E(aX + b) = aE(X) + b$
Rules for Variance: $V(aX + b) = a^2 V(X)$

A generalization of these facts for more than one random variable can be found on page 94 of your text in Theorem 6.2-3:

Theorem 6.2-3: With more than one random variable,
$$E(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \cdots + a_n E(X_n) = \sum_{i=1}^{n} a_i E(X_i)$$
(NOTE: You do not need the random variables $X_i$ to be independent for this result to hold.) If the random variables are independent, then
$$V(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1^2 V(X_1) + a_2^2 V(X_2) + \cdots + a_n^2 V(X_n) = \sum_{i=1}^{n} a_i^2 V(X_i)$$
(Since we will be working with random samples, each $X_i$ can be considered an independent observation from the same distribution, so the $X_i$ can be treated as independent.)

(a) Use the facts above to show that $E(\bar{X}) = \mu$ when the $X_i$ are a random sample from a distribution with mean $\mu$. (Hint: Use the definition of $\bar{X}$ and Theorem 6.2-3. This is shown after Example 6.2-4 on page 95 of your text. Try to show it before looking at the answer!)

(b) Use the facts above to find an expression for $\mathrm{Var}(\bar{X})$ in terms of the population standard deviation $\sigma$. (Hint: This is also shown after Example 6.2-4 on page 95 of your text. Try to prove it before looking at the answer!) Does this expression support your observation that the standard deviation of the sample mean decreases as the sample size $n$ increases?

(c) How did the above derivations depend on the population size? On the shape of the population?

Applying the Central Limit Theorem: Let's examine the third bullet in our statement of the Central Limit Theorem above. First note that if the distribution from which you are sampling (i.e. the population distribution) is normal, say with mean $\mu$ and standard deviation $\sigma$, i.e. the population is $N(\mu, \sigma^2)$, then no matter how small the sample size $n$, the distribution of the sample mean, $\bar{X}$, is given by $N(\mu, \sigma^2/n)$. This is Theorem 6.3-1 on page 99 of your text. Note this says that $E(\bar{X}) = \mu$ and the standard deviation of $\bar{X}$ is $\sigma/\sqrt{n}$. Use this result to find the distribution of $\bar{X}$ for the Professor Lectures Overtime example above:

You should get that $\bar{X}$ has the distribution $N(5, (1.804)^2/25)$ (i.e. it is normal with mean 5 and standard deviation $1.804/\sqrt{25}$).

The Consequence of All This: You can standardize $\bar{X}$ and use the normal distribution tables in the back of your textbook to calculate probabilities for the sample mean!

A Worked Example:

(a) For the Professor Lectures Overtime example above, find the probability that the amount of time the professor will lecture overtime is less than 5.5 minutes. Carefully define your random variables.

Answer: Let $X$ be the amount of time the professor lectures after class should have ended. We are given that $X$ is normally distributed: $X \sim N(5, (1.804)^2)$. Thus
$$P(X < 5.5) = P\left(\frac{X - 5}{1.804} < \frac{5.5 - 5}{1.804}\right) = P(Z < 0.277) = 0.609$$
(where $Z$ is standard normal).

(b) Now suppose you observe the professor for 25 days and record her overtime amount on each day. Note: We are assuming the amount of time the professor lectures overtime is independent from day to day. What is the probability that the average of these times is less than 5.5 minutes? Carefully define your random variables.

Answer: We have taken a random sample of size 25 from the $N(5, (1.804)^2)$ distribution and have computed the sample mean of that sample, say $\bar{x}$. From the Central Limit Theorem, since we are sampling from a normal distribution, we know that $\bar{X}$ is $N(5, (1.804)^2/25)$. Thus
$$P(\bar{X} < 5.5) = P\left(\frac{\bar{X} - 5}{1.804/\sqrt{25}} < \frac{5.5 - 5}{1.804/\sqrt{25}}\right) = P(Z < 1.386) = 0.9171$$
where $Z$ is standard normal (computation done via Minitab). Note this is the probability that the average amount of time the professor lectures overtime in 25 independent lectures is less than 5.5 minutes.

(c) Compare the answers to (a) and (b). Which is larger? Why does this make sense? Now suppose you randomly observe the professor for 40 days and record her overtime amount on each day. How will the probability that the average of these times is less than 5.5 minutes compare to the probabilities found in (a) and (b)? Explain why.
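The worked example can also be checked in software other than Minitab. The sketch below uses Python's standard-library `statistics.NormalDist`. It assumes a population mean of 5, a population standard deviation of 1.804, and a sample size of 25 in part (b), which is the sample size consistent with the z-score 1.386 quoted above. The function name `prob_mean_below` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Population values from the Professor Lectures Overtime example.
mu, sigma = 5.0, 1.804


def prob_mean_below(threshold, n):
    """P(sample mean of n observations < threshold) for a normal population.

    With n = 1 this is just the probability for a single observation.
    """
    return NormalDist(mu, sigma / sqrt(n)).cdf(threshold)


p_single = prob_mean_below(5.5, 1)    # part (a): one lecture
p_mean_25 = prob_mean_below(5.5, 25)  # part (b): assumed sample size of 25
p_mean_40 = prob_mean_below(5.5, 40)  # part (c): 40 lectures

print(f"(a) {p_single:.4f}  (b) {p_mean_25:.4f}  (c) {p_mean_40:.4f}")
```

The only thing that changes between the three calls is the standard deviation $\sigma/\sqrt{n}$ of the sampling distribution, which is worth keeping in mind when explaining the comparison asked for in part (c).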

(d) Now consider the case where the random sample comes from a population with a distribution that is not normal but has finite mean $\mu$ and standard deviation $\sigma$. By the first two bullets of the Central Limit Theorem above, we know that the mean of the sampling distribution of $\bar{X}$ equals the population mean $\mu$ and the standard deviation of the sampling distribution of $\bar{X}$ equals the population standard deviation $\sigma$ divided by the square root of the sample size. The third bullet of the Central Limit Theorem above says that as the sample size increases, i.e. as $n \to \infty$, the distribution of $\bar{X}$ approaches a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. This is the same thing as saying that the random variable
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
becomes standard normal as $n \to \infty$. This is essentially the statement of the Central Limit Theorem on page 308 of your text (Theorem 6.4-1). How large does $n$ need to be before we can use the normal distribution to approximate the distribution of $\bar{X}$? (See the paragraph in the center of page 309 of your text for an answer to this question.)

(e) Use the Central Limit Theorem to determine an approximate distribution of the sample mean $\bar{X}$ for the Polling example above.

Answer: $\bar{X}$ is approximately normal with mean $p$ and standard deviation $\sqrt{p(1-p)/1000}$.

(f) Recall that the sample mean in the Polling example above represented the proportion of people in the sample that approved of the President's performance. Let's call that proportion $\hat{p}$, i.e.
$$\hat{p} = \bar{X} = \frac{1}{1000}\sum_{i=1}^{1000} X_i .$$
Then find the approximate probability $P(0.58 < \hat{p} < 0.62)$ if $p = 0.60$. (Hint: convert to a z-score and use the normal distribution. Answers: 0.803 via Minitab, 0.8030 via tables.)

Note: Examples 6.4-1 through 6.4-3 on pages 308-9 of your text are also examples of using the CLT for computations.
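The polling computation in (f) can be sketched with the same standard-library tool. The sketch takes the interval to be 0.58 to 0.62 and the true proportion to be p = 0.60, which are the values that reproduce the quoted answer of about 0.803. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval_probability(p, n, low, high):
    """Normal (CLT) approximation to P(low < sample proportion < high)."""
    standard_error = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
    sampling_dist = NormalDist(p, standard_error)
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


approx = proportion_interval_probability(p=0.60, n=1000, low=0.58, high=0.62)
print(f"P(0.58 < p-hat < 0.62) is approximately {approx:.4f}")
```

The small difference between the software and table answers comes from rounding the z-score to two decimal places before using the table.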

Scenario: Selling Aircraft Communication Units

Suppose a communications company sells aircraft communication units to civilian markets. Each month's sales depend on market conditions that cannot be predicted exactly, but the company executives predict their sales through the following probability estimates:

x       5     40    65
p(x)    .4    .5    .1

where x = number of units sold.

(a) What is the expected number of units sold in one month, $\mu = E(X)$?

(b) Determine the variance, $\sigma^2$, of the number of units sold per month.

(c) Suppose we wanted to examine the average number of units sold per month, say $\bar{X}$, for 3 years (n=36 months). Based on the Central Limit Theorem (and assuming the number of units sold from month to month is independent), what can you say about the sampling distribution of $\bar{X}$? Also draw a sketch of this sampling distribution and be sure to indicate a label and numerical scale on the horizontal axis.

(d) Use the above to approximate the probability that the average number of units sold per month in 36 months is 40 or higher. You can first use the above mean and standard deviation to standardize 40 and use the tables in the back of your book. Or use Minitab and choose Calc > Probability Distributions > Normal. Use Cumulative probability and specify the appropriate mean and standard deviation for the sampling distribution, entering 40 as the input constant. Be sure to use proper notation to express this probability as well ($P(\bar{X} \geq 40)$) and shade the corresponding area in the above graph of the distribution of $\bar{X}$.

(f) Would this probability increase or decrease (or stay the same) if the number of months were to increase? Explain.

(g) Use the CLT to approximate the probability that the mean number of units sold in 36 months is between 35 and 40.
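The steps in this scenario (mean and variance of a discrete distribution, then a normal approximation for the sample mean) can be sketched in general form. To leave the exercise itself to the reader, the example below uses a hypothetical population, a fair six-sided die averaged over 36 rolls, rather than the sales table. All function and variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def pmf_mean_and_variance(values, probs):
    """Mean and variance of a discrete random variable given its pmf."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance


def clt_sample_mean_distribution(values, probs, n):
    """CLT approximation to the distribution of the mean of n independent draws."""
    mean, variance = pmf_mean_and_variance(values, probs)
    return NormalDist(mean, sqrt(variance / n))


# Hypothetical population: one roll of a fair die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

die_mean, die_variance = pmf_mean_and_variance(faces, probs)
xbar_dist = clt_sample_mean_distribution(faces, probs, n=36)

p_at_least_4 = 1 - xbar_dist.cdf(4.0)                # P(sample mean >= 4)
p_between = xbar_dist.cdf(4.0) - xbar_dist.cdf(3.0)  # P(3 < sample mean < 4)

print(f"mean={die_mean:.4f}, variance={die_variance:.4f}")
print(f"P(mean of 36 rolls >= 4) is approximately {p_at_least_4:.4f}")
print(f"P(3 < mean of 36 rolls < 4) is approximately {p_between:.4f}")
```

Passing the sales values and probabilities in place of `faces` and `probs`, with n=36, follows the same steps as parts (a) through (d) and (g).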