Sampling Distributions


Al Nosedal. University of Toronto. Fall 2017 October 26, 2017


Sampling Distribution
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Toy Problem
We have a population with a total of six individuals: A, B, C, D, E and F. All of them voted for one of two candidates: Bert or Ernie. A and B voted for Bert and the remaining four people voted for Ernie. The proportion of voters who support Bert is $p = 2/6 \approx 33.33\%$. This is an example of a population parameter.

Toy Problem
We are going to estimate the population proportion of people who voted for Bert, $p$, using information from an exit poll of size two. The ultimate goal is to see whether this procedure could be used to predict the outcome of the election.

List of all possible samples
{A,B} {A,C} {A,D} {A,E} {A,F}
{B,C} {B,D} {B,E} {B,F} {C,D}
{C,E} {C,F} {D,E} {D,F} {E,F}
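The enumeration above can be reproduced in R. This is a minimal sketch, assuming the six voters are stored in a character vector (the name `voters` is ours, not from the slides) and using base R's `combn()`:

```r
# Population labels from the toy problem.
voters <- c("A", "B", "C", "D", "E", "F")

# combn() returns a matrix with one column per possible sample of size 2.
samples <- combn(voters, 2)

ncol(samples)                              # number of possible samples
apply(samples, 2, paste, collapse = ",")   # each sample as a string, e.g. "A,B"
```

The number of columns equals `choose(6, 2)`, the number of ways to pick two voters out of six.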

Sample proportion
The proportion of people who voted for Bert in each of the possible random samples of size two is an example of a statistic. In this case, it is a sample proportion because it is the proportion of Bert's supporters within a sample; we use the symbol $\hat{p}$ (read "p-hat") to distinguish this sample proportion from the population proportion, $p$.

List of possible estimates
Each voter in a sample is coded 1 (voted for Bert) or 0 (voted for Ernie).

Estimate          Sample   Coded votes   Value
$\hat{p}_1$       {A,B}    {1,1}         100%
$\hat{p}_2$       {A,C}    {1,0}         50%
$\hat{p}_3$       {A,D}    {1,0}         50%
$\hat{p}_4$       {A,E}    {1,0}         50%
$\hat{p}_5$       {A,F}    {1,0}         50%
$\hat{p}_6$       {B,C}    {1,0}         50%
$\hat{p}_7$       {B,D}    {1,0}         50%
$\hat{p}_8$       {B,E}    {1,0}         50%
$\hat{p}_9$       {B,F}    {1,0}         50%
$\hat{p}_{10}$    {C,D}    {0,0}         0%
$\hat{p}_{11}$    {C,E}    {0,0}         0%
$\hat{p}_{12}$    {C,F}    {0,0}         0%
$\hat{p}_{13}$    {D,E}    {0,0}         0%
$\hat{p}_{14}$    {D,F}    {0,0}         0%
$\hat{p}_{15}$    {E,F}    {0,0}         0%

mean of sample proportions = 0.3333 = 33.33%.
standard deviation of sample proportions = 0.3086 = 30.86%.

Frequency table
$\hat{p}$   Frequency   Relative Frequency
0           6           6/15
1/2         8           8/15
1           1           1/15
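The estimates and the frequency table can be checked with a few lines of R. This is a sketch under our own coding assumption (1 for a Bert voter, 0 for an Ernie voter); the names `vote` and `p_hat` are illustrative and do not come from the slides.

```r
voters <- c("A", "B", "C", "D", "E", "F")
# 1 = voted for Bert, 0 = voted for Ernie (A and B voted for Bert).
vote <- c(A = 1, B = 1, C = 0, D = 0, E = 0, F = 0)

# p-hat for every possible sample of size 2.
p_hat <- combn(voters, 2, FUN = function(s) mean(vote[s]))

mean(p_hat)                    # mean of the 15 sample proportions
sd(p_hat)                      # standard deviation (sd() uses the n - 1 divisor)
table(p_hat) / length(p_hat)   # relative frequencies of 0, 1/2 and 1
```

The mean of the sample proportions equals the population proportion $p = 1/3$, which is the point of the toy problem.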

[Figure: "Sampling Distribution when n=2". Bar chart of relative frequency against $\hat{p}$, with bars of height 6/15, 8/15 and 1/15 at $\hat{p}$ = 0, 1/2 and 1.]

Predicting outcome of the election
Proportion of times we would declare that Bert lost the election using this procedure (that is, the proportion of samples with $\hat{p} < 0.50$) = 6/15 = 40%.

Problem (revisited)
Next, we are going to explore what happens if we increase our sample size. Now, instead of taking samples of size 2, we are going to draw samples of size 3.

List of all possible samples
{A,B,C} {A,B,D} {A,B,E} {A,B,F} {A,C,D}
{A,C,E} {A,C,F} {A,D,E} {A,D,F} {A,E,F}
{B,C,D} {B,C,E} {B,C,F} {B,D,E} {B,D,F}
{B,E,F} {C,D,E} {C,D,F} {C,E,F} {D,E,F}

List of all possible estimates
$\hat{p}_1 = \hat{p}_2 = \hat{p}_3 = \hat{p}_4 = 2/3$ (samples containing both A and B)
$\hat{p}_5 = \dots = \hat{p}_{16} = 1/3$ (samples containing exactly one of A and B)
$\hat{p}_{17} = \dots = \hat{p}_{20} = 0$ (samples containing neither A nor B)
mean of sample proportions = 0.3333 = 33.33%.
standard deviation of sample proportions = 0.2163 = 21.63%.

Frequency table
$\hat{p}$   Frequency   Relative Frequency
0           4           4/20
1/3         12          12/20
2/3         4           4/20

[Figure: "Sampling Distribution when n=3". Bar chart of relative frequency against $\hat{p}$, with bars of height 4/20, 12/20 and 4/20 at $\hat{p}$ = 0, 1/3 and 2/3.]

Predicting outcome of the election
Proportion of times we would declare that Bert lost the election using this procedure = 16/20 = 80%.
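The two toy cases can be compared side by side. The sketch below wraps the enumeration in a helper we call `sampling_dist()` (a hypothetical name, not from the slides) and treats a sample as "declaring that Bert lost" when $\hat{p} < 0.5$, which is the rule the slides use later for the Normal approximation.

```r
# Hypothetical helper: p-hat for all samples of size n from the six-voter
# toy population (1 = voted for Bert, 0 = voted for Ernie).
sampling_dist <- function(n) {
  vote <- c(A = 1, B = 1, C = 0, D = 0, E = 0, F = 0)
  combn(names(vote), n, FUN = function(s) mean(vote[s]))
}

# Proportion of samples that would lead us to declare that Bert lost.
sapply(c(2, 3), function(n) mean(sampling_dist(n) < 0.5))

# Spread of the sample proportions shrinks as n grows.
sapply(c(2, 3), function(n) sd(sampling_dist(n)))
```

Going from n = 2 to n = 3 raises the share of correct calls from 6/15 to 16/20, in line with the two frequency tables.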

More realistic example
Assume we have a population with a total of 1200 individuals. All of them voted for one of two candidates: Bert or Ernie. Four hundred of them voted for Bert and the remaining 800 people voted for Ernie. Thus, the proportion of votes for Bert, which we will denote by $p$, is $p = 400/1200 \approx 33.33\%$. We are interested in estimating the proportion of people who voted for Bert, that is $p$, using information from an exit poll. Our ultimate goal is to see whether we could use this procedure to predict the outcome of this election.

[Figures: sampling distribution of $\hat{p}$ when n = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 and 120. Each panel, titled "Sampling Distribution when n=...", plots relative frequency against $\hat{p}$ on the interval from 0 to 1, with $p$ = 0.3333 marked.]
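With 1200 voters it is no longer practical to list every possible sample, so plots like these are usually built by simulation. The slides do not show how the figures were produced; the following is only a sketch, assuming repeated exit polls drawn without replacement, with a helper name (`simulate_p_hat`), seed and number of repetitions chosen by us.

```r
set.seed(258)  # arbitrary seed, only for reproducibility

# Population of 1200 voters: 400 for Bert (1) and 800 for Ernie (0).
population <- rep(c(1, 0), times = c(400, 800))

# Hypothetical helper: simulate `reps` exit polls of size n (without replacement)
# and return the sample proportion from each poll.
simulate_p_hat <- function(n, reps = 10000) {
  replicate(reps, mean(sample(population, n)))
}

p_hat_120 <- simulate_p_hat(120)
mean(p_hat_120)   # should be close to p = 1/3
sd(p_hat_120)     # spread of the simulated sample proportions

hist(p_hat_120, freq = FALSE, xlim = c(0, 1),
     main = "Sampling Distribution when n=120", xlab = "p^")
```

Calling `simulate_p_hat()` with n = 10, 20, ..., 120 gives one histogram per sample size.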

Observation
The larger the sample size, the more closely the distribution of sample proportions approximates a Normal distribution. The question is: which Normal distribution?

Sampling Distribution of a sample proportion
Draw an SRS of size $n$ from a large population that contains proportion $p$ of successes. Let $\hat{p}$ be the sample proportion of successes,
$\hat{p} = \dfrac{\text{number of successes in the sample}}{n}$.
Then:
The mean of the sampling distribution of $\hat{p}$ is $p$.
The standard deviation of the sampling distribution is $\sqrt{p(1-p)/n}$.
As the sample size increases, the sampling distribution of $\hat{p}$ becomes approximately Normal. That is, for large $n$, $\hat{p}$ has approximately the $N\left(p, \sqrt{p(1-p)/n}\right)$ distribution.

Approximating Sampling Distribution of $\hat{p}$
If the proportion of all voters that supports Bert is $p = 1/3 \approx 33.33\%$ and we are taking a random sample of size 120, the Normal distribution that approximates the sampling distribution of $\hat{p}$ is
$N\left(p, \sqrt{p(1-p)/n}\right)$, that is, $N(\mu = 0.3333, \sigma = 0.0430)$.

[Figure: "Sampling Distribution of $\hat{p}$ vs Normal Approximation". Relative frequency against $\hat{p}$, with the approximating Normal curve overlaid.]

Predicting outcome of the election with our approximation
Proportion of times we would declare that Bert lost the election using this procedure = proportion of samples that yield $\hat{p} < 0.50$. Let $Y = \hat{p}$; then $Y$ has approximately a Normal distribution with $\mu = 0.3333$ and $\sigma = 0.0430$. The proportion of samples that yield $\hat{p} < 0.50$ is
$P(Y < 0.50) = P\left(\dfrac{Y - \mu}{\sigma} < \dfrac{0.5 - 0.3333}{0.0430}\right) = P(Z < 3.8767)$.

[Figure: standard Normal density with the area to the left of 3.8767 shaded; $P(Z < 3.8767) = 0.9999471$.]

Predicting outcome of the election with our approximation
This implies that, when $p$ = 33.33%, a random exit poll of size 120 from a population of size 1200 will predict the outcome of the election correctly roughly 99.99% of the time.
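The Normal-approximation calculation can be reproduced in R with `pnorm()`. The sketch below works with unrounded values, so the z-score differs slightly from the 3.8767 obtained from the rounded figures 0.3333 and 0.0430. The last step is our own addition, not part of the slides: an exact hypergeometric check with `phyper()`, assuming the poll is drawn without replacement from the 1200 voters.

```r
p <- 1/3    # population proportion supporting Bert
n <- 120    # exit-poll size

sigma <- sqrt(p * (1 - p) / n)    # standard deviation of p-hat
z <- (0.5 - p) / sigma            # standardized cutoff
prob_normal <- pnorm(0.5, mean = p, sd = sigma)   # P(p-hat < 0.5), Normal approx.

# Assumption: exact check under sampling without replacement from 1200 voters
# (400 for Bert, 800 for Ernie). p-hat < 0.5 means at most 59 Bert voters.
prob_exact <- phyper(59, m = 400, n = 800, k = 120)

c(sigma = sigma, z = z, normal = prob_normal, exact = prob_exact)
```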

A few remarks
What is a Sampling Distribution? It is the distribution that results when we find the proportions ($\hat{p}$) in all possible samples of a given size. Obtaining it involves:
1. Finding all possible samples of the size selected.
2. Computing the statistic of interest (the sample proportion, for instance) for each sample.
3. Making a table of relative frequencies (or a graphical representation of it).

A few remarks
It is often impractical or too expensive to survey every individual in the population, so it is reasonable to use a random sample to estimate a parameter. Sampling distributions help us to understand the behavior of a statistic when random sampling is used.

Example
In the last election, a state representative received 52% of the votes cast. One year after the election, the representative organized a survey that asked a random sample of 300 people whether they would vote for him in the next election. If we assume that his popularity has not changed, what is the probability that more than half of the sample would vote for him?

Solution
We want to determine the probability that the sample proportion is greater than 50%. In other words, we want to find $P(\hat{p} > 0.50)$. We know that the sample proportion $\hat{p}$ is roughly Normally distributed with mean $p = 0.52$ and standard deviation $\sqrt{p(1-p)/n} = \sqrt{(0.52)(0.48)/300} = 0.0288$. Thus, we calculate
$P(\hat{p} > 0.50) = P\left(\dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \dfrac{0.50 - 0.52}{0.0288}\right) = P(Z > -0.69) = 1 - P(Z < -0.69) = 1 - 0.2451 = 0.7549.$
If we assume that the level of support remains at 52%, the probability that more than half the sample of 300 people would vote for the representative is 0.7549.

R code
Just type the following:
1 - pnorm(0.50, mean = 0.52, sd = 0.0288);
## [1] 0.7562982
In this case, pnorm will give you the area to the left of 0.50 for a Normal distribution with mean 0.52 and standard deviation 0.0288; subtracting it from 1 gives the area to the right. The small difference from 0.7549 arises because the table calculation rounds $z$ to −0.69.

Mean and Standard Deviation of a Sample Mean
Suppose that $\bar{x}$ is the mean of an SRS of size $n$ drawn from a large population with mean $\mu$ and standard deviation $\sigma$. Then the sampling distribution of $\bar{X}$ has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Revisiting Assignment 2
Many variables important to the real estate market are skewed, limited to only a few values or considered as categorical variables. Yet, marketing and business decisions are often made on means and proportions calculated over many homes. One reason these statistics are useful is the Central Limit Theorem. Data on 1063 houses sold recently in the Saratoga, New York area are available at "http://www.math.unm.edu/~alvaro/real_estate.txt". Let's investigate how the CLT guarantees that the sampling distribution of means of a quantitative variable approaches the Normal distribution (even when samples are drawn from populations that are far from Normal).

Revisiting Assignment 2
a) Using R, create an object (vector) called areas using the entire population of 1063 homes for the quantitative variable Living.Area. Then make a histogram for this quantitative variable areas. Use "1 a) your last name" as the main title for your plot. (In my case it would be "1 a) Nosedal".) Describe the distribution (including its mean and standard deviation).

Revisiting Assignment 2
# Step 1. Entering data;
# import data in R;
# url of real_estate;
real_estate_url = "http://www.math.unm.edu/~alvaro/real_estate.txt"
real_estate = read.table(real_estate_url, header = TRUE);
names(real_estate)
areas = real_estate$Living.Area;

Revisiting Assignment 2
# Step 2. Making histogram;
hist(areas, main = "Distribution of Living Area (population)", col = "blue");
# Step 3. Numerical summaries;
fivenum(areas);
mean(areas);
sd(areas);

[Figure: histogram "Distribution of Living Area (population)"; frequency against living area.]

Numerical summaries (population)
## [1] 672.0 1343.5 1680.0 2242.0 5632.0
## [1] 1833.49
## [1] 689.605

Revisiting Assignment 2
b) Using R, do the following: Draw 500 samples of size 100 from this population of homes and find the means of these samples. To do so, type the following commands in R:
vec.means = rep(NA, 500);
for (i in 1:500){ vec.means[i] = mean(sample(areas, 100)) }
Find the mean and standard deviation of this vector of means. Make a histogram of these 500 means. Use "1 b) your last name" as the main title for your plot. (In my case it would be "1 b) Nosedal".)

Solution b)
vec.means = rep(NA, 500);
# we are creating a blank vector of means;
# we will fill in this blank vector;
for (i in 1:500){ vec.means[i] = mean(sample(areas, 100)) }
mean(vec.means);
sd(vec.means);

Solution b)
## [1] 1833.528
## [1] 64.12982

Histogram (vector of means)
hist(vec.means, main = "Approximate Sampling distribution (x bar)", col = "blue");

[Figure: histogram "Approximate Sampling distribution (x bar)" for the 500 sample means; frequency against sample mean.]

Solution b) Again, now with 1000 samples:
vec.means = rep(NA, 1000);
# we are creating a blank vector of means;
# we will fill in this blank vector;
for (i in 1:1000){ vec.means[i] = mean(sample(areas, 100)) }
mean(vec.means);
sd(vec.means);

Solution b)
## [1] 1834.942
## [1] 65.86645

Histogram (vector of means)
hist(vec.means, main = "Approximate Sampling distribution (x bar)", col = "blue");

[Figure: histogram "Approximate Sampling distribution (x bar)" for the 1000 sample means; frequency against sample mean.]

Central Limit Theorem
Draw an SRS of size $n$ from any population with mean $\mu$ and standard deviation $\sigma$. The Central Limit Theorem (CLT) says that when $n$ is large the sampling distribution of the sample mean $\bar{x}$ is approximately Normal:
$\bar{X}$ is approximately $N\left(\mu, \sigma/\sqrt{n}\right)$.
The Central Limit Theorem allows us to use Normal probability calculations to answer questions about sample means from many observations.

[Figure: "Approximate Sampling distribution (x bar) vs CLT Normal". Density histogram of the sample means with the CLT Normal curve overlaid.]
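The comparison in this figure can be tried without downloading the data. The sketch below is not the assignment's solution: it replaces Living.Area with a synthetic right-skewed population (a gamma distribution whose mean and spread we chose to be roughly similar to the summaries above), then compares the simulated sample means with the CLT values $\mu$ and $\sigma/\sqrt{n}$. The seed and the gamma parameters are our assumptions.

```r
set.seed(2017)  # arbitrary seed, only for reproducibility

# Assumption: synthetic right-skewed population standing in for Living.Area.
population <- rgamma(1063, shape = 7, rate = 7 / 1833)

n <- 100
vec.means <- replicate(1000, mean(sample(population, n)))

# Compare the simulation with the CLT values mu and sigma / sqrt(n).
c(mean_of_means = mean(vec.means), mu = mean(population))
c(sd_of_means = sd(vec.means), clt_sd = sd(population) / sqrt(n))

# Overlay the CLT Normal curve on the density histogram of sample means.
hist(vec.means, freq = FALSE,
     main = "Approximate Sampling distribution (x bar) vs CLT Normal")
curve(dnorm(x, mean = mean(population), sd = sd(population) / sqrt(n)),
      add = TRUE)
```

Because `sample()` draws without replacement from a finite population of 1063 values, the simulated standard deviation tends to come out slightly below $\sigma/\sqrt{n}$.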

Example
A manufacturer of automobile batteries claims that the distribution of the lengths of life of its best battery has a mean of 54 months and a standard deviation of 6 months. Suppose a consumer group decides to check the claim by purchasing a sample of 50 of the batteries and subjecting them to tests that estimate the battery's life.
a) Assuming that the manufacturer's claim is true, describe the sampling distribution of the mean lifetime of a sample of 50 batteries.
b) Assuming that the manufacturer's claim is true, what is the probability that the consumer group's sample has a mean life of 52 or fewer months?

Solution a)
We can use the Central Limit Theorem to deduce that the sampling distribution of the mean lifetime of a sample of 50 batteries is approximately Normal. Furthermore, the mean of this sampling distribution ($\mu_{\bar{X}}$) is 54 months according to the manufacturer's claim. Finally, the standard deviation of the sampling distribution is given by
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{6}{\sqrt{50}} = 0.8485$ months.

Solution b)
If the manufacturer's claim is true, the probability that the consumer group observes a mean battery life of 52 or fewer months for its sample of 50 batteries is given by $P(\bar{X} \le 52)$, where $\bar{X}$ is approximately Normally distributed with $\mu_{\bar{X}} = 54$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 6/\sqrt{50} = 0.8485$. Hence,
$P(\bar{X} \le 52) = P\left(\dfrac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \le \dfrac{52 - 54}{0.8485}\right) = P(Z \le -2.3571008) \approx P(Z \le -2.36) = 0.0091$ (from Table 3).
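The same probability can be obtained in R without a table. This is a minimal sketch using `pnorm()`; the variable names are ours. Since R does not round $z$ to two decimals, the result may differ from the table value in the fourth decimal place.

```r
mu <- 54       # claimed mean lifetime (months)
sigma <- 6     # claimed standard deviation (months)
n <- 50        # number of batteries tested

se <- sigma / sqrt(n)                   # standard deviation of the sample mean
z <- (52 - mu) / se                     # standardized value of 52 months
prob <- pnorm(52, mean = mu, sd = se)   # P(X-bar <= 52)

c(se = se, z = z, prob = prob)
```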

Example
The number of accidents per week at a hazardous intersection varies with mean 2.2 and standard deviation 1.4. This distribution takes only whole-number values, so it is certainly not Normal.
a) Let $\bar{x}$ be the mean number of accidents per week at the intersection during a year (52 weeks). What is the approximate distribution of $\bar{x}$ according to the Central Limit Theorem?
b) What is the approximate probability that $\bar{x}$ is less than 2?
c) What is the approximate probability that there are fewer than 100 accidents at the intersection in a year?

Solution
a) By the Central Limit Theorem, $\bar{X}$ is roughly Normal with mean $\mu = 2.2$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 1.4/\sqrt{52} = 0.1941$.
b) $P(\bar{X} < 2) = P\left(\dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \dfrac{2 - 2.2}{0.1941}\right) = P(Z < -1.0303) = 0.1515$

Solution
Let $X_i$ be the number of accidents during week $i$.
c) $P(\text{Total} < 100) = P\left(\sum_{i=1}^{52} X_i < 100\right) = P\left(\dfrac{\sum_{i=1}^{52} X_i}{52} < \dfrac{100}{52}\right) = P(\bar{X} < 1.9230) = P(Z < -1.4270) = 0.0768$
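Parts b) and c) reduce to two `pnorm()` calls once the total is rewritten as a statement about the weekly mean. The sketch below uses unrounded intermediate values, so the results agree with the hand calculation only up to rounding; the variable names are ours.

```r
mu <- 2.2      # mean number of accidents per week
sigma <- 1.4   # standard deviation of accidents per week
n <- 52        # weeks in a year

se <- sigma / sqrt(n)   # standard deviation of the yearly mean X-bar

prob_b <- pnorm(2, mean = mu, sd = se)         # P(X-bar < 2)
prob_c <- pnorm(100 / n, mean = mu, sd = se)   # P(total < 100) = P(X-bar < 100/52)

c(se = se, prob_b = prob_b, prob_c = prob_c)
```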

Sampling distribution of the Difference between two means
Statisticians have shown that the difference between two independent Normal random variables is also Normally distributed. Thus, the difference between two sample means $\bar{X}_1 - \bar{X}_2$ is Normally distributed if both populations are Normal. By using the laws of expected value and variance we derive the expected value and variance of the sampling distribution of $\bar{X}_1 - \bar{X}_2$:
$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$ and $\sigma^2_{\bar{X}_1 - \bar{X}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$
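The derivation mentioned above is short enough to write out. This is a sketch that assumes the two samples are independent, so that the variances add; $E$ and $\mathrm{Var}$ denote expected value and variance.

```latex
\begin{align*}
E(\bar{X}_1 - \bar{X}_2)
  &= E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2, \\
\mathrm{Var}(\bar{X}_1 - \bar{X}_2)
  &= \mathrm{Var}(\bar{X}_1) + (-1)^2\,\mathrm{Var}(\bar{X}_2)
   = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.
\end{align*}
```

Taking the square root gives the standard deviation $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ used in the examples that follow.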

Starting Salaries of MBAs
Suppose that the starting salaries of MBAs at Wilfrid Laurier University (WLU) are Normally distributed, with a mean of $62,000 and a standard deviation of $14,500. The starting salaries of MBAs at the University of Western Ontario (UWO) are Normally distributed, with a mean of $60,000 and a standard deviation of $18,300. If a random sample of 50 WLU MBAs and a random sample of 60 UWO MBAs are selected, what is the probability that the sample mean starting salary of WLU graduates will exceed that of the UWO graduates?

Solution
We want to determine $P(\bar{X}_1 - \bar{X}_2 > 0)$. We know that $\bar{X}_1 - \bar{X}_2$ is Normally distributed with mean $\mu = \mu_1 - \mu_2 = 62000 - 60000 = 2000$ and standard deviation
$\sigma = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{14500^2}{50} + \dfrac{18300^2}{60}} = 3128.$
$P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(\dfrac{(\bar{X}_1 - \bar{X}_2) - \mu}{\sigma} > \dfrac{0 - 2000}{3128}\right) = P(Z > -0.64) = 1 - P(Z < -0.64) = 1 - 0.2611 = 0.7389$
There is a 0.7389 probability that for a sample of size 50 from the WLU graduates and a sample of size 60 from the UWO graduates, the sample mean starting salary of WLU graduates will exceed the sample mean of UWO graduates.
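The calculation can be packaged as a small R function. This is a sketch; the function name `prob_mean1_exceeds_mean2` and its argument names are hypothetical, and the function assumes independent samples from Normal populations as in the example.

```r
# Hypothetical helper: P(X-bar_1 - X-bar_2 > 0) for two independent samples
# drawn from Normal populations.
prob_mean1_exceeds_mean2 <- function(mu1, sigma1, n1, mu2, sigma2, n2) {
  mu_diff <- mu1 - mu2                              # mean of the difference
  sd_diff <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)    # sd of the difference
  1 - pnorm(0, mean = mu_diff, sd = sd_diff)
}

# WLU (sample 1) versus UWO (sample 2).
prob_mean1_exceeds_mean2(62000, 14500, 50, 60000, 18300, 60)
```

R keeps $z$ unrounded, so the result agrees with the table-based 0.7389 only to about three decimal places.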

Exercise 9.48
Suppose that we have two Normal populations with means and standard deviations listed here. If random samples of size 25 are drawn from each population, what is the probability that the mean of sample 1 is greater than the mean of sample 2?
Population 1: $\mu_1 = 40$, $\sigma_1 = 6$
Population 2: $\mu_2 = 38$, $\sigma_2 = 8$

Solution
We want to determine $P(\bar{X}_1 - \bar{X}_2 > 0)$. We know that $\bar{X}_1 - \bar{X}_2$ is Normally distributed with mean $\mu = \mu_1 - \mu_2 = 40 - 38 = 2$ and standard deviation
$\sigma = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{6^2}{25} + \dfrac{8^2}{25}} = 2.$
$P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(\dfrac{(\bar{X}_1 - \bar{X}_2) - \mu}{\sigma} > \dfrac{0 - 2}{2}\right) = P(Z > -1) = 1 - P(Z < -1) \approx 1 - 0.16 = 0.84$, or roughly 84%.
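A quick simulation offers an independent check of this answer. The sketch below is our own addition, with an arbitrary seed and number of repetitions: it draws many pairs of samples of size 25 from the two Normal populations and records how often the first sample mean exceeds the second.

```r
set.seed(948)  # arbitrary seed, only for reproducibility

reps <- 20000
xbar1 <- replicate(reps, mean(rnorm(25, mean = 40, sd = 6)))
xbar2 <- replicate(reps, mean(rnorm(25, mean = 38, sd = 8)))

mean(xbar1 - xbar2 > 0)   # simulated P(X-bar_1 - X-bar_2 > 0)
sd(xbar1 - xbar2)         # should be close to the theoretical value 2
1 - pnorm(-1)             # Normal-theory answer for comparison
```

The simulated proportion should land near the 84% obtained above, with some run-to-run variation.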