Normal distribution curve as probability distribution curve

The normal distribution curve can be considered as a probability distribution curve for normally distributed variables. The area under the curve matters more than its heights (frequency values), because a given area corresponds to a probability. For example, the probability of selecting a z value at random between 0 and 2.00 is the same as the area under the curve between 0 and 2.00.

Notation: $P(0 < z < 2.00)$

Example 1 Find the following probabilities.

(a) $P(0 < z < 2.31)$

From the table, $P(0 < z < 2.31) = 0.4896$.

(b) $P(z < 1.66)$

This is the area under the curve to the left of 1.66, which is $P(z < 1.66) = 0.5000 + 0.4515 = 0.9515$.

(c) $P(z > 1.92)$

This is the area to the right of 1.92, which is $P(z > 1.92) = 0.5 - P(0 < z < 1.92) = 0.5 - 0.4726 = 0.0274$.

Example 2 Find the value of z such that the area under the normal distribution curve between 0 and the z value is 0.2157.

In this case, we work backward. Locate 0.2157 in the body of the table, read the corresponding z value in the left column, and add the number in the corresponding top row to get the required z value. Here, $0.5 + 0.07 = 0.57$.

Note: Use the closest value if the exact area cannot be found.

© 2004, RSHavea, MaCS, USP. File updated: September 13, 2004

Applications of the Normal Distribution
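The applications in this part of the notes all rest on areas under the standard normal curve, as looked up in Examples 1 and 2. As a cross-check on those table lookups, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The helper names `area_0_to` and `z_for_area` are illustrative assumptions, not part of the notes. The sketch mimics a table that lists areas between 0 and z.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_0_to(z):
    """Area under the standard normal curve between 0 and z (the table entry)."""
    return Z.cdf(z) - 0.5


def z_for_area(area):
    """Inverse lookup: the z >= 0 whose area between 0 and z equals `area`."""
    return Z.inv_cdf(0.5 + area)


p_a = area_0_to(2.31)      # Example 1(a): P(0 < z < 2.31)
p_b = Z.cdf(1.66)          # Example 1(b): P(z < 1.66)
p_c = 1 - Z.cdf(1.92)      # Example 1(c): P(z > 1.92)
z_2 = z_for_area(0.2157)   # Example 2: z with area 0.2157 between 0 and z

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(z_2, 2))
```

The computed values should agree with the table to about four decimal places. Small differences can appear because the table entries are rounded.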
Aim: Find probabilities for a normally distributed variable by transforming it into a standard normal variable.

Transformation formula:

$$z = \frac{\text{value} - \text{mean}}{\text{SD}} = \frac{X - \mu}{\sigma}$$

Example 3 Suppose that the scores for a standardized test are normally distributed with a mean of 100 and a standard deviation of 15. Under the above transformation, the two distributions (the scores and the standard normal) coincide.

Example 4 If the scores of the test have a mean of 100 and a standard deviation of 15, find the percentage of scores that will fall below 112.

The z value corresponding to a score of 112 is

$$z = \frac{112 - 100}{15} = 0.8.$$

Thus 112 is 0.8 standard deviations above the mean of 100. The total area is $0.5 + 0.2881 = 0.7881$. Hence 78.81% of the scores fall below 112.

Example 5 Each month, an American household generates an average of 28 lbs of newspaper for garbage or recycling. Assume the standard deviation is 2 lbs and that the variable is approximately normally distributed. If a household is selected at random, find the probability of its generating:

(a) Between 27 and 31 pounds per month.

The two z values are

$$z_1 = \frac{27 - 28}{2} = -0.5 \quad \text{and} \quad z_2 = \frac{31 - 28}{2} = 1.5.$$

The total area is $0.1915 + 0.4332 = 0.6247$. Hence the required probability is 62.47%.
(b) More than 30.2 pounds per month.

The z value for 30.2 is

$$z = \frac{30.2 - 28}{2} = 1.1.$$

The table gives $P(0 \le z \le 1.1) = 0.3643$. Therefore, $P(z > 1.1) = 0.5 - 0.3643 = 0.1357$.

Finding data values given specific probabilities

Aim: Find specific data values for given percentages using the standard normal distribution.

When finding X, the following formula can be used:

$$X = z\sigma + \mu$$

Example 6 In order to qualify for a police academy, candidates must score in the top 10% on a general abilities test. The test has a mean of 200 and a standard deviation of 20. Find the lowest score to qualify. Assume the test scores are normally distributed.

We work backward to solve this problem. The area under the curve between 200 and X is $0.5 - 0.1 = 0.4$. From the table, look for the z value that corresponds to an area of 0.4. We take the closest, 0.3997, which corresponds to $z = 1.28$. Then

$$X = z\sigma + \mu = (1.28)(20) + 200 = 225.6 \approx 226.$$

Thus, anyone scoring 226 or more qualifies.

The Central Limit Theorem

Distribution of sample means: Suppose we take 100 samples of a specific size from a large population. Computing the mean (of the same variable) for each of the 100 samples, we get the sample means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_{100}$.

More formally:
The sampling distribution of sample means is a distribution obtained by using the means computed from random samples of a specific size taken from a population.

If the samples are randomly selected with replacement, the sample means will, for the most part, be slightly different from the population mean $\mu$. This is caused by sampling error.

Sampling error is the difference between a sample measure and the corresponding population measure, due to the fact that the sample is not a perfect representation of the population.

Properties of the distribution of sample means

1. The mean of the sample means will be the same as the population mean.
2. The standard deviation of the sample means will be smaller than the standard deviation of the population, and it will be equal to the population standard deviation divided by the square root of the sample size.

Example 7 A professor gave an 8-point quiz to a class of 4 students. The results of the quiz were 2, 6, 4, 8. Assume that this is the whole population. Then

$$\mu = 5 \quad \text{and} \quad \sigma = 2.236.$$

The graph of these scores shows a uniform distribution: each score occurs exactly once.

The following table shows all samples of size 2 taken with replacement, with the mean of each sample.

Sample   Mean     Sample   Mean
2, 2     2        6, 2     4
2, 4     3        6, 4     5
2, 6     4        6, 6     6
2, 8     5        6, 8     7
4, 2     3        8, 2     5
4, 4     4        8, 4     6
4, 6     5        8, 6     7
4, 8     6        8, 8     8

A frequency distribution and histogram of sample means:
$\bar{X}$   2   3   4   5   6   7   8
f           1   2   3   4   3   2   1

The mean of the sample means, summing all 16 sample means, is

$$\mu_{\bar{X}} = \frac{2 + 3 + 4 + \cdots + 8}{16} = \frac{80}{16} = 5 = \mu.$$

The standard deviation of the sample means is

$$\sigma_{\bar{X}} = 1.581 = \frac{2.236}{\sqrt{2}} = \frac{\sigma}{\sqrt{2}}.$$

In summary, if all possible samples of size n are taken with replacement from the same population,

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.$$

The standard deviation of the sample means, $\sigma_{\bar{X}}$, is called the standard error of the mean.
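The summary above can be checked by brute force for Example 7. The following is a minimal Python sketch, assuming the quiz scores 2, 6, 4, 8 and samples of size 2 drawn with replacement. It enumerates all 16 ordered samples and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma/\sqrt{n}$. The variable names are illustrative only.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 6, 4, 8]   # quiz scores from Example 7 (the whole population)
n = 2                       # sample size

# All ordered samples of size n drawn with replacement: 4**2 = 16 samples.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)               # population mean
sigma = pstdev(population)          # population standard deviation
mu_xbar = mean(sample_means)        # mean of the sample means
sigma_xbar = pstdev(sample_means)   # standard error of the mean

print("mu =", mu, " sigma =", round(sigma, 3))
print("mu_xbar =", mu_xbar, " sigma_xbar =", round(sigma_xbar, 3))
print("sigma / sqrt(n) =", round(sigma / sqrt(n), 3))
```

`pstdev` is used rather than `stdev` because both the four scores and the 16 sample means are treated as complete populations, so the divisor is the number of values and not one less.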