Class 16
Daniel B. Rowe, Ph.D.
Department of Mathematics, Statistics, and Computer Science
Copyright 2013 by D.B. Rowe
Agenda: Recap Chapter 7.2-7.3. Lecture Chapter 8.1-8.2. Review Chapter 6. Problem Solving Session.
Recap Chapter 7.2-7.3
7: Sample Variability
7.2 The Sampling Distribution of Sample Means
Sampling distribution of sample means (SDSM): If random samples of size n are taken from ANY population with mean μ and standard deviation σ, then the SDSM has:
1. A mean $\mu_{\bar{x}}$ equal to μ.
2. A standard deviation $\sigma_{\bar{x}}$ equal to $\sigma/\sqrt{n}$.
Central Limit Theorem (CLT): The sampling distribution of sample means will more closely resemble the normal distribution as the sample size n increases.
Example: N=5 balls in a bucket, labeled 0, 2, 4, 6, 8; select n=1 with replacement.
x     P(x)
0     1/5
2     1/5
4     1/5
6     1/5
8     1/5
[Figure: bar chart of P(x) with every bar at height 0.2.]
Example: N=5 balls in a bucket, labeled 0, 2, 4, 6, 8; select n=2 with replacement. The 25 equally likely samples give the sample means $\bar{x}$:
$\bar{x}$     $P(\bar{x})$
0     1/25
1     2/25
2     3/25
3     4/25
4     5/25
5     4/25
6     3/25
7     2/25
8     1/25
[Figure: bar chart of $P(\bar{x})$.]
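The n=2 distribution can be checked by brute force. The sketch below assumes the five balls are labeled 0, 2, 4, 6, 8 and enumerates every equally likely ordered sample drawn with replacement; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

balls = [0, 2, 4, 6, 8]   # assumed ball labels from the bucket example
n = 2                     # sample size, drawn with replacement

# Enumerate all 5**n equally likely ordered samples and tally their means.
samples = list(product(balls, repeat=n))
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

for mean in dist:
    print(f"mean = {float(mean):g}  P = {counts[mean]}/{len(samples)}")

# Mean and variance of the sampling distribution of the sample mean.
mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())
print(mean_of_means, var_of_means)
```

The population 0, 2, 4, 6, 8 has mean 4 and variance 8, so the mean of the sample means should equal 4 and their variance should equal 8/2 = 4, in agreement with $\mu_{\bar{x}}=\mu$ and $\sigma_{\bar{x}}=\sigma/\sqrt{n}$.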
[Figure: $P(\bar{x})$ for the n=1, n=2, n=3, and n=4 distributions, each with mean $\mu_{\bar{x}}=4$ and standard deviation $\sigma_{\bar{x}}=\sigma/\sqrt{n}$. What does each one look like? What does it look like when n is large? Figure from Johnson & Kuby, 2012.]
Uniform(0,200): $f(x) = \frac{1}{200}$ for $0 \le x \le 200$, with μ=100 and σ=57.7.
Normal(100,(57.7)²): $f(x) = \frac{1}{57.7\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-100}{57.7}\right)^2}$, with μ=100 and σ=57.7.
Generate n million random observations, $x_1,\ldots,x_{n\times 10^6}$, from the Uniform(a=0, b=200) and also from the Normal(μ=100, σ²=(57.7)²) distributions, for each of n=1, 2, 3, 4, 5, and 15. This gives 6 data sets of Uniform and Normal random observations.
[Figures: for each of n=1, 2, 3, 4, 5, and 15, side-by-side histograms of the n×10^6 observations over the classes 0 to 200, each annotated with mean 100 and standard deviation 57.73. The observations on the left are uniform and the observations on the right are normal. The vertical scale changes with n, and the plotted range is limited to 0 to 200.]
Sample means and standard deviations from each of the n million observations from the Uniform(a=0, b=200) and Normal(μ=100, σ²=(57.7)²) distributions (6 data sets of Uniform and Normal).
i.e. for n=5 there are 5×10^6 observations, which are split into groups of n=5 with a mean computed for each group:
Group 1: $x_1, x_2, x_3, x_4, x_5$ with mean $\bar{x}_1$
Group 2: $x_6, x_7, x_8, x_9, x_{10}$ with mean $\bar{x}_2$
...
Group 10^6: $x_{4999996}, \ldots, x_{5000000}$ with mean $\bar{x}_{10^6}$
This gives 10^6 $\bar{x}$'s, from which we make a histogram of the $\bar{x}$'s.
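The grouping procedure can be tried on a small scale. This is a minimal sketch, not the code behind the slides: it assumes Python's standard `random` module, an arbitrary seed, and 20,000 groups instead of 10^6 so that it runs quickly.

```python
import random
import statistics

random.seed(0)              # arbitrary seed so the run is repeatable
n = 5                       # group size
groups = 20_000             # number of groups (the slides use 10**6)
sigma = 200 / 12 ** 0.5     # sd of Uniform(0, 200), about 57.735

# One mean per group of n observations, for each parent distribution.
uniform_means = [
    statistics.fmean(random.uniform(0, 200) for _ in range(n))
    for _ in range(groups)
]
normal_means = [
    statistics.fmean(random.gauss(100, sigma) for _ in range(n))
    for _ in range(groups)
]

for label, means in (("uniform", uniform_means), ("normal", normal_means)):
    print(label, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
print("theory", 100, round(sigma / n ** 0.5, 2))
```

Both sets of means should be centered near 100 with a standard deviation near $\sigma/\sqrt{5}\approx 25.82$, whichever parent distribution the observations came from.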
[Figures: for each of n=1, 2, 3, 4, 5, and 15, side-by-side histograms of the 1×10^6 means, $\bar{x}$'s, from the uniform observations and from the normal observations. All panels use the same vertical scale and the same classes from 0 to 200, so the histograms can be compared directly as n increases.]
7.3 Application of the Sampling Distribution of Sample Means
Now that we believe that the mean $\bar{x}$ from a sample of n=15 is normally distributed with mean $\mu_{\bar{x}}=\mu$ and standard deviation $\sigma_{\bar{x}}=\sigma/\sqrt{n}$, we can find probabilities such as $P(a<\bar{x}<b)$, $P(\bar{x}>b)$, and $P(\bar{x}<a)$.
To find these probabilities, we first convert to z scores,
$z = \frac{\bar{x}-\mu_{\bar{x}}}{\sigma_{\bar{x}}}$, $c = \frac{a-\mu}{\sigma/\sqrt{n}}$, $d = \frac{b-\mu}{\sigma/\sqrt{n}}$,
so that each probability is the same area under the standard normal curve:
$P(a<\bar{x}<b) = P(c<z<d)$
$P(\bar{x}>b) = P(z>d)$
$P(\bar{x}<a) = P(z<c)$
and use the table in the book.
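As a sketch of the z-score conversion, the snippet below uses `statistics.NormalDist` from the Python standard library in place of the printed table. The endpoints a=90 and b=110 are hypothetical values chosen for illustration; μ=100, σ=57.7, and n=15 are taken from the simulation slides.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 57.7, 15   # population values from the simulation slides
a, b = 90, 110                 # hypothetical interval endpoints

se = sigma / sqrt(n)           # standard deviation of the sample mean
c = (a - mu) / se              # z score for a
d = (b - mu) / se              # z score for b

std_normal = NormalDist()      # standard normal: mean 0, sd 1
p_between = std_normal.cdf(d) - std_normal.cdf(c)   # P(a < xbar < b)
p_above = 1 - std_normal.cdf(d)                     # P(xbar > b)
p_below = std_normal.cdf(c)                         # P(xbar < a)
print(round(c, 2), round(d, 2), round(p_between, 4))
```

The three areas add to 1, which is a quick sanity check on any table lookup.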
Questions? Homework: Chapter 7 # 6, 1, 3, 9, 33, 35
Lecture Chapter 8.1-8.2
Chapter 8: Introduction to Statistical Inference
8.1 The Nature of Estimation
The purpose of Statistical Inference is to use the information in a sample of data to increase knowledge of a population. Figure from Johnson & Kuby, 2012.
We discussed how, if we compute a quantity from a population of data, it is called a parameter, and if we estimate it from a sample of data, it is called a statistic. Recall the Chapter 1 definitions.
Parameter: A numerical value summarizing all the data of an entire population.
Statistic: A numerical value summarizing the sample data.
More precisely, a single number to estimate a parameter is called a point estimate.
Point estimate for a parameter: A single number designed to estimate a quantitative parameter of a population, usually the value of the corresponding sample statistic.
i.e. $\bar{x}$ is a point estimate for μ.
Interval estimate: An interval bounded by two values and used to estimate the value of a population parameter. The values that bound this interval are statistics calculated from the sample that is being used as the basis for estimation.
i.e. ($\bar{x}$ ± some amount) is an interval estimate for μ.
The interval estimate will be of the form: point estimate ± some amount.
Interval Estimate for μ.
Significance Level: Preassigned probability, α, of a parameter being outside our interval estimate.
$P(\mu \text{ not in } \bar{x} \pm \text{some amount}) = \alpha$, i.e. α=.05
$1 - P(\bar{x} - \text{some amount} < \mu < \bar{x} + \text{some amount}) = \alpha$
Take many samples and for each calculate an interval estimate.
Level of Confidence 1-α: The proportion of all interval estimates that include the parameter being estimated, i.e. μ.
$P(\mu \text{ in } \bar{x} \pm \text{some amount}) = 1-\alpha$
$P(\bar{x} - \text{some amount} < \mu < \bar{x} + \text{some amount}) = 1-\alpha$
Confidence Interval: An interval estimate with a specified level of confidence. A range of values for the parameter with a level of confidence attached (i.e. 95% confident).
point estimate ± some amount(1-α), where the amount depends on α.
The general form for a confidence interval is: point estimate ± margin of error, with confidence level 1-α.
8.2 Estimation of Mean μ (σ Known)
The assumption for estimating mean μ using a known σ: The sampling distribution of $\bar{x}$ has a normal distribution.
Recall from Chapter 6 that for the standard normal distribution, $P(-1.96 < z < 1.96) = 0.95$.
From the CLT in Chapter 7, we know that when n is large, the sample mean $\bar{x}$ is approximately normally distributed with $\mu_{\bar{x}}=\mu$, $\sigma_{\bar{x}}=\sigma/\sqrt{n}$.
What this implies is that $z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ has an approximate standard normal distribution!
$P(-1.96 < z < 1.96) = 0.95$
Or more generally, $P(-z(\alpha/2) < z < z(\alpha/2)) = 1-\alpha$.
$z(\alpha/2)$ is called the confidence coefficient.
[Figure: standard normal curve with central area 1-α=.95 between $-z(\alpha/2)=-1.96$ and $z(\alpha/2)=1.96$.]
$P(-z(\alpha/2) < z < z(\alpha/2)) = 1-\alpha$
With some algebra, we can solve the inequality for μ. (fill in)
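One way to fill in the algebra, assuming $z = (\bar{x}-\mu)/(\sigma/\sqrt{n})$ as in the CLT statement: multiply through by $\sigma/\sqrt{n}$, subtract $\bar{x}$, then multiply by -1, which reverses the inequalities.

```latex
\begin{align*}
P\left(-z(\alpha/2) < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z(\alpha/2)\right) &= 1-\alpha \\
P\left(-z(\alpha/2)\frac{\sigma}{\sqrt{n}} < \bar{x}-\mu < z(\alpha/2)\frac{\sigma}{\sqrt{n}}\right) &= 1-\alpha \\
P\left(-\bar{x}-z(\alpha/2)\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x}+z(\alpha/2)\frac{\sigma}{\sqrt{n}}\right) &= 1-\alpha \\
P\left(\bar{x}-z(\alpha/2)\frac{\sigma}{\sqrt{n}} < \mu < \bar{x}+z(\alpha/2)\frac{\sigma}{\sqrt{n}}\right) &= 1-\alpha
\end{align*}
```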
Thus, a (1-α)×100% confidence interval for μ is
$\bar{x} - z(\alpha/2)\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z(\alpha/2)\frac{\sigma}{\sqrt{n}}$,
which, if α=0.05, gives a 95% confidence interval for μ of
$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$.
Confidence Interval for Mean: $\bar{x} - z(\alpha/2)\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z(\alpha/2)\frac{\sigma}{\sqrt{n}}$ (8.1)
THE CONFIDENCE INTERVAL: A FIVE-STEP PROCESS
Step 1 The Set-Up:
Step 2 Confidence Interval Criteria:
Step 3 The Sample Evidence:
Step 4 The Confidence Interval:
Step 5 The Results:
Your book describes this as a 5 step process. Read this. Important.
Philosophically, μ is fixed and the interval varies.
If we take a sample of data, $x_1,\ldots,x_n$, and determine a confidence interval from it, we get $\bar{x} \pm z(\alpha/2)\frac{\sigma}{\sqrt{n}}$.
If we had a different sample of data, $y_1,\ldots,y_n$, we would have determined a different confidence interval, $\bar{y} \pm z(\alpha/2)\frac{\sigma}{\sqrt{n}}$.
Each sample gives a different interval of the same width.
[Figure: intervals from many samples plotted against the fixed μ. Figure from Johnson & Kuby, 2012.]
We never truly know if our CI from our sample of data will contain the true population mean μ. But we do know that there is a (1-α)×100% chance that a confidence interval from a random sample of data will contain μ.
Example: Make 5×10^6 normal μ=100, σ=57.7 random values.
True: μ=100, σ=57.7350. Random: $\bar{x}$=100.0320, s=57.763 (sample mean and standard deviation from the 5 million values).
[Figure: histogram of the 5 million values, plotted from -100 to 300.]
Form 10^6 $\bar{x}$'s (means) from successive n=5 random numbers.
True: $\mu_{\bar{x}}$=100, $\sigma_{\bar{x}}=\sigma/\sqrt{n}$=25.8199. Random: mean of the $\bar{x}$'s 100.0320, standard deviation of the $\bar{x}$'s 25.8397 (sample mean and standard deviation from the 1 million means).
[Figure: histogram of the 1 million means, plotted from 0 to 200.]
Form 1 million U and L values from the $\bar{x}$'s, with n=5 and σ=57.7, by inserting each $\bar{x}$ and the true σ into
$U = \bar{x} + 1.96\sigma/\sqrt{n}$ and $L = \bar{x} - 1.96\sigma/\sqrt{n}$.
Random $\bar{x}$'s: L=49.4250, U=150.6389.
We will also get 1 million L's and U's that we can use to make histograms.
[Figure: histogram of the 1 million means, plotted from 0 to 200.]
True: L=49.3930, U=150.6070. Random: L=49.4250, U=150.6389.
[Figure: histograms of the 1 million L's and U's, with μ=100 and σ=57.7.]
In the histograms of the 1 million L's and U's:
2.5% of the time the upper limit U is less than 100,
2.5% of the time the lower limit L is greater than 100,
so 5% of the time μ=100 is not in the interval:
$P(\mu \text{ not in } \bar{x} \pm 1.96\sigma/\sqrt{n}) = \alpha$.
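The coverage claim can be checked with a small simulation. This is a sketch under stated assumptions rather than the code behind the slides: it uses Python's `random.gauss`, an arbitrary seed, and 20,000 samples of size n=5 instead of 1 million.

```python
import random
from math import sqrt
from statistics import fmean

random.seed(1)                      # arbitrary seed so the run is repeatable
mu, sigma, n = 100, 57.7, 5         # true mean, known sigma, sample size
trials = 20_000                     # number of samples (the slides use 10**6)
margin = 1.96 * sigma / sqrt(n)     # half-width, identical for every interval

upper_below = lower_above = 0
for _ in range(trials):
    xbar = fmean(random.gauss(mu, sigma) for _ in range(n))
    lower, upper = xbar - margin, xbar + margin
    if upper < mu:
        upper_below += 1            # whole interval sits below mu
    elif lower > mu:
        lower_above += 1            # whole interval sits above mu

missed = (upper_below + lower_above) / trials
coverage = 1 - missed
print(upper_below / trials, lower_above / trials, coverage)
```

With these settings the two miss rates should each land near 0.025 and the coverage near 0.95, up to simulation noise.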
Example: A random sample of size n=15 student heights was taken from this class:
69, 69, 67, 66, 71, 74, 75, 70, 67, 73, 64, 65, 68, 67, 65
Assume that we know that σ=4. Construct a 95% confidence interval for μ.
$\bar{x}$=68.7, σ=4, n=15, α=.05, $z(\alpha/2) = z(.025) = 1.96$
$\bar{x} - z(\alpha/2)\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z(\alpha/2)\frac{\sigma}{\sqrt{n}}$
$68.7 - 1.96\frac{4}{\sqrt{15}}$ to $68.7 + 1.96\frac{4}{\sqrt{15}}$
66.7 to 70.7
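The same interval can be computed directly from the data. This sketch assumes the 15 heights listed in the example and uses `statistics.NormalDist().inv_cdf` to obtain $z(\alpha/2)$ instead of the table.

```python
from math import sqrt
from statistics import NormalDist, fmean

heights = [69, 69, 67, 66, 71, 74, 75, 70, 67, 73, 64, 65, 68, 67, 65]
sigma = 4        # population standard deviation, assumed known
alpha = 0.05     # 95% confidence level

xbar = fmean(heights)                      # point estimate of mu
z = NormalDist().inv_cdf(1 - alpha / 2)    # z(alpha/2), about 1.96
margin = z * sigma / sqrt(len(heights))    # margin of error
lower, upper = xbar - margin, xbar + margin
print(round(xbar, 1), round(margin, 2), round(lower, 2), round(upper, 2))
```

Because the code keeps the unrounded mean (about 68.67) rather than 68.7, its lower limit is about 66.64, slightly below the 66.7 obtained on the slide from the rounded mean; the upper limit agrees at about 70.7.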
Questions? Homework: Chapter 8 # 5, 15,, 4, 35, 47, 57, 59, 81, 93, 97, 106, 109, 119, 145, 157
Review Chapter 6
6: Normal Probability Distributions
6.1 Normal Probability Distributions
The mathematical formula for the normal distribution is (p 69):
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
where
e = 2.718281828459046
π = 3.141592653589793
μ = population mean
σ = population std. deviation
We will not use this formula. Figure from Johnson & Kuby, 2012.
6.2 The Standard Normal Probability Distributions
Example: Here is a normal distribution with μ = 5 and σ² = 4 (σ = 2). Let's say we want to know the red area under the normal distribution between $x_1$ = 2.28 and $x_2$ = 9.28. What is the area under the normal distribution between these two values?
Example: Here is a normal distribution with μ = 5 and σ² = 4, next to the standard normal distribution with σ = 1.
The area between $x_1$ = 2.28 and $x_2$ = 9.28 is the same as the area between $z_1$ and $z_2$.
We find $z_1 = (x_1 - \mu)/\sigma = (2.28 - 5)/2 = -1.36$ and $z_2 = (x_2 - \mu)/\sigma = (9.28 - 5)/2 = 2.14$.
Standard normal curve: μ = 0 and σ = 1. Now we can simply look up the z areas in a table: Appendix B, Table 3, Page 716.
Appendix B, Table 3, Page 716: the rows give the first decimal place in z and the columns give the second decimal place in z. We need $z_1$ = -1.36 and $z_2$ = 2.14.
Appendix B, Table 3, Page 716, for $z_1$ = -1.36: P(z<-1.36) = area less than -1.36. We get this from Table 3: row labeled -1.3 over to column labeled .06.
Appendix B, Table 3, Page 717 (continued), for $z_2$ = 2.14: P(z<2.14) = area less than 2.14. We get this from Table 3: row labeled 2.1 over to column labeled .04.
Appendix B, Table 3, Pages 716-717:
P(-1.36<z<2.14) = P(z<2.14) - P(z<-1.36) = 0.9838 - 0.0869 = 0.8969
Red area = 0.8969
Example: For the normal distribution with μ = 5 and σ² = 4, the area between $x_1$ and $x_2$ is the same as the area between $z_1$ = -1.36 and $z_2$ = 2.14 under the standard normal curve, which is 0.8969.
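The table lookup can be cross-checked in code. This sketch assumes σ = 2 (σ² = 4) and endpoints 2.28 and 9.28, the values that reproduce the z scores -1.36 and 2.14, and uses `statistics.NormalDist` in place of Table 3.

```python
from statistics import NormalDist

mu, sigma = 5, 2          # assumed: sigma**2 = 4
x1, x2 = 2.28, 9.28       # assumed endpoints of the red area

z1 = (x1 - mu) / sigma    # standardize the lower endpoint
z2 = (x2 - mu) / sigma    # standardize the upper endpoint

std_normal = NormalDist()                   # mean 0, sd 1
area_z = std_normal.cdf(z2) - std_normal.cdf(z1)

original = NormalDist(mu, sigma)            # the unstandardized curve
area_x = original.cdf(x2) - original.cdf(x1)
print(round(z1, 2), round(z2, 2), round(area_z, 4), round(area_x, 4))
```

Both areas agree, which is the point of standardizing: the area between $x_1$ and $x_2$ equals the area between $z_1$ and $z_2$.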
6.3 Applications of Normal Distributions
Example: Assume that IQ scores are normally distributed with a mean μ of 100 and a standard deviation σ of 16. If a person is picked at random, what is the probability that his or her IQ is between 100 and 115? i.e. P(100 < x < 115)? Figures from Johnson & Kuby, 2012.
IQ scores are normally distributed with μ=100 and σ=16. For P(100 < x < 115), use $z = \frac{x-\mu}{\sigma}$:
$z_1 = \frac{100-100}{16} = 0$
$z_2 = \frac{115-100}{16} = 0.94$
Now we can use the table.
P(0 < z < 0.94) = P(z < 0.94) - P(z < 0) = 0.8264 - 0.5 = 0.3264
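A short check of the IQ example, again assuming `statistics.NormalDist` as a stand-in for the table. It also shows the small effect of rounding z to two decimals, which the table requires.

```python
from statistics import NormalDist

mu, sigma = 100, 16               # IQ mean and standard deviation
low, high = 100, 115              # interval of interest

z_exact = (high - mu) / sigma     # 0.9375 before rounding
z_table = round(z_exact, 2)       # 0.94, as looked up in the table

std_normal = NormalDist()
p_table = std_normal.cdf(z_table) - std_normal.cdf(0.0)   # table-style answer
p_exact = NormalDist(mu, sigma).cdf(high) - NormalDist(mu, sigma).cdf(low)
print(z_table, round(p_table, 4), round(p_exact, 4))
```

The table-style value matches the 0.3264 found above; the unrounded z gives a value that differs only in the third decimal place.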
6.4 Notation
Example: Let α=0.05. Let's find z(0.05), the value with P(z>z(0.05))=0.05. This is the same as finding P(z<z(0.05))=1-0.05=0.95. Figures from Johnson & Kuby, 2012.
Example: Same as finding P(z<z(0.05))=0.95, which gives z(0.05)=1.645.
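The z(α) notation corresponds to an inverse lookup. In this sketch the helper name `z_alpha` is hypothetical, and `statistics.NormalDist().inv_cdf` plays the role of reading the table backwards.

```python
from statistics import NormalDist


def z_alpha(alpha):
    """Return z(alpha): the z value with area alpha to its right."""
    return NormalDist().inv_cdf(1 - alpha)


print(round(z_alpha(0.05), 3))    # right-tail area 0.05
print(round(z_alpha(0.025), 2))   # right-tail area 0.025, used for 95% intervals
```

Using 1 - α inside `inv_cdf` mirrors the step above: the area to the right of z(α) is α, so the area to the left is 1 - α.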
6.5 Normal Approximation of the Binomial Distribution
Approximate binomial probabilities with normal areas. Use a normal with $\mu = np$, $\sigma^2 = np(1-p)$.
For n=14, p=1/2: $\mu = (14)(.5) = 7$, $\sigma^2 = (14)(.5)(1-.5) = 3.5$. Figures from Johnson & Kuby, 2012.
n=14, p=1/2. We then approximate binomial probabilities with normal areas.
P(x = 4) from the binomial formula is approximately P(3.5 < x < 4.5) from the normal with μ=7, σ²=3.5.
The ±.5 is called a continuity correction.
n=14, p=1/2, μ=7, σ²=3.5, σ=1.87.
From the binomial formula:
$P(4) = \frac{14!}{4!(14-4)!}(.5)^4(1-.5)^{14-4}$, so P(x = 4) = 0.061.
From the normal distribution, P(3.5 < x < 4.5):
$z_1 = \frac{3.5-7}{1.87} = -1.87$, $z_2 = \frac{4.5-7}{1.87} = -1.34$
P(-1.87 < z < -1.34) = P(z < -1.34) - P(z < -1.87) = 0.0901 - 0.0307 = 0.0594
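Both numbers can be reproduced with the standard library. This sketch uses `math.comb` for the binomial formula and `statistics.NormalDist` for the continuity-corrected normal area; it does not round σ or the z scores, so the approximation differs slightly from the table-based 0.0594.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 14, 0.5, 4

# Exact binomial probability P(x = k).
exact = comb(n, k) * p ** k * (1 - p) ** (n - k)

# Normal approximation with the continuity correction k - 0.5 to k + 0.5.
mu = n * p
sigma = sqrt(n * p * (1 - p))
normal = NormalDist(mu, sigma)
approx = normal.cdf(k + 0.5) - normal.cdf(k - 0.5)

print(round(exact, 4), round(approx, 4))
```

The two results agree to about two decimal places, which is the sense in which the normal area approximates the binomial probability here.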
Questions? Homework: Chapter 6 # 7, 9, 13, 17, 19, 9, 31, 33, 41, 45, 47, 53, 61, 75, 95, 99
Problem Solving Session