Objectives. 3.3 Toward statistical inference

1 Objectives 3.3 Toward statistical inference. Population versus sample (CIS, Chapter 6). Toward statistical inference. Sampling variability. Further reading: http://onlinestatbook.com/2/estimation/characteristics.html (some of the concepts introduced in this link are beyond this class). Adapted from authors' slides, 2012 W.H. Freeman and Company

2 Amazon reviews

3 Which tracker should I buy? How to compare them? The smart Lintelek has the best average rating of 4.8 stars. But the sample size for the smart Lintelek is only 58. The other two trackers have worse reviews (averages 4.2 and 4.4), but their sample sizes are substantially larger, 174 and 261. How can we compare these numbers? If more people are asked, the averages will change. Ideally, we would compare the true averages (the means based on the whole population). Then we could say definitively that people prefer one product to another based on their means.

4 Refresher: Definitions. Population: The entire group of individuals in which we are interested but cannot assess or observe directly. Example: all online consumers. Sample: The part of the population we actually examine and for which we do have data. Example: consumers who have purchased the product. A parameter is a number describing a characteristic of the population. Usually it is the mean, such as the mean rating of a product. A statistic is a number describing a characteristic of a sample. Usually it is the sample average, such as an Amazon rating.

5 Sample 1: Sampling variability. The histogram is a plot of the proportion of ratings given for a particular tracking watch. The mean score is 4.2. Here we sample 10 people and ask them to rate the watch. In this sample all 10 individuals rate the watch 5, giving a mean of 5, which is not the same as 4.2.

6 Sample 2: Sampling variability. The histogram is a plot of the proportion of ratings given for a particular tracking watch. The mean score is 4.2. Here we sample 10 people and ask all of them to rate the watch. In this sample, 2 people give a rating of 1, which drops the average to 3.9.

7 Sampling variability. As illustrated in the previous example, for every sample taken from a population, we are likely to get a different set of individuals and calculate a different value for our statistic (such as the sample mean). This is called sampling variability. This might suggest that the sample and the statistic contain no information about the population. However, the good news is that, if we imagine taking lots of random samples of the same size from a given population, the variation from sample to sample (the sampling distribution) will follow a predictable pattern. All of statistical inference is based on this idea: to see how trustworthy a statistic is, ask what would happen if we kept repeating the sampling many times.

8 We measure the quality of a statistic (such as the sample mean) with: Accuracy (bias). Random samples provide accurate estimates of a parameter because they are unbiased (or close to unbiased, depending on the random sampling method). This is achieved by sampling in a good way (i.e. randomly sampling over the population of interest). Typically we will assume an estimator is unbiased. When reading an article, identify the population of interest and the potential biases which may arise. Reliability (variability). A reliable estimation method is one that would give similar results if the random sampling were repeated. The less variable a statistic, the more reliable it is. Random sampling enables us to measure the variability of a statistic. We do this with the standard error; in the next slide we define what this means. Important: The larger the sample size, the less variable the corresponding statistic will be. To understand the above concepts, look at the question at the end of this page: http://onlinestatbook.com/2/estimation/characteristics.html

9 Measuring Variability. We have come across variability before. Recall that in Chapter 3 we used the standard deviation to measure the variability in the sample. The sample standard deviation measures the typical deviation of each observation from the sample mean: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}$. The same criterion is used to measure the variability in the sample mean (and all other estimators). This is called the standard error. More precisely, we measure the average spread from each estimator to the true mean. This sounds impossible. Remarkably, we can find a very nice expression for the standard error which requires very little effort!

10 What do you mean?? Variability of the sample mean is a very strange notion. But let us do a thought experiment with the aid of StatCrunch. We ask 10 people to review the Lintelek tracker. The average score amongst these 10 people is 5. We ask another 10 people to review the Lintelek tracker. The average score amongst these 10 people is 3.9. We ask another 10 people to review the Lintelek tracker. The average score amongst these 10 people is 4.3. Each of these averages is different; they vary (just like an individual score varies from person to person), but because each is an average, the variation is less than that of an individual score. The mean of the three averages is 4.4, and the variation amongst them is $\sqrt{\frac{1}{2}\left((5-4.4)^2 + (3.9-4.4)^2 + (4.3-4.4)^2\right)} = 0.56$. In reality we want to calculate the standard deviation of all possible averages; this is called the standard error.
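The arithmetic for the three averages can be checked in a few lines. This is a minimal sketch in Python rather than StatCrunch; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, stdev

# Average ratings from the three samples of 10 people each
sample_means = [5.0, 3.9, 4.3]

grand_mean = mean(sample_means)  # centre of the three averages
spread = stdev(sample_means)     # sample standard deviation (divides by n - 1)

print(f"mean of the averages:   {grand_mean:.2f}")
print(f"spread of the averages: {spread:.2f}")
```

With only three averages this is a rough figure; the standard error is the same quantity computed over all possible samples.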

11 Population size does not matter. Question: Would the estimate of the proportion be better if the population size were smaller? For example, 1.5 million students rather than 15 million students. Answer: No. Only the size of the sample, in this case n = 2000, has an influence on its reliability, not the size of the population. Statistical inference is not based on how close the sample size is to the population size (usually we assume that the population is infinite). It is based on the idea that a simple random sample gives a representative sample over the entire population.

12 Summary and what's to come. The techniques of statistics allow us to draw inferences or conclusions about a population using the data from a sample. Your estimate of the population parameter is only as good as your sampling design, so work hard to eliminate biases (design your experiment well). Your sample statistic is only an estimate, and if you randomly sampled again you would probably get a somewhat different result (more of this next). In the next section we will show: The distribution of the estimates (for much of the course it will be the sample mean) will, if the sample size is large enough, be normally distributed even if the observations are not normal. The standard error (reliability) has a simple formula!

13 Objectives 5.1 Sampling distribution of a sample mean (CIS, Chapter 8). The mean and standard deviation of x̄. For normally distributed populations. The central limit theorem (CIS, Chapter 8 and 103). Additional reading: http://onlinestatbook.com/2/sampling_distributions/samp_dist_mean.html

14 Topic: Behavior of sample mean. Learning objectives: Understand how the sampling app in StatCrunch works and why we use it. Understand what drawing a sample from a distribution means, and what sample size means. Understand that the distribution (density curve) of the data can have many different shapes. Sample size cannot change the shape. However, if you take the average of this sample and were able to draw several samples and take the average of each, then the distribution of the average: will have less spread (variability) than the distribution of the data; will be more symmetric than the distribution of the data; will, for large sample sizes, be close to normal (how large depends on how close the original data is to normal). The spread of the sample mean is called the standard error and can be calculated with the formula σ/√n. You will need to use this app later, to check the reliability of the data analysis.

15 Simulation tools used. To demonstrate the concepts, I will be using an applet in StatCrunch called Sampling Distributions. It is highly recommended that you try this out yourself. Applets -> Sampling Distributions. Select the distribution (uniform etc.) or choose the data table (your own data). Press Compute. Choose your sample size (this is how large a sample you use). You will see a 1000 button; this has NOTHING to do with sample size. It is the number of samples you draw. I usually set it to 60K. Press the + sign next to Sampling means to get the QQplot of the distribution of the sample mean. Do not press the + sign next to Samples; this will give you the QQplot of the sample. The ideas are rather sophisticated and it will take time to understand them. Note that you can customize the (parent) distribution from which you sample by left clicking over the parent distribution and moving the cursor to draw the shape you want.

16 Proportions. As a thought experiment we treat the population scores for the rating of the tracker as known. In reality this is never known. Suppose 70% of the population would rate it 5 stars, 10% 4 stars, 3% 3 stars, 6% 2 stars and 11% 1 star. In the following examples people from this distribution will be drawn. This will form the samples.

17 Distribution of average: sample size 5. Let us now look at the distribution of the sample mean of all samples of size 5. That is, we sample 5 people from the population, ask their ratings and evaluate the sample mean. The figure shows the true proportions alongside a sample of 5 people who were questioned and their scores. The average rating of these 5 scores is 4. We do the same thing many, many (36K) times. Every average score out of 5 people is plotted in this histogram. We have made a spreadsheet of 36K averages.
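The same experiment can be imitated outside StatCrunch. The sketch below is an assumption-laden stand-in for the applet: it uses Python's standard library, the rating proportions stated in the thought experiment (70%, 10%, 3%, 6%, 11%), and a fixed random seed chosen only so that the run is repeatable.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

ratings = [5, 4, 3, 2, 1]
weights = [0.70, 0.10, 0.03, 0.06, 0.11]  # population proportions from the thought experiment

def sample_mean(n):
    """Draw n ratings at random from the population and return their average."""
    return fmean(random.choices(ratings, weights=weights, k=n))

# Repeat the "ask 5 people" experiment 36K times, as on the slide
means_5 = [sample_mean(5) for _ in range(36_000)]

center = fmean(means_5)   # should sit near the population mean
spread = pstdev(means_5)  # spread of the sample means (the standard error)

print(f"average of the sample means: {center:.2f}")
print(f"spread of the sample means:  {spread:.2f}")
```

With these rounded proportions the population mean works out to 4.22 and the population standard deviation to about 1.38, so the simulated spread comes out near 1.38/√5, roughly 0.62.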

18 QQplot of average: sample size 5. Because we have a whole load of averages (we made 36K of them!) we can make a QQplot of these sample means, of all samples of size 5 (corresponding to the histogram on the previous page). Observations: The histogram of the sample mean is more bell-shaped than the original distribution. However, it is certainly not normal. But there is less spread in the distribution of the averages than in the original histogram. The QQplot shows a large deviation from normality in the tails.

19 Distribution of average: sample size 20. Let us now look at the distribution of the sample mean of all samples of size 20. That is, we sample 20 people from the population, ask their ratings, and take the sample mean. The figure shows the true proportions alongside a sample of 20 people who were questioned and their scores. We compute the average rating of these 20 scores and do the same thing many, many (60K) times. Every average score out of 20 people is plotted in this histogram.

20 QQplot of average: sample size 20. Let us now look at the QQplot of the sample mean of all samples of size 20 (corresponding to the histogram on the previous page). Observations: 1. The histogram of the sample mean is a lot more bell-shaped than the original distribution. The spikes that were seen for sample size 5 have gone (the bumps you see on the histogram are due to the binwidth choice). 2. There is even less spread in the distribution of the averages than in the original histogram. 3. The QQplot shows a deviation from normality in the top tail of the distribution.

21 1. The QQplot shows some deviation from normality in the top tail of the distribution. 2. This deviation is because the sample mean cannot be greater than 5 (since that is the maximum score), whereas observations from a normal distribution can be.

22 Distribution of average: sample size 50. Let us now look at the distribution of the sample mean of all samples of size 50. That is, we sample 50 people from the population, ask their ratings, and take the sample mean. The figure shows the true proportions alongside a sample of 50 people who were questioned and their scores. We compute the average rating of these 50 scores and do the same thing many, many (60K) times. All the average scores are plotted in this histogram.

23 QQplot of average: sample size 50. Let us now look at the QQplot of the sample mean of all samples of size 50 (corresponding to the histogram on the previous page). Observations: 1. The histogram of the sample mean is pretty much normal. 2. There is even less spread in the distribution of the averages than in the original histogram. 3. The QQplot shows only a very tiny deviation from normality in the tails of the distribution. This small deviation is because of the skewness in the original data.

24 Summary: Different sample sizes

25 Take home message. The distribution of the ratings is definitely not normal. The rating that each individual gives is numerical discrete (1, 2, 3, 4 or 5). The histogram shows that it is left skewed, with a large proportion at 5 stars. The standard deviation is σ = 1.336. If 5 people are interviewed and the average evaluated, the distribution of the average rating is more symmetric, with less spread: 1.336/√5 ≈ 0.60. We call 0.60 the standard error for the sample mean of size 5. If 50 people are interviewed and the average evaluated, the distribution of the average rating is close to normal with far less spread: 1.336/√50 ≈ 0.19. We call 0.19 the standard error for the sample mean of size 50. The spread of the sample means decreases as the sample size increases.

26 Formula for standard error. If the standard deviation of the original data is σ, the formula for the standard error of the sample mean (the standard deviation of the sample mean) when the sample size is n is σ/√n.
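As a small illustration of the formula, the sketch below evaluates σ/√n for the rating example (σ = 1.336). The function name `standard_error` and the extra sample size 500 are illustrative choices, not taken from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

sigma = 1.336  # standard deviation of the ratings in the running example

for n in (5, 50, 500):
    print(f"n = {n:3d}: standard error = {standard_error(sigma, n):.3f}")
```

Because n sits under a square root, a tenfold increase in sample size shrinks the standard error by a factor of about 3.2, not 10.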

27 How does this help when choosing a tracker watch? The spread between the sample means decreases as the number in the sample increases. The implication is that we are more and more likely to close in on the mean as the sample size grows. How do we quantify "more likely"? Usually this would be impossible. But for large sample sizes, the sample mean is close to a normal distribution, centered about the true mean (4.2 in our running example). In Chapter 4 we learnt that if a variable is normally distributed, then the proportion of the population that is within 1.96 standard deviations of the mean is 95%.

28 Applying this result: Near normality of the sample mean implies that 95% of the sample means will be within 1.96 standard errors of the population mean. The standard error is 1.336/√50 = 0.19. If the population mean rating of a product is 4.2, and 50 consumers are asked to rate the product, the proportion of average ratings (amongst 50 consumers) that will be within 1.96 x 1.336/√50 = 1.96 x 0.19 = 0.37 of 4.2 is 95%. In reality we do not know the mean rating of a product. But now that we understand the properties of the sample mean, we can use these results to locate the population mean (we do this in Chapter 6).
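The 0.37 figure can be reproduced with Python's `statistics.NormalDist`. This is a sketch under the slide's assumptions (population mean 4.2, σ = 1.336, n = 50, and a sample mean that is close to normal); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 4.2, 1.336, 50

se = sigma / math.sqrt(n)        # standard error of the sample mean
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a central 95% band
margin = z * se

low, high = mu - margin, mu + margin
sampling_dist = NormalDist(mu, se)  # normal approximation to the sample mean
coverage = sampling_dist.cdf(high) - sampling_dist.cdf(low)

print(f"standard error: {se:.2f}")
print(f"95% of sample means fall within {margin:.2f} of {mu}: ({low:.2f}, {high:.2f})")
print(f"coverage check: {coverage:.3f}")
```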

29 Properties: Sample mean for normally distributed data. When a variable in a population is normally distributed, the sampling distribution of x̄ for all possible samples of size n is also normally distributed. If the population is Normal(μ, σ) then the sampling distribution of the sample mean is Normal(μ, σ/√n). Note that the sample average has less variability than any individual observation.

30 Averages: skewed distributions. The salaries of NBL players are numerical continuous; over 60% earn less than 5 million dollars, but a small percentage earn more than 20 million. It is clear that this data is not normal (the QQplot is very S shaped). On the next slide we look at the distribution of the sample means of NBL salaries based on different sample sizes.

31 Properties: Sample mean of non-normal distributed data. Central Limit Theorem: When randomly sampling from any population with mean μ and standard deviation σ, if n is large enough then the sampling distribution of x̄ is approximately normal: x̄ ~ N(μ, σ/√n). Figure panels: sampling distribution of x̄ for n = 50 observations; sampling distribution of x̄ for n = 500 observations.

32 The spread of the sample mean decreases as the sample size increases. It decreases according to the formula σ/√n (this is called the standard error). The more skewed the original data, the larger the sample size required for the sample mean to be close to normal.
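Both effects can be seen in a small simulation. The sketch below does not use the salary data; it assumes an exponential population (a convenient right-skewed distribution with σ = 1), an arbitrary seed, and hypothetical helper names, and it measures skewness as the average cubed z-score.

```python
import random
from statistics import fmean, pstdev

random.seed(1)  # arbitrary seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: near 0 for symmetric data, positive for right skew."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

def simulated_means(n, repeats=5_000):
    """Sample means of n draws from an exponential (right-skewed) population."""
    return [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(repeats)]

for n in (5, 20, 80):
    means = simulated_means(n)
    print(f"n = {n:2d}: spread = {pstdev(means):.3f}, skewness = {skewness(means):.2f}")
```

The spread should track 1/√n, while the skewness of the averages shrinks more slowly, which is why strongly skewed data needs a larger sample before the normal approximation is trustworthy.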

33 Question Time. The distribution of heights has standard deviation 3. A sample of one person is drawn. What is the standard error for the average based on one? (A) 3/1 = 3 (B) 3/2 = 1.5 (C) It is unknown. A sample of 5 people is taken. What is the spread (standard error) of the average of 5 heights? (A) 3 (B) 3/√5 = 1.34 (C) 3/5 = 0.6

34 Calculations

35 Topic: Calculations. Learning targets: Be able to calculate probabilities for averages based on the average being close to normally distributed. Be able to assess if the sample average is close to normal based on the histogram of the data and the sample size. Be able to assess if the probabilities are accurate/correct based on how close the average is to normal. Be able to do sum-type calculations.

36 Normal data: Calculation Practice 1. In 2010 the combined SAT scores had mean 1016 and standard deviation 212. They also had an approximately normal distribution. The population distribution is Normal(μ = 1016; σ = 212). In Chapter 4, we used the normal distribution to show that the probability of a randomly selected student scoring 1100 or higher is 34.5%. Now, suppose 50 students are randomly selected and their SAT scores averaged. What is the probability that the average is greater than 1100? The sampling distribution of the sample average when n = 50 is Normal(μ = 1016; σ/√n = 212/√50 = 29.98). Using these values, the z-score for 1100 is $z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{1100-1016}{212/\sqrt{50}} = \frac{84}{29.98} = 2.80$. In Table A, the area to the right of 2.80 is about 0.0025. So there is only a 0.25% chance that the average of 50 randomly sampled students is more than 1100. In this example we do not use the CLT because the original data is assumed normal.
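The same calculation can be done without Table A. The sketch below uses Python's `statistics.NormalDist` in place of the table, with the SAT figures from the slide; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n, cutoff = 1016, 212, 50, 1100

# One randomly selected student: use the population standard deviation
p_single = 1 - NormalDist(mu, sigma).cdf(cutoff)

# Average of 50 students: use the standard error sigma / sqrt(n)
se = sigma / math.sqrt(n)
z = (cutoff - mu) / se
p_average = 1 - NormalDist(mu, se).cdf(cutoff)

print(f"P(one score > {cutoff})       = {p_single:.3f}")
print(f"standard error = {se:.2f}, z = {z:.2f}")
print(f"P(average of {n} > {cutoff}) = {p_average:.4f}")
```

The only change between the two probabilities is the denominator: σ for a single score, σ/√n for the average.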

37 On the left is the distribution of the SAT score for one person. The mean grade is 1016, and we see that there is a lot of variation: about 34.5% of students score more than 1100 in their SATs. On the right is the distribution of the average SAT grade over 50 students, that is, the average grade amongst a class of 50. The average grade over 50 is clustered closer to the global average. There is less variation. The proportion of classes with an average grade over 1100 is very small; only 0.25%. I purposely aligned the x-axis in both plots!

38 Normal data: Calculation Practice 2. Hypokalemia is diagnosed when mean blood potassium levels are below 3.5 mEq/dl. Suppose a diagnosis is made based on one blood sample: if the blood sample gives a potassium level less than 3.5, a diagnosis is made (later we explain that this is a wrong strategy and gives rise to too many false positives). Let's assume that we know a patient whose measured potassium levels vary daily according to the Normal(μ = 3.8, σ = 0.2) distribution (this person by definition does not have low potassium since μ = 3.8 > 3.5). If only one measurement is made, what is the probability that this patient will be misdiagnosed with hypokalemia? $z = \frac{x-\mu}{\sigma} = \frac{3.5-3.8}{0.2} = -1.5$, so P(z < -1.5) = 0.0668, about 7%.

39 Normal data: Calculation Practice 2. Instead, suppose measurements are taken on 4 separate days, and the average evaluated. If the average is less than 3.5, low potassium is diagnosed. For the same patient, what is the probability of a misdiagnosis? $z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{3.5-3.8}{0.2/\sqrt{4}} = -3$, so P(z < -3) = 0.0013, about 0.1%.
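To compare the one-measurement and four-measurement strategies side by side, here is a short sketch using the patient's Normal(μ = 3.8, σ = 0.2) distribution and the 3.5 threshold. The helper name `misdiagnosis_probability` is hypothetical.

```python
import math
from statistics import NormalDist

def misdiagnosis_probability(n_measurements, mu=3.8, sigma=0.2, threshold=3.5):
    """Chance that the average of n daily measurements falls below the threshold."""
    se = sigma / math.sqrt(n_measurements)  # standard error of the average
    return NormalDist(mu, se).cdf(threshold)

for n in (1, 4):
    print(f"{n} measurement(s): P(misdiagnosis) = {misdiagnosis_probability(n):.4f}")
```

Averaging four days halves the standard error (0.2 to 0.1), which moves the threshold from 1.5 to 3 standard errors below the mean.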

40 Question Time. Female heights are normally distributed with mean 65 inches and standard deviation 2 inches. The average of 9 (randomly) sampled females is taken. What is the chance that their average height will be less than 63 inches? (A) 99.87% (B) 0.13% (C) 3% (D) 15.8%

41 Non-normal data: Calculation Practice. In Chapter 4 we discussed ACT scores. We argued that because the grades were numerical discrete over a small range, the grade distribution could not be normally distributed. BUT if the sample size is large enough the average will be close to normal. We recall the mean ACT score is 22 with standard deviation 5. Question: 50 students are randomly selected and the average taken. Calculate the proportion of averages which are greater than 20. Answer: The sample mean has the same mean as the original distribution, which we know is 22. The standard error of the sample mean is 5/√50 = 0.71. On the left we have the histogram of ACT scores; on the right we have the histogram of the average based on 50 pupils. The histogram of the average looks more normal.

42 Answer: The sample mean has the same mean as the original distribution, which we know is 22. The standard error of the sample mean is 5/√50 = 0.71. We use this to make the z-transform z = (20 - 22)/0.71 = -2.82. Looking this up in the z-tables (or using a computer), we see that the probability of exceeding it is about 99.8%. This means there is a very large chance that the sample mean of a class of size 50 is greater than 20.
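A computer check of the ACT calculation might look like the following. It is a sketch that relies on the normal approximation for the average of 50 scores (mean 22, standard deviation 5); the function name `prob_mean_above` is hypothetical.

```python
import math
from statistics import NormalDist

def prob_mean_above(cutoff, mu, sigma, n):
    """Normal-approximation probability that the average of n draws exceeds cutoff."""
    se = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, se).cdf(cutoff)

se = 5 / math.sqrt(50)
z = (20 - 22) / se
p = prob_mean_above(20, mu=22, sigma=5, n=50)

print(f"standard error = {se:.2f}, z = {z:.2f}")
print(f"P(average of 50 ACT scores > 20) = {p:.4f}")
```

Using the unrounded standard error gives z of about -2.83 rather than -2.82; the probability is essentially unchanged.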

43 Non-normal data: Calculation Practice 2. Let us return to the weights of calves at 0.5 weeks. Looking at the plot, it seems that a normal density (with mean μ and standard deviation 7.7) is a rough approximation of the underlying distribution of calves' weights (see also the QQplot given at the end of Chapter 4). Question 1: Using the normal density, calculate the proportion of calves that weigh more than 100 pounds. Answer: Make a z-transform, z = (100 - μ)/7.7 = 1.28. This corresponds to 90% in the z-tables. Therefore, if the calf weights follow a normal distribution, 10% of 0.5-week-old calves will weigh more than 100 pounds.

44 Question 2: Let us suppose that the sample mean of 10 calves is taken. From the histogram of the sample mean, we see that it is close to normal. Using the normal distribution, what proportion of the sample means (based on samples of size 10) will be greater than 100 pounds? Answer: The mean of the sample mean is the same as the mean weight of calves, μ. The standard error of the sample mean is 7.7/√10 = 2.4. Making the z-transform z = (100 - μ)/2.4 and looking it up in the z-tables, we see that it is in the far upper tail; the probability is close to 0%.

45 Conceptual understanding. Of the two calf probabilities calculated above, which is likely to be closest to the proportion calculated using the correct distribution? Both probabilities were calculated using the normal distribution. But this is only an approximation of the true distribution of calf weights and of the sample mean of calf weights. From the histogram of calf weights we see that it is only approximately normal. This means it is unlikely that the proportion calculated for the weight of one calf is that precise. On the other hand, the Central Limit Theorem tells us that the distribution of the sample mean gets closer to normal as the sample size grows. The second proportion we calculated was based on the average weight of 10 calves. The distribution of the average is closer to normal than the distribution of one calf. Thus the second proportion will be closer to the proportion obtained from the true distribution of the average.

46 Question Time (windchill 1). Wind chill is the perceived decrease in air temperature felt by the body due to wind. The mean is -28 and the standard deviation is 36. Assuming normality of the average, calculate the chance the average wind chill factor over 3 consecutive days will be more than zero. A. 77.7% B. 78.1% C. 8.9% D. 91.1%

47 Question Time (windchill 2). Based on the plot and QQplot of the data, is the probability calculated on the previous slide close to the true probability one would calculate from the histogram of all averages of sample size 3? A. Yes, because the average will be normal. B. No, because the data is thick tailed and clearly not normal, so an average based on 3 is not enough to invoke the CLT.

48 Windchill average based on 3. We draw many samples each of size 3 from the windchill data and evaluate the average for each sample. A QQplot of these averages is given below. We see that the average based on three is still not normally distributed, and there is a large deviation in the tails, especially above 0 (for the sample mean). This means the probability calculated two slides back, which is based on the normal distribution, will not be close to the true probability (calculated from all averages).

49 Windchill average based on 3. The lack of normality means the probability calculated on the earlier windchill slide, which is based on the normal distribution, will not be close to the true probability (calculated from all averages). The top plot gives the true histogram of the averages; the bottom plot gives the normal approximation. We see the shapes do not match.

50 Windchill average based on 30. We draw many samples each of size 30 from the windchill data and evaluate the average for each sample. A QQplot of these averages is given below. We see that the average based on 30 is very close to normal. This means that, for problems involving the average over 30 days, probabilities calculated using the normal distribution are likely to be accurate.

51 Application to sum problems

52 Sums: Calculation practice. A farmer wants to use a vehicle to carry 30 calves, each 0.5 weeks old. The vehicle he plans to use can carry a maximum load of 2760 pounds. He knows the mean weight μ of a calf and that the standard deviation is 7.7 pounds. What is the chance the vehicle can carry the calves? We need to transform the total weight into an average (sample mean). Observe that requiring the total weight of 30 calves to be less than 2760 pounds is the same as requiring the sample mean weight of 30 calves to be less than 2760/30 = 92: $\sum_{i=1}^{30} X_i < 2760 \;\Rightarrow\; \bar{X} = \frac{1}{30}\sum_{i=1}^{30} X_i < 92$. Therefore, we have turned the problem from totals into averages and can apply the CLT to calculate the probability using the normal distribution.

53 Calculation practice (cont). We know from the central limit theorem that the sample mean is close to normally distributed. Thus the distribution of the sample mean is approximately normal with mean μ and standard deviation 7.7/√30 = 1.4. We know that for the vehicle to carry the calves, the sample mean has to be less than 92 pounds. Calculate the z-transform z = (92 - μ)/1.4 = 1.35 and look up the z-tables to get 0.911. Conclusion: 91.1% of the time the vehicle will be carrying 30 calves legally. $P\left(\sum_{i=1}^{30} X_i < 2760\right) = P\left(\bar{X} = \frac{1}{30}\sum_{i=1}^{30} X_i < 92\right) = P(Z < 1.35) = 0.911$
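The totals-to-averages trick can be written as a small function. One caveat: the calf mean weight does not appear in the text above, so the value 90.1 pounds used below is an assumption, chosen only because it reproduces the stated z-score of about 1.35. The function name `prob_total_below` is also hypothetical.

```python
import math
from statistics import NormalDist

def prob_total_below(max_total, n, mu, sigma):
    """P(sum of n weights < max_total), via the sample mean and the CLT."""
    mean_limit = max_total / n   # total < max_total  <=>  average < max_total / n
    se = sigma / math.sqrt(n)    # standard error of the average
    return NormalDist(mu, se).cdf(mean_limit)

ASSUMED_MEAN_WEIGHT = 90.1  # assumption: back-solved from z = 1.35, not given in the text

p = prob_total_below(max_total=2760, n=30, mu=ASSUMED_MEAN_WEIGHT, sigma=7.7)
print(f"average must stay below {2760 / 30:.0f} pounds")
print(f"P(vehicle can carry 30 calves) = {p:.3f}")
```

Any "total" question of this kind reduces to a sample-mean question once both sides of the inequality are divided by n.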

54 Question Time. Suppose the mean number of cans of lemonade consumed per person at a party is 3 and the standard deviation is one. A host is planning a party and is trying to decide how many cans of lemonade to purchase. Suppose 200 people attend the party. Using the normal distribution (since averages based on a sample size of 200 may be close to normal), what is the chance/probability that they will need to purchase more than 700 cans of lemonade? (A) 0% (B) 0.9% (C) 99.1% (D) 100%

55 How large is a large enough sample size? It depends on the population distribution. More observations are required if the population distribution has a large standard deviation or if it is far from normal. A sample size of 25 is generally enough to obtain an approximately normal sampling distribution from a population with some skewness or even mild outliers. A sample size of 40 will typically be good enough to overcome some skewness and outliers. More importantly, n should be large enough to make the standard error sufficiently small; then we can get meaningful and precise inferences. We can check this by using the Sampling Distributions applet. In many cases, even n = 40 is not large enough to give sufficiently reliable results when there is a lot at stake. This is why clinical trials, political polls and marketing surveys typically observe hundreds or even thousands of individuals.

56 The effect of skewness on the CLT. Below we look at the sample mean taken from data with a large right skew. The first distribution is clearly right skewed. The second is the distribution of the average of 20 observations drawn from that right-skewed distribution. We see that the means of both distributions are aligned (as we would expect). The distribution of the average is less skewed, but it is still skewed.

57 The corresponding QQplot of the sample mean. Observations: 1. The QQplot deviates from normality, especially in the tails. The distribution of the sample mean still has a slight right skew (look back at the QQplots in Chapter 4). This demonstrates that when data is highly skewed, we need a much larger sample size for the CLT to kick in. 2. Calculations based on normality of the average will not be completely correct.

58 Effect of binary data on the CLT. Binary data arises in several situations, wherever there are only two possible choices, e.g. Like or Dislike. In this example, we have encoded one outcome as zero and the other as 1 (it does not really matter which way). We see that the proportion in the "one" category is about 20%; this is what is meant by the mean. This data is discrete and clearly skewed.

59 The corresponding QQplot of the sample mean. Observations: 1. We see that the standard error is 0.405/√50 ≈ 0.057, which is as it should be. 2. However, the QQplot deviates far from normality. The horizontal lines demonstrate that the average over 50 still takes discrete values (though not integers). We also see a U shape, which shows that the sample mean is still skewed. 3. Calculations based on normality of the average will not be completely correct.

60 Question Time. Let's consider the very large database of individual incomes from the Bureau of Labor Statistics as our population. Income is strongly right skewed. We take 1000 SRSs of 25 incomes, calculate the sample mean for each, and make a histogram of these 1000 means. We also take 1000 SRSs of 100 incomes, calculate the sample mean for each, and make a histogram of these 1000 means. Which histogram corresponds to samples of size 100? Which to samples of size 25? (A) Left = sample size 25, Right = sample size 100. (B) Left = sample size 100, Right = sample size 25.

61 So many standard deviations! In statistics we talk about different kinds of standard deviations, and it can be hard to keep track of them: s is the standard deviation of a set (sample) of data. It is a statistic we can compute once we have the data. σ is the standard deviation of a population (which is much too big to observe completely). It is a parameter; usually, we will never know its true value. σ/√n is the standard deviation of the values of x̄ from all possible random samples of size n. It refers to the sample mean, not to data. It is also called the standard error of x̄. s/√n is our estimate of σ/√n, since we do not know the value of σ. Example: from a survey of students taking statistics, n = 459 responded to the question "How many Facebook friends do you have?" The sample standard deviation was s = 589.5, so the standard error for the sample mean x̄ is s/√n = 589.5/√459 ≈ 27.5. Here x̄ is an estimate for μ, the mean of the population of all students required to take the class, and s is an estimate for the population standard deviation σ.
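The estimated standard error for the Facebook survey follows directly from s and n. The sketch below is illustrative only; the variable names are not from the slides.

```python
import math

n = 459    # number of students who responded
s = 589.5  # sample standard deviation of the number of Facebook friends

# s / sqrt(n) estimates sigma / sqrt(n), because sigma itself is unknown
estimated_se = s / math.sqrt(n)

print(f"estimated standard error of the sample mean: {estimated_se:.1f}")
```

Note how much smaller this is than s: individual students vary by hundreds of friends, but the average of 459 students varies by only a few dozen.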

62 Summary. x̄ is always unbiased for μ, even if the population's distribution is very different from a normal distribution. The standard deviation of x̄, σ/√n, measures the variability due to random sampling. If the population is approximately normal or if the sample size n is large, we can use the normal distribution to compute probabilities for x̄. We just have to remember to use σ/√n, not σ, in the denominator when calculating z. This means we can say something about how close x̄ is likely to be to μ. Generally it is quite likely (95% chance) that it will be within 2 standard errors of μ. Not all variables are normally distributed and large samples are not always attainable. In such circumstances, a statistician should be consulted for proper methods of statistical inference and calculation.

63 Accompanying problems associated with this Chapter: Quiz 5, Quiz 6, Homework 2 Q6, Homework 3.


A random variable X is a function that assigns (real) numbers to the elements of the sample space S of a random experiment. RANDOM VARIABLES and PROBABILITY DISTRIBUTIONS A random variable X is a function that assigns (real) numbers to the elements of the samle sace S of a random exeriment. The value sace V of a random variable

More information

SINGLE SAMPLING PLAN FOR VARIABLES UNDER MEASUREMENT ERROR FOR NON-NORMAL DISTRIBUTION

SINGLE SAMPLING PLAN FOR VARIABLES UNDER MEASUREMENT ERROR FOR NON-NORMAL DISTRIBUTION ISSN -58 (Paer) ISSN 5-5 (Online) Vol., No.9, SINGLE SAMPLING PLAN FOR VARIABLES UNDER MEASUREMENT ERROR FOR NON-NORMAL DISTRIBUTION Dr. ketki kulkarni Jayee University of Engineering and Technology Guna

More information

INDEX NUMBERS. Introduction

INDEX NUMBERS. Introduction INDEX NUMBERS Introduction Index numbers are the indicators which reflect changes over a secified eriod of time in rices of different commodities industrial roduction (iii) sales (iv) imorts and exorts

More information

Ordering a deck of cards... Lecture 3: Binomial Distribution. Example. Permutations & Combinations

Ordering a deck of cards... Lecture 3: Binomial Distribution. Example. Permutations & Combinations Ordering a dec of cards... Lecture 3: Binomial Distribution Sta 111 Colin Rundel May 16, 2014 If you have ever shuffled a dec of cards you have done something no one else has ever done before or will ever

More information

Sampling Distributions For Counts and Proportions

Sampling Distributions For Counts and Proportions Sampling Distributions For Counts and Proportions IPS Chapter 5.1 2009 W. H. Freeman and Company Objectives (IPS Chapter 5.1) Sampling distributions for counts and proportions Binomial distributions for

More information

6.1, 7.1 Estimating with confidence (CIS: Chapter 10)

6.1, 7.1 Estimating with confidence (CIS: Chapter 10) Objectives 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) Statistical confidence (CIS gives a good explanation of a 95% CI) Confidence intervals Choosing the sample size t distributions One-sample

More information

LECTURE NOTES ON MICROECONOMICS

LECTURE NOTES ON MICROECONOMICS LECTURE NOTES ON MCROECONOMCS ANALYZNG MARKETS WTH BASC CALCULUS William M. Boal Part : Consumers and demand Chater 5: Demand Section 5.: ndividual demand functions Determinants of choice. As noted in

More information

Confidence Intervals for a Proportion Using Inverse Sampling when the Data is Subject to False-positive Misclassification

Confidence Intervals for a Proportion Using Inverse Sampling when the Data is Subject to False-positive Misclassification Journal of Data Science 13(015), 63-636 Confidence Intervals for a Proortion Using Inverse Samling when the Data is Subject to False-ositive Misclassification Kent Riggs 1 1 Deartment of Mathematics and

More information

Lecture 6: Chapter 6

Lecture 6: Chapter 6 Lecture 6: Chapter 6 C C Moxley UAB Mathematics 3 October 16 6.1 Continuous Probability Distributions Last week, we discussed the binomial probability distribution, which was discrete. 6.1 Continuous Probability

More information

Module 4: Probability

Module 4: Probability Module 4: Probability 1 / 22 Probability concepts in statistical inference Probability is a way of quantifying uncertainty associated with random events and is the basis for statistical inference. Inference

More information

Supplemental Material: Buyer-Optimal Learning and Monopoly Pricing

Supplemental Material: Buyer-Optimal Learning and Monopoly Pricing Sulemental Material: Buyer-Otimal Learning and Monooly Pricing Anne-Katrin Roesler and Balázs Szentes February 3, 207 The goal of this note is to characterize buyer-otimal outcomes with minimal learning

More information

Central Limit Theorem

Central Limit Theorem Central Limit Theorem Lots of Samples 1 Homework Read Sec 6-5. Discussion Question pg 329 Do Ex 6-5 8-15 2 Objective Use the Central Limit Theorem to solve problems involving sample means 3 Sample Means

More information

DO NOT POST THESE ANSWERS ONLINE BFW Publishers 2014

DO NOT POST THESE ANSWERS ONLINE BFW Publishers 2014 Section 7.2 Check Your Understanding, age 445: 1. The mean of the samling distribution of ˆ is equal to the oulation roortion. In this case µ ˆ = = 0.75. 2. The standard deviation of the samling distribution

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Review of previous lecture: Why confidence intervals? Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Suppose you want to know the

More information

Chapter 7 Sampling Distributions and Point Estimation of Parameters

Chapter 7 Sampling Distributions and Point Estimation of Parameters Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences

More information

Asian Economic and Financial Review A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION. Ben David Nissim.

Asian Economic and Financial Review A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION. Ben David Nissim. Asian Economic and Financial Review journal homeage: htt://www.aessweb.com/journals/5 A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION Ben David Nissim Deartment of Economics and Management,

More information

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality Point Estimation Some General Concepts of Point Estimation Statistical inference = conclusions about parameters Parameters == population characteristics A point estimate of a parameter is a value (based

More information

Chapter 1: Stochastic Processes

Chapter 1: Stochastic Processes Chater 1: Stochastic Processes 4 What are Stochastic Processes, and how do they fit in? STATS 210 Foundations of Statistics and Probability Tools for understanding randomness (random variables, distributions)

More information

Chapter 6: The Normal Distribution

Chapter 6: The Normal Distribution Chapter 6: The Normal Distribution Diana Pell Section 6.1: Normal Distributions Note: Recall that a continuous variable can assume all values between any two given values of the variables. Many continuous

More information

FUNDAMENTAL ECONOMICS - Economics Of Uncertainty And Information - Giacomo Bonanno ECONOMICS OF UNCERTAINTY AND INFORMATION

FUNDAMENTAL ECONOMICS - Economics Of Uncertainty And Information - Giacomo Bonanno ECONOMICS OF UNCERTAINTY AND INFORMATION ECONOMICS OF UNCERTAINTY AND INFORMATION Giacomo Bonanno Deartment of Economics, University of California, Davis, CA 9566-8578, USA Keywords: adverse selection, asymmetric information, attitudes to risk,

More information

Application of Monte-Carlo Tree Search to Traveling-Salesman Problem

Application of Monte-Carlo Tree Search to Traveling-Salesman Problem R4-14 SASIMI 2016 Proceedings Alication of Monte-Carlo Tree Search to Traveling-Salesman Problem Masato Shimomura Yasuhiro Takashima Faculty of Environmental Engineering University of Kitakyushu Kitakyushu,

More information

Chapter 6: The Normal Distribution

Chapter 6: The Normal Distribution Chapter 6: The Normal Distribution Diana Pell Section 6.1: Normal Distributions Note: Recall that a continuous variable can assume all values between any two given values of the variables. Many continuous

More information

Professor Huihua NIE, PhD School of Economics, Renmin University of China HOLD-UP, PROPERTY RIGHTS AND REPUTATION

Professor Huihua NIE, PhD School of Economics, Renmin University of China   HOLD-UP, PROPERTY RIGHTS AND REPUTATION Professor uihua NIE, PhD School of Economics, Renmin University of China E-mail: niehuihua@gmail.com OD-UP, PROPERTY RIGTS AND REPUTATION Abstract: By introducing asymmetric information of investors abilities

More information

Quantitative Aggregate Effects of Asymmetric Information

Quantitative Aggregate Effects of Asymmetric Information Quantitative Aggregate Effects of Asymmetric Information Pablo Kurlat February 2012 In this note I roose a calibration of the model in Kurlat (forthcoming) to try to assess the otential magnitude of the

More information

Interval estimation. September 29, Outline Basic ideas Sampling variation and CLT Interval estimation using X More general problems

Interval estimation. September 29, Outline Basic ideas Sampling variation and CLT Interval estimation using X More general problems Interval estimation September 29, 2017 STAT 151 Class 7 Slide 1 Outline of Topics 1 Basic ideas 2 Sampling variation and CLT 3 Interval estimation using X 4 More general problems STAT 151 Class 7 Slide

More information

CH 5 Normal Probability Distributions Properties of the Normal Distribution

CH 5 Normal Probability Distributions Properties of the Normal Distribution Properties of the Normal Distribution Example A friend that is always late. Let X represent the amount of minutes that pass from the moment you are suppose to meet your friend until the moment your friend

More information

Chapter 11: Inference for Distributions Inference for Means of a Population 11.2 Comparing Two Means

Chapter 11: Inference for Distributions Inference for Means of a Population 11.2 Comparing Two Means Chapter 11: Inference for Distributions 11.1 Inference for Means of a Population 11.2 Comparing Two Means 1 Population Standard Deviation In the previous chapter, we computed confidence intervals and performed

More information

Chapter 5. Sampling Distributions

Chapter 5. Sampling Distributions Lecture notes, Lang Wu, UBC 1 Chapter 5. Sampling Distributions 5.1. Introduction In statistical inference, we attempt to estimate an unknown population characteristic, such as the population mean, µ,

More information

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage 6 Point Estimation Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Point Estimation Statistical inference: directed toward conclusions about one or more parameters. We will use the generic

More information

Do Poorer Countries Have Less Capacity for Redistribution?

Do Poorer Countries Have Less Capacity for Redistribution? Public Disclosure Authorized Public Disclosure Authorized Public Disclosure Authorized Public Disclosure Authorized Policy Research Working Paer 5046 Do Poorer Countries Have Less Caacity for Redistribution?

More information

Sampling and sampling distribution

Sampling and sampling distribution Sampling and sampling distribution September 12, 2017 STAT 101 Class 5 Slide 1 Outline of Topics 1 Sampling 2 Sampling distribution of a mean 3 Sampling distribution of a proportion STAT 101 Class 5 Slide

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://wwwstattamuedu/~suhasini/teachinghtml Suhasini Subba Rao Review of previous lecture The main idea in the previous lecture is that the sample

More information

Midterm Exam III Review

Midterm Exam III Review Midterm Exam III Review Dr. Joseph Brennan Math 148, BU Dr. Joseph Brennan (Math 148, BU) Midterm Exam III Review 1 / 25 Permutations and Combinations ORDER In order to count the number of possible ways

More information

The Impact of Flexibility And Capacity Allocation On The Performance of Primary Care Practices

The Impact of Flexibility And Capacity Allocation On The Performance of Primary Care Practices University of Massachusetts Amherst ScholarWorks@UMass Amherst Masters Theses 1911 - February 2014 2010 The Imact of Flexibility And Caacity Allocation On The Performance of Primary Care Practices Liang

More information

Sampling Distributions

Sampling Distributions AP Statistics Ch. 7 Notes Sampling Distributions A major field of statistics is statistical inference, which is using information from a sample to draw conclusions about a wider population. Parameter:

More information

The Normal Distribution

The Normal Distribution 5.1 Introduction to Normal Distributions and the Standard Normal Distribution Section Learning objectives: 1. How to interpret graphs of normal probability distributions 2. How to find areas under the

More information

Homework 10 Solution Section 4.2, 4.3.

Homework 10 Solution Section 4.2, 4.3. MATH 00 Homewor Homewor 0 Solution Section.,.3. Please read your writing again before moving to the next roblem. Do not abbreviate your answer. Write everything in full sentences. Write your answer neatly.

More information

5.1 Mean, Median, & Mode

5.1 Mean, Median, & Mode 5.1 Mean, Median, & Mode definitions Mean: Median: Mode: Example 1 The Blue Jays score these amounts of runs in their last 9 games: 4, 7, 2, 4, 10, 5, 6, 7, 7 Find the mean, median, and mode: Example 2

More information

Statistics and Probability

Statistics and Probability Statistics and Probability Continuous RVs (Normal); Confidence Intervals Outline Continuous random variables Normal distribution CLT Point estimation Confidence intervals http://www.isrec.isb-sib.ch/~darlene/geneve/

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

Analysing indicators of performance, satisfaction, or safety using empirical logit transformation

Analysing indicators of performance, satisfaction, or safety using empirical logit transformation Analysing indicators of erformance, satisfaction, or safety using emirical logit transformation Sarah Stevens,, Jose M Valderas, Tim Doran, Rafael Perera,, Evangelos Kontoantelis,5 Nuffield Deartment of

More information

8.1 Estimation of the Mean and Proportion

8.1 Estimation of the Mean and Proportion 8.1 Estimation of the Mean and Proportion Statistical inference enables us to make judgments about a population on the basis of sample information. The mean, standard deviation, and proportions of a population

More information

CS522 - Exotic and Path-Dependent Options

CS522 - Exotic and Path-Dependent Options CS522 - Exotic and Path-Deendent Otions Tibor Jánosi May 5, 2005 0. Other Otion Tyes We have studied extensively Euroean and American uts and calls. The class of otions is much larger, however. A digital

More information

Swings in the Economic Support Ratio and Income Inequality by Sang-Hyop Lee and Andrew Mason 1

Swings in the Economic Support Ratio and Income Inequality by Sang-Hyop Lee and Andrew Mason 1 Swings in the Economic Suort Ratio and Income Inequality by Sang-Hyo Lee and Andrew Mason 1 Draft May 3, 2002 When oulations are young, income inequality deends on the distribution of earnings and wealth

More information

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Part 1: Introduction Sampling Distributions & the Central Limit Theorem Point Estimation & Estimators Sections 7-1 to 7-2 Sample data

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Checking for

More information

EVIDENCE OF ADVERSE SELECTION IN CROP INSURANCE MARKETS

EVIDENCE OF ADVERSE SELECTION IN CROP INSURANCE MARKETS The Journal of Risk and Insurance, 2001, Vol. 68, No. 4, 685-708 EVIDENCE OF ADVERSE SELECTION IN CROP INSURANCE MARKETS Shiva S. Makki Agai Somwaru INTRODUCTION ABSTRACT This article analyzes farmers

More information

Appendix Large Homogeneous Portfolio Approximation

Appendix Large Homogeneous Portfolio Approximation Aendix Large Homogeneous Portfolio Aroximation A.1 The Gaussian One-Factor Model and the LHP Aroximation In the Gaussian one-factor model, an obligor is assumed to default if the value of its creditworthiness

More information

Individual Comparative Advantage and Human Capital Investment under Uncertainty

Individual Comparative Advantage and Human Capital Investment under Uncertainty Individual Comarative Advantage and Human Caital Investment under Uncertainty Toshihiro Ichida Waseda University July 3, 0 Abstract Secialization and the division of labor are the sources of high roductivity

More information

1 < = α σ +σ < 0. Using the parameters and h = 1/365 this is N ( ) = If we use h = 1/252, the value would be N ( ) =

1 < = α σ +σ < 0. Using the parameters and h = 1/365 this is N ( ) = If we use h = 1/252, the value would be N ( ) = Chater 6 Value at Risk Question 6.1 Since the rice of stock A in h years (S h ) is lognormal, 1 < = α σ +σ < 0 ( ) P Sh S0 P h hz σ α σ α = P Z < h = N h. σ σ (1) () Using the arameters and h = 1/365 this

More information

Part V - Chance Variability

Part V - Chance Variability Part V - Chance Variability Dr. Joseph Brennan Math 148, BU Dr. Joseph Brennan (Math 148, BU) Part V - Chance Variability 1 / 78 Law of Averages In Chapter 13 we discussed the Kerrich coin-tossing experiment.

More information

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede,

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, mb8@ecs.soton.ac.uk The normal distribution The normal distribution is the classic "bell curve". We've seen that

More information

Statistical Intervals (One sample) (Chs )

Statistical Intervals (One sample) (Chs ) 7 Statistical Intervals (One sample) (Chs 8.1-8.3) Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to normally distributed with expected value µ and

More information

A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION

A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION JYH-JIUAN LIN 1, CHING-HUI CHANG * AND ROSEMARY JOU 1 Deartment of Statistics Tamkang University 151 Ying-Chuan Road,

More information

CHAPTER 5 SAMPLING DISTRIBUTIONS

CHAPTER 5 SAMPLING DISTRIBUTIONS CHAPTER 5 SAMPLING DISTRIBUTIONS Sampling Variability. We will visualize our data as a random sample from the population with unknown parameter μ. Our sample mean Ȳ is intended to estimate population mean

More information

Games with more than 1 round

Games with more than 1 round Games with more than round Reeated risoner s dilemma Suose this game is to be layed 0 times. What should you do? Player High Price Low Price Player High Price 00, 00-0, 00 Low Price 00, -0 0,0 What if

More information

Introduction to Statistics I

Introduction to Statistics I Introduction to Statistics I Keio University, Faculty of Economics Continuous random variables Simon Clinet (Keio University) Intro to Stats November 1, 2018 1 / 18 Definition (Continuous random variable)

More information

: now we have a family of utility functions for wealth increments z indexed by initial wealth w.

: now we have a family of utility functions for wealth increments z indexed by initial wealth w. Lotteries with Money Payoffs, continued Fix u, let w denote wealth, and set u ( z) u( z w) : now we have a family of utility functions for wealth increments z indexed by initial wealth w. (a) Recall from

More information

ON JARQUE-BERA TESTS FOR ASSESSING MULTIVARIATE NORMALITY

ON JARQUE-BERA TESTS FOR ASSESSING MULTIVARIATE NORMALITY Journal of Statistics: Advances in Theory and Alications Volume, umber, 009, Pages 07-0 O JARQUE-BERA TESTS FOR ASSESSIG MULTIVARIATE ORMALITY KAZUYUKI KOIZUMI, AOYA OKAMOTO and TAKASHI SEO Deartment of

More information

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed.

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed. The Central Limit Theorem The central limit theorem (clt for short) is one of the most powerful and useful ideas in all of statistics. The clt says that if we collect samples of size n with a "large enough

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

1. Variability in estimates and CLT

1. Variability in estimates and CLT Unit3: Foundationsforinference 1. Variability in estimates and CLT Sta 101 - Fall 2015 Duke University, Department of Statistical Science Dr. Çetinkaya-Rundel Slides posted at http://bit.ly/sta101_f15

More information

STAT 201 Chapter 6. Distribution

STAT 201 Chapter 6. Distribution STAT 201 Chapter 6 Distribution 1 Random Variable We know variable Random Variable: a numerical measurement of the outcome of a random phenomena Capital letter refer to the random variable Lower case letters

More information

Chapter Seven: Confidence Intervals and Sample Size

Chapter Seven: Confidence Intervals and Sample Size Chapter Seven: Confidence Intervals and Sample Size A point estimate is: The best point estimate of the population mean µ is the sample mean X. Three Properties of a Good Estimator 1. Unbiased 2. Consistent

More information

Statistics 511 Supplemental Materials

Statistics 511 Supplemental Materials Gaussian (or Normal) Random Variable In this section we introduce the Gaussian Random Variable, which is more commonly referred to as the Normal Random Variable. This is a random variable that has a bellshaped

More information

Statistics 431 Spring 2007 P. Shaman. Preliminaries

Statistics 431 Spring 2007 P. Shaman. Preliminaries Statistics 4 Spring 007 P. Shaman The Binomial Distribution Preliminaries A binomial experiment is defined by the following conditions: A sequence of n trials is conducted, with each trial having two possible

More information

2/20/2013. of Manchester. The University COMP Building a yes / no classifier

2/20/2013. of Manchester. The University COMP Building a yes / no classifier COMP4 Lecture 6 Building a yes / no classifier Buildinga feature-basedclassifier Whatis a classifier? What is an information feature? Building a classifier from one feature Probability densities and the

More information

Nicole Dalzell. July 7, 2014

Nicole Dalzell. July 7, 2014 UNIT 2: PROBABILITY AND DISTRIBUTIONS LECTURE 2: NORMAL DISTRIBUTION STATISTICS 101 Nicole Dalzell July 7, 2014 Announcements Short Quiz Today Statistics 101 (Nicole Dalzell) U2 - L2: Normal distribution

More information

STA 320 Fall Thursday, Dec 5. Sampling Distribution. STA Fall

STA 320 Fall Thursday, Dec 5. Sampling Distribution. STA Fall STA 320 Fall 2013 Thursday, Dec 5 Sampling Distribution STA 320 - Fall 2013-1 Review We cannot tell what will happen in any given individual sample (just as we can not predict a single coin flip in advance).

More information

Chapter 7: Point Estimation and Sampling Distributions

Chapter 7: Point Estimation and Sampling Distributions Chapter 7: Point Estimation and Sampling Distributions Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 20 Motivation In chapter 3, we learned

More information

Summary of the Chief Features of Alternative Asset Pricing Theories

Summary of the Chief Features of Alternative Asset Pricing Theories Summary o the Chie Features o Alternative Asset Pricing Theories CAP and its extensions The undamental equation o CAP ertains to the exected rate o return time eriod into the uture o any security r r β

More information

Lecture 8 - Sampling Distributions and the CLT

Lecture 8 - Sampling Distributions and the CLT Lecture 8 - Sampling Distributions and the CLT Statistics 102 Kenneth K. Lopiano September 18, 2013 1 Basics Improvements 2 Variability of Estimates Activity Sampling distributions - via simulation Sampling

More information

10/1/2012. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1

10/1/2012. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Pivotal subject: distributions of statistics. Foundation linchpin important crucial You need sampling distributions to make inferences:

More information

6.4 Confidence Interval for the Difference between Two Population Means

6.4 Confidence Interval for the Difference between Two Population Means 6.4 Confidence Interval for the Difference between Two Poulation Means If we draw two samles from two indeendent oulations with means and and variances σ and σ, resectively, and we want to construct the

More information

Dr. Maddah ENMG 625 Financial Eng g II 10/16/06. Chapter 11 Models of Asset Dynamics (1)

Dr. Maddah ENMG 625 Financial Eng g II 10/16/06. Chapter 11 Models of Asset Dynamics (1) Dr Maddah ENMG 65 Financial Eng g II 0/6/06 Chater Models of Asset Dynamics () Overview Stock rice evolution over time is commonly modeled with one of two rocesses: The binomial lattice and geometric Brownian

More information

Math 227 Elementary Statistics. Bluman 5 th edition

Math 227 Elementary Statistics. Bluman 5 th edition Math 227 Elementary Statistics Bluman 5 th edition CHAPTER 6 The Normal Distribution 2 Objectives Identify distributions as symmetrical or skewed. Identify the properties of the normal distribution. Find

More information

Information and uncertainty in a queueing system

Information and uncertainty in a queueing system Information and uncertainty in a queueing system Refael Hassin December 7, 7 Abstract This aer deals with the effect of information and uncertainty on rofits in an unobservable single server queueing system.

More information

Chapter 8 Estimation

Chapter 8 Estimation Chapter 8 Estimation There are two important forms of statistical inference: estimation (Confidence Intervals) Hypothesis Testing Statistical Inference drawing conclusions about populations based on samples

More information

Confidence Intervals and Sample Size

Confidence Intervals and Sample Size Confidence Intervals and Sample Size Chapter 6 shows us how we can use the Central Limit Theorem (CLT) to 1. estimate a population parameter (such as the mean or proportion) using a sample, and. determine

More information

Annex 4 - Poverty Predictors: Estimation and Algorithm for Computing Predicted Welfare Function

Annex 4 - Poverty Predictors: Estimation and Algorithm for Computing Predicted Welfare Function Annex 4 - Poverty Predictors: Estimation and Algorithm for Comuting Predicted Welfare Function The Core Welfare Indicator Questionnaire (CWIQ) is an off-the-shelf survey ackage develoed by the World Bank

More information

Sampling Distributions and the Central Limit Theorem

Sampling Distributions and the Central Limit Theorem Sampling Distributions and the Central Limit Theorem February 18 Data distributions and sampling distributions So far, we have discussed the distribution of data (i.e. of random variables in our sample,

More information

Chapter 7. Sampling Distributions

Chapter 7. Sampling Distributions Chapter 7 Sampling Distributions Section 7.1 Sampling Distributions and the Central Limit Theorem Sampling Distributions Sampling distribution The probability distribution of a sample statistic. Formed

More information

Statistics and Probability Letters. Variance stabilizing transformations of Poisson, binomial and negative binomial distributions

Statistics and Probability Letters. Variance stabilizing transformations of Poisson, binomial and negative binomial distributions Statistics and Probability Letters 79 (9) 6 69 Contents lists available at ScienceDirect Statistics and Probability Letters journal homeage: www.elsevier.com/locate/staro Variance stabilizing transformations

More information

CHAPTER 5 Sampling Distributions

CHAPTER 5 Sampling Distributions CHAPTER 5 Sampling Distributions 5.1 The possible values of p^ are 0, 1/3, 2/3, and 1. These correspond to getting 0 persons with lung cancer, 1 with lung cancer, 2 with lung cancer, and all 3 with lung

More information

Lecture 9. Probability Distributions. Outline. Outline

Lecture 9. Probability Distributions. Outline. Outline Outline Lecture 9 Probability Distributions 6-1 Introduction 6- Probability Distributions 6-3 Mean, Variance, and Expectation 6-4 The Binomial Distribution Outline 7- Properties of the Normal Distribution

More information
