Solutions for practice questions: Chapter 9, Statistics
- Dustin Thomas
- 5 years ago
If you find any errors, please let me know.

1. We know that µ is the mean of 30 values of y, that $\sum_{i=1}^{30} y_i = 360$, and that $\sum_{i=1}^{30} (y_i - \mu)^2 = 925$.

a) The mean can be calculated by adding up all the values and dividing by the number of values. Conveniently, we are given both the fact that there are 30 values (that's what numbering them from 1 to 30 does for you) and that the sum of the values is 360 (that's the first summation). Therefore µ = 360/30 = 12.

b) The standard deviation is probably not a formula you can just recite in the way you can the mean. However, there is a way to get it without the raw data. Back on p. 294, you'll find two different formulas which work out to the very same thing. One is called the sample standard deviation, and the other is the population standard deviation, but in the world of IB [1], they're computed identically: $\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$. Notice that the numerator of that fraction is the other given sum. So $\sigma = \sqrt{925/30} \approx 5.55$.

[1] But not in the world of AP Statistics, so if you live in the world of AP Stats too, you need to know the difference. In normal-not-IB stats, the sample standard deviation has n − 1 in the denominator.

2. So we know the mean of the data set, but not the number in the group that took 40 minutes to get to school in the morning. The mean is the sum of the times divided by the total number of students. I'll call the missing value m, for missing. Setting the mean equal to 34 gives 34(11 + m) = 350 + 40m, so 6m = 24, and m = 4 students.

3. Fifty measurements. Great.

a) The question specifies that the first interval should start at 1.6, and that the intervals should be 0.5 units wide. (Shouldn't there be units on time? Whatever. [2]) Because I write my intervals both carefully and correctly, I know that I should use inequalities, like 1.6 ≤ x < 2.1, and so on. You know that, right? I am using x to represent the time.
[Table: Time Taken against Frequency [3], in ten classes 1.6 ≤ x < 2.1, 2.1 ≤ x < 2.6, and so on up to 6.1 ≤ x < 6.6; the frequencies are missing here.]

[2] But while I'm complaining, why are we starting at 1.6 and not 1.5? The graph would make more sense to me if we had.

[3] Not only did BoB not write the intervals correctly, his counts are off. These are correct.
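As a check on the arithmetic in question 1, here is a minimal Python sketch that computes the mean and the standard deviation directly from the two given sums. The function names are made up for illustration; the `sample` switch shows the n − 1 convention mentioned in footnote 1, which is not the one used in these solutions.

```python
import math


def mean_from_total(total, n):
    """Mean from a known sum of values and a known count."""
    return total / n


def sd_from_squared_deviations(sum_sq_dev, n, sample=False):
    """Standard deviation from the sum of squared deviations about the mean.

    sample=False divides by n (the convention used in these solutions);
    sample=True divides by n - 1 (the sample convention from the footnote).
    """
    divisor = n - 1 if sample else n
    return math.sqrt(sum_sq_dev / divisor)


n = 30
mu = mean_from_total(360, n)                          # sum of the y values
sigma = sd_from_squared_deviations(925, n)            # divide by n
s = sd_from_squared_deviations(925, n, sample=True)   # divide by n - 1

print(f"mean = {mu:.2f}, sigma = {sigma:.2f}, s = {s:.2f}")
```

With 30 values the two conventions differ only in the second decimal place, which is why the distinction is easy to overlook.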
Rather than drawing the histogram by hand, I am using technology (for what I am assuming are obvious reasons related to how you are reading this). You should be able to set this up on graph paper, with a regular, ordinary x-axis, and using a straightedge for the axes and the bars.

b) All but seven of the values are less than 5.1, so that would be a fraction of 43/50 less than 5.1.

c) Based on the squares I can count, there are 19 values to the right of the 3.6 ≤ x < 4.1 class, and 17 to its left. So the median should be somewhere near the middle of that class. To be more precise than "somewhere near the middle": of the 14 pieces of data in the class, eight must be less than the median and six greater than the median. The median is between the eighth and ninth values. So we need a number which is 8.5/14 of the way up the interval, from the bottom. The intervals are 0.5 wide, so this gives 3.6 + (8.5/14)(0.5) ≈ 3.90.

d) The standard deviation is certainly something you would be expected to find with a calculator, so there's no reason not to use it to find the mean, too. Notice that the numbers in the time column are the midpoints of each interval. I used the one-variable statistics command with one list and frequencies in the list I've titled freq. The mean is the 3.93 that's highlighted; the standard deviation we need is σ, not s. So σ ≈ […].

e) A cumulative frequency graph starts with a cumulative frequency table.

[Table: Time Taken, Freq, and Cum. Freq for the same ten classes, 1.6 ≤ x < 2.1 up to 6.1 ≤ x < 6.6; the values are missing here.]

Then we plot the right end of each interval with the cumulative frequency values as points, along with a 0 at the left end of the first interval, and connect with a curve.
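The interpolation in part (c) can be sketched in a few lines of Python. The function name is hypothetical. The first call uses the 8.5/14 position from the working above; the second uses the (n/2 − cumulative frequency below) position that many textbooks use instead, shown only for comparison.

```python
def interpolate_within_class(lower_bound, class_width, position, class_freq):
    """Linear interpolation inside one class of a grouped frequency table.

    position is how far into the class (counted in data values) the
    statistic sits; class_freq is the number of values in the class.
    """
    return lower_bound + (position / class_freq) * class_width


# Median class 3.6 <= x < 4.1 holds 14 of the 50 values, with 17 below it.
median_as_worked = interpolate_within_class(3.6, 0.5, 8.5, 14)

# Alternative convention (an assumption, not from the solutions): n/2 - 17.
total, below = 50, 17
median_textbook = interpolate_within_class(3.6, 0.5, total / 2 - below, 14)

print(round(median_as_worked, 2), round(median_textbook, 2))
```

Either way the estimate lands at about 3.9, which is as much precision as grouped data can really support.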
f) The five-number summary uses the minimum, maximum, median, and quartiles. The minimum is 1.6 and the maximum is 6.6. For the others, we use fractions of the 50 total frequency. The median will be at 25, the lower quartile at 12.5, and the upper quartile at 37.5. Draw across from the frequency and down to the time. Here's another picture. [4] Judging from those vertical segments, I'd posit the first quartile to be 3.1, the median to be 3.8, and the third quartile to be […].

[4] While in general, graphs are easier using a computer, this was a huge pain to make; I had to draw the curve in another program and import it as a graphic to GeoGebra, because the drawing tools in GeoGebra are just not meant for this. I should have just done it on graph paper and scanned it in.

4. The data are from cities chosen at random. Did you know that 74.3% of all statistics are made up on the spot?

a) This one sounds like purely opinion, but of mean, median, and mode, mean seems clearly to be the worst choice; the data are clearly skewed positively. Either the median or the mode seems fine to me, but because there are two modes, I'd probably go with median. Similarly, standard deviation and range would be affected a lot by the outlier(s) on the right side, so interquartile range seems like a better measure of spread.

b) The mean and standard deviation will come from a calculator. The number of parking spaces will be the values; the number of sites make up the frequencies. I get a mean of 783 spaces, and a standard deviation of 536 spaces (both to 3 s.f.). [5]

[5] BoB's answer for the mean just doesn't make sense. The graph clearly shows that the first bar is centered
around 200, not 100. I stand by my answer, but if you can explain why I would be wrong, please do.

c) Hmmm. Another cumulative frequency graph. I think this time I'll look for another piece of software. In the meantime, here's the cumulative frequency table. [6]

[Table: Parking Spaces, Freq, and CF, in classes of width 100 starting at 150 ≤ x < 250; the values are missing here.]

I went with Excel.

[6] Yes, I considered leaving out the rows with the zeros, but elected not to, as I thought it would raise more questions than it was worth.

d) The quartiles and median come from drawing on the graph. There are 460 sites, so the median should be at 230, and the quartiles at 115 and 345. I added some grid lines to make this easier to read. I get the median ≈ 580, Q1 ≈ 425, and Q3 ≈ 925. And therefore the interquartile range is about 925 − 425 = 500.

e) Outliers are defined as being more than 1.5 times the interquartile range away from the nearer quartile. Q1 − 1.5(IQR) = 425 − 750 = −325, which is clearly not a possibility. Q3 + 1.5(IQR) = 925 + 750 = 1675. This value is barely in the 1650 ≤ x < 1750 bar. So the highest 30 values are outliers, as are some of the 20 in that interval. As 1675 is ¼ of the way through that interval, I'll take 15 of those, for a total of 45 outliers.

f) Well, as I previously said, it's skewed positively and is bimodal (two of the bars are the same height). It has about 45 outliers on the high end, and none on the low. And if I count this one as the third sentence, then I've now written "a few sentences" in the description.

5. a) The Spanish wines include both the most expensive and the least expensive case.
b) Red wines are generally more expensive in France, where the top half of the prices are more expensive than the vast majority of the wines in the other two countries.

c) Spanish wines have the greatest range, but the least expensive median; if you are going to put a red wine on the table every evening, Spanish wines would be a good choice. French wines are the most expensive overall, with 75% of the cases above the median of prices for Italian and Spanish wines. The Italian wines could be described as holding the middle ground.

6. I'm only on question 6. There are 26 of these. Does anyone read this stuff?

a) Using a calculator: µ ≈ 52.6 s, σ ≈ 7.60 s.

b) Still with the calculator: median = […] s (exactly); IQR = Q3 − Q1 = […] = 2.65 s.

c) Because one of the times was about 50 seconds greater than the next closer time, both the mean and the standard deviation were much larger than if that one value were left out. [7]

[7] This sounded familiar to me, so I looked it up. To read about Eric the Eel, try this link: s/1807.asp

7. Interesting question: we have the summary statistics, and we're asked about features of the distribution.

a) If the distribution is symmetric, then the distance from each quartile to the corresponding extreme value should be approximately equal. The maximum minus the third quartile is 86 − 75.75 = 10.25, and the first quartile minus the minimum is 60.25 − […] = […]. As the second is nearly twice the first, I would say that the distribution is not symmetric.

b) Outliers are more than 1.5 times the interquartile range away from the nearer of Q1 and Q3. The IQR is 75.75 − 60.25 = 15.5; 1.5 × 15.5 = 23.25. Subtracting, Q1 − 23.25 = 37; the minimum is not less than 37. Then Q3 + 23.25 = 99; the maximum is not greater than 99. There are no outliers.

c) I'm going to do the box plot on the TI-Nspire. I will just enter the data as the five-number summary. I was initially surprised to discover that I had to enter each
of the quartiles twice to get them to graph correctly, but after thinking about it a bit, this made sense. Try it yourself if you don't see why this was necessary.

d) The data is skewed somewhat to the left. The middle half of the data is fairly compact, but even so the ends are not so spread as for there to be outliers.

8. a) The median will happen at the 50th percentile. Since the y-axis is given as percentages, this is easy. The median is 225 mg/dl.

b) The quartiles are at 25% and 75%. I'll do those in blue and the 90th and 10th in red.

First quartile: 205
Third quartile: 253
10th percentile: […]
90th percentile: 298

No wonder these people are heart patients. Their cholesterol levels are crazy.

c) The IQR would be 253 − 205 = 48. Since this is the middle 50%, and there are 2000 people, it's 1000 patients. [8]

d) For the box plot, it's back to the TI-Nspire. The minimum and maximum values are 100 and 400 mg/dl respectively. (The blue line appears to go all the way back to 100 on the horizontal axis. I had to look closely.)

e) The distribution is skewed somewhat to the right, with a relatively small interquartile range. The maximum and minimum are both outliers.

9. a) I counted by fives. If you use something else, your answers may differ from mine. [9]

[Table: Speed against Frequency, in six classes of width 5 from 25 ≤ x < 30 to 50 ≤ x < 55; only the last frequency, 2, is legible here.]

b) Histogram from the calculator: [figure]

[8] BoB answered some other question?

[9] BoB chose different boundaries than I did, too, so our answers aren't going to be quite the same. And of course he set up the intervals badly. Speed is continuous data; it needs inequalities.
c) I am assuming that "showing all work" means that I am to do this with the frequency distribution. To show all work for 100 individual data points is just absurd. (In fact, showing all work in general on this is not my idea of useful activity. Feel free not to do so yourself.) The mean, at least, is easy. I need the sum of the values divided by the number of values: x̄ ≈ […]/100 = 38.6 km/h. Note that this is really an approximation rather than an exact value (see the ≈?) because the midpoints of the intervals are approximating the true values.

Standard deviation is far more annoying. I am modeling my table on the one on p. […].

[Table: columns for midpoint (m), frequency, f(m), (m − x̄)², and (m − x̄)² times frequency, with column sums; the values are missing here. The standard deviation is the square root of green over blue.]

Ugh. So σ ≈ 5.77 km/h.

d) Cumulative frequency table:

[Table: Speed, Freq, and C.F. for the six classes 25 ≤ x < 30 to 50 ≤ x < 55; the values are missing here.]

e) So the median and quartiles come from a graph. This is some serious overkill.

Median = 38
Q1 = 35
Q3 = 42.3
IQR = 42.3 − 35 = 7.3

f) 1.5 × IQR = 1.5 × 7.3 = 10.95; 35 − 10.95 = 24.05 for the lower boundary for outliers; 42.3 + 10.95 = 53.25 for the upper boundary. So apparently there is one outlier (54) on the high side, and none on the low. My calculator made you a box plot (using the entire data set, because that's how hard I intend to work here).

10. a) I'll be doing these on the calculator, as any sane person would.
mean ≈ […] L
median = […] L
standard deviation ≈ […] L
Q1 = 1714 L
Q3 = 2024 L
IQR = 2024 − 1714 = 310 L [10]

b) Q1 − 1.5 IQR = 1249; Q3 + 1.5 IQR = 2489. The maximum value is less than the upper end boundary, so there are no outliers on the right. However, the minimum value is less than the lower boundary; as it turns out, there are two values that qualify as outliers on the low end, 1123 and […].

c) From the Nspire: [box plot]

d) µ − σ ≈ 1615 L; µ + σ ≈ 2078 L. So consumption levels between 1615 and 2078 L are within one standard deviation of the mean.

e) Germany's 2758 L would be a significant outlier on the high end. It will not significantly affect the median, quartiles, or IQR, but the mean and standard deviation will both be larger.

[10] BoB's answers are a little different from mine. I get the impression (from his median and quartile values) that he may have made a frequency distribution and used that; I can't tell for sure.

11. a) Ninety students had a total of 4460 minutes of travel time, so on average, they spent 4460/90 ≈ 49.6 minutes each.

b) With the new times, the total is now 4594, and there are now 94 students; the new mean is 4594/94 ≈ 48.9 min.

12. a) I'm going to make the table vertical so it fits in my two-column format better. The "1-10" notation that the book uses here is actually not terrible, because the scores are discrete. They do make the mid-interval values more ugly, though, with decimals. I'm going to go with regular intervals, because I know that's best. Notice that the intervals I use do include the discrete values in the original table, with the "equals" on the right side.

[Table: Marks, Freq, and C.F. in ten classes, 0 < x ≤ 10 up to 90 < x ≤ 100; the values are missing here.]

b) The scale that is requested seems too big to put here. What I'm going to do is make it on graph paper at the required size, then scan it and shrink it to fit in this document.
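The outlier boundaries in question 10(b) follow the same 1.5 × IQR rule used throughout these solutions. A small Python sketch of that rule, using the quartiles quoted above, might look like this; the function names are invented for illustration.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, q1, q3, k=1.5):
    """True if value lies strictly outside the boundaries."""
    lower, upper = outlier_fences(q1, q3, k)
    return value < lower or value > upper


# Quartiles from question 10(a), in litres.
lower, upper = outlier_fences(1714, 2024)
print(lower, upper)

# 1123 L is one of the low outliers named in 10(b);
# 2758 L is the value for Germany discussed in 10(e).
print(is_outlier(1123, 1714, 2024), is_outlier(2758, 1714, 2024))
```

The same two functions reproduce the boundaries in questions 4(e), 7(b), and 9(f) when given the quartiles quoted there.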
c) To answer these, I have drawn on the graph from (b), at right. Look there to see where the numbers came from.

i) Median is at the middle student, number 1000. [11] The value is 47 marks.

ii) About 455 students have to do the retake.

iii) Fifteen percent of 2000 is 300; the top 300 students are above 1700. It looks like a mark of 64 or higher gets honors (aka "distinction"). [12]

13. Weighted averages! The 72 men have a mean height of 1.79 m, so their total height is 72 × 1.79 = 128.88 m; for the women, the total is 28 × […] = […] m. So if you stacked them all on top of each other, you'd have 128.88 + […] = […] m of mathematicians. As there are 100 of them, they each average […] m tall.

14. a) The mean is the sum of all the values divided by the number of values. The sum of the values is given as 300, and we can tell from the subscripts that there are 25 values. So m = 300/25 = 12.

b) To get the standard deviation from this information, you need the formula $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$ (which you will find on p. 293). That formula (with lower case n, not N, I think) used to be in the formula booklet. However, in the newest syllabus, its use is no longer tested, so you wouldn't have to look it up or memorize it for the IB exam.

[11] Yeah, I know, it really ought to be between student 1000 and 1001. Can you tell the difference on this scale? Neither can anyone else. Just use 1000.

[12] When you draw a graph like this, there will always be some variation between one person's result and another's. In a markscheme, the reader will use your graph to check your numbers.
Note that this doesn't mean this question is bad, just that someone should provide you with the formula before asking you to do it. Since m is the mean, 625 is the numerator of that fraction. The value of n (or N) is still 25. So $\sigma = \sqrt{625/25} = \sqrt{25} = 5$.

15. a) The "appropriate approximation" they're referring to is the midinterval value. That would be 15, 45, and so on. The mean is 97.2 seconds.

b) Oh, I feel more graph paper coming.

[Table: Wait time, Freq, and C.F. in eight classes of width 30, from 0 ≤ x < 30 to 210 ≤ x < 240; the values are missing here.]

c) I just counted how much more graph paper I'm going to need. Ugh. As it turns out, in recent years the exams have shown a trend away from making you construct graphs like these and toward having you read them. That makes sense; drawing them takes a lot of time, and they can test more material if they don't make you do all of that stuff. But I have told myself I'm going to work all of these problems, so here's another graph.

d) The median is at customer number 50; the quartiles are at 25 and 75. Nice numbers.

Median ≈ 88 s
Lower quartile ≈ 66 s
Upper quartile ≈ 122 s

16. a) i) Looks like 10 plants.

ii) […] = 24 plants.

b) Again, the midinterval values are used for the computations.
µ = 63 cm, σ ≈ 20.5 cm

c) The graph shows that the distribution is skewed to the left; the left tail is longer than the right. When a distribution is not symmetric, the median and mean are different.

d) There are 80 plants (it was given at the top of the question). So for the median, we look at 40 plants. According to the table, that puts it greater than 60 cm and less than 70. In fact, 40 is exactly the mean of 32 and 48, which means the median can be estimated as the mean of 60 and 70 cm, or as 65 cm.

17. a) Once more with the one-variable statistics. σ ≈ 7.41 g. Notice that the units are grams, just like the weights.

b) I made it vertical.

[Table: Weight (W), Freq, and C.F. in seven classes of width 5, from 80 < W ≤ 85 to 110 < W ≤ 115; the values are missing here.]

c) Hey, I don't have to draw the graph this time!

i) The median is at 40 packets; median ≈ 97 g.

ii) The upper quartile is at 60 packets; upper quartile ≈ 101 g.

[13] I know, grams mean mass. This is how the question was written on the exam, too.

d) At first this question may seem impossible to answer. If you don't know any of the individual weights [13], how can you know this? But watch the algebraic magic:

$\sum_{i=1}^{80} (W_i - \bar{W}) = (W_1 - \bar{W}) + (W_2 - \bar{W}) + \dots + (W_{79} - \bar{W}) + (W_{80} - \bar{W}) = (W_1 + W_2 + \dots + W_{80}) - \underbrace{(\bar{W} + \dots + \bar{W})}_{80\text{ terms}} = \sum_{i=1}^{80} W_i - 80\bar{W}$

Notice that the mean is calculated by adding the values of $W_i$ and dividing by 80. Making that substitution, we have
$= \sum_{i=1}^{80} W_i - 80\left(\frac{\sum_{i=1}^{80} W_i}{80}\right) = \sum_{i=1}^{80} W_i - \sum_{i=1}^{80} W_i = 0.$

Brilliant.

e) This part of the question assumes you've already studied probability. We haven't gotten there yet, but it's not that hard. We know that W is between 85 and 110. That's all the packets except the 5 in the lightest group and the 4 in the heaviest, which is 71 packets. Of those, there are 20 over 100 g. So the probability is 20/71 ≈ 0.282.

18. At first, the top and bottom rows of the table seem to be different widths, but the frequencies there are 0, so they don't introduce any complications.

a) Using the calculator: x̄ ≈ 98.2 km/h.

b) i) a = […] = 165; b = […] = 275.

ii) Back to graph paper. See below.

c) i) As seen in red on the graph, 200 cars travel at speeds of 105 km/h or less, so 100 travel at greater than that speed. That's one-third of the total, or 33.3% (to three significant figures).

ii) Fifteen percent of 300 is 45 cars. Forty-five from the top would be at the cumulative frequency of 255. As you can see in purple on the cumulative frequency curve, this corresponds to about 114 km/h.
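The "algebraic magic" in question 17(d) holds for any data set, not just these 80 packets. The short Python sketch below illustrates it on a handful of made-up weights (the exam data are not reproduced here, so these values are purely hypothetical), and then repeats the probability from 17(e) as an exact fraction.

```python
from fractions import Fraction

# Hypothetical packet weights in grams, chosen only for illustration.
weights = [Fraction(w) for w in (82, 91, 97, 97, 103, 110)]

mean_weight = sum(weights) / len(weights)

# Sum of the deviations from the mean; exact arithmetic avoids rounding noise.
total_deviation = sum(w - mean_weight for w in weights)
print(total_deviation)

# 17(e): 20 packets over 100 g among the 71 packets between 85 g and 110 g.
p_over_100 = Fraction(20, 71)
print(p_over_100, round(float(p_over_100), 3))
```

Exact fractions are used so that the sum comes out as exactly 0 rather than a tiny floating-point remainder.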
19. a) Here's the graph with markings.

i) The median, in red, is at 100 cabs, and is $25.

ii) The fare is $35 or less for 156 cabs, as seen in purple.

b) The 40% of the cabs that travel the shortest distances also have the lowest fares. Forty percent of 200 is 80 cabs. As seen in green, this gives fares of $22 or less. Since the cabs charge 55¢ per km, that gives a distance of $22 ÷ $0.55 per km = 40 km.

c) Ninety km at $0.55/km = $49.50. On the graph in blue, $49.50 corresponds to a cumulative frequency of 185 cabs. This means that 15 cabs travel more than this distance. And 15 out of 200 is 7.5%.

20. If the median of the three numbers is 11, then b = 11. A mean of 9 gives (a + 11 + c)/3 = 9, so a + c = 16. A range of 10 means c − a = 10. Adding the two equations gives 2c = 26, so c = 13 and a = 3.
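The reasoning in question 20 is just two linear equations in a and c, so it is easy to sketch in Python. The function name is made up; it assumes three values a ≤ b ≤ c, as in the question.

```python
def three_numbers(median, mean, value_range):
    """Recover a <= b <= c from the median, mean, and range of three values."""
    b = median                       # the middle value is the median
    a_plus_c = 3 * mean - b          # from (a + b + c) / 3 = mean
    c = (a_plus_c + value_range) / 2  # add a + c and c - a, then halve
    a = a_plus_c - c
    return a, b, c


a, b, c = three_numbers(median=11, mean=9, value_range=10)
print(a, b, c)

# Check the three conditions against the recovered values.
print((a + b + c) / 3, c - a, sorted([a, b, c])[1])
```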
21. a) I think this is the last one I have to put on graph paper. Since no houses are sold for less than $0, that will be the lower limit of the first interval.

b) The quartiles happen at 25 and 75 houses, which gives Q1 = $138,000 and Q3 = $245,000; the interquartile range is therefore $245,000 − $138,000 = $107,000. [14]

c) a = […] = 7; b = […] = 6.

d) Once again, to the calculator. In this case, a mean of 199 means $199,000.

e) i) $350,000 is marked on the graph in purple. There appear to be 90 houses selling at less than $350K, so there are 10 houses that can be classified as DeLuxe.

ii) This is once again a probability question a little past where we are in the study of the syllabus, so no worries if you don't know how to do it yet. Of the 10 DeLuxe houses, we know from the table that 6 sold for more than $400,000. So the chance that the first one selected was that expensive is 6/10. For the second one, there are only 9 DeLuxes left, and only 5 selling for over $400K. So now the likelihood is 5/9. Multiplying those gives the answer: (6/10)(5/9) = 1/3.

[14] If you bother to check, you'll probably see that BoB's and my answers differ somewhat, and so do the shapes of our curves. I'm drawing mine by hand; he's clearly not. That makes a difference. His technology is making the sections between the dots straight; in fact, there's no reason to believe that they will be, and curved is probably more realistic, although without the raw data, it's impossible to know if either shape really reflects the data accurately. Don't worry about it. I'm not.
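The probability in question 21(e)(ii) is a product of fractions for selecting without replacement. A minimal Python sketch, with a hypothetical function name, keeps the arithmetic exact:

```python
from fractions import Fraction


def all_from_group(group_size, total, draws=2):
    """Probability that every one of `draws` selections, made without
    replacement from `total` items, comes from a subgroup of `group_size`."""
    probability = Fraction(1)
    for i in range(draws):
        # One fewer item in the group and in the total after each selection.
        probability *= Fraction(group_size - i, total - i)
    return probability


# 6 of the 10 DeLuxe houses sold for more than $400,000; select two.
p_both = all_from_group(group_size=6, total=10)
print(p_both)
```

The loop reproduces the 6/10 and 5/9 factors from the working above, one selection at a time.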
22. a) More drawing on a graph. The median is at 40 shells (notice that even though the y-axis goes up to 90, we are told that there are 80 shells, and the highest point on the curve has a y-value of 80), and the upper quartile is at 60. I would estimate the median as 20 mm and the upper quartile as 24 mm.

b) IQR = 24 − 14 = 10 mm

23. a) To find the values in the table, I'll have to find the cumulative frequencies at 40, 60, and 80, and then subtract the ones that preceded them. For instance, the number in the 40-to-60 range is the number less than 60 minus the number less than 40. The subtractions are […] = […], […] = […], and […] = 38. Those are the numbers in the table. I choose not to reproduce the whole thing.

b) Forty percent [15] of 200 is 80 students. Looks like that's a […].

[15] FORTY PERCENT FAIL! Wow.
24. Yep, drawing on another curve. Some of these problems are starting to feel like overkill.

a) Forty marks matches with 100 candidates who scored that many marks or fewer, shown in red.

b) The middle 50% is exactly the interquartile range. Therefore a must be Q1 and b is Q3. With 800 students, a will occur at a cumulative frequency of 200, and b at 600. Those are in blue. a = 55; b = […].

25. So a, b, c, and d are in order from smallest to largest, with c = d. If the mode is 11, then c = d = 11. If the range is 8, then d − a = 11 − a = 8, and a = 3. Finally, if the mean is 8, then (3 + b + 11 + 11)/4 = 8. That gives b + 25 = 32, and b = 7.

26. And a final graph.

a) The median height will be at the 60th player; it is about 183 cm.

b) The first quartile, at the 30th player, is 175 cm. The third quartile, at the 90th, is 189 cm. Therefore the interquartile range is 189 − 175 = 14 cm.
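The answer to question 25 can be checked against all of its conditions at once. This is only a verification sketch using Python's standard `statistics` module; the variable names are arbitrary.

```python
import statistics

# The values found in question 25: a, b, c, d.
values = [3, 7, 11, 11]

checks = {
    "in order": values == sorted(values),
    "c equals d": values[2] == values[3],
    "mode is 11": statistics.mode(values) == 11,
    "range is 8": max(values) - min(values) == 8,
    "mean is 8": statistics.mean(values) == 8,
}

for name, passed in checks.items():
    print(f"{name}: {passed}")
```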
Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.
Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of
More informationSTAB22 section 1.3 and Chapter 1 exercises
STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea
More informationSolutions for practice questions: Chapter 15, Probability Distributions If you find any errors, please let me know at
Solutions for practice questions: Chapter 15, Probability Distributions If you find any errors, please let me know at mailto:msfrisbie@pfrisbie.com. 1. Let X represent the savings of a resident; X ~ N(3000,
More informationA LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]
1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders
More informationChapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)
Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You re all set to drop
More informationDescribing Data: One Quantitative Variable
STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive
More informationCategorical. A general name for non-numerical data; the data is separated into categories of some kind.
Chapter 5 Categorical A general name for non-numerical data; the data is separated into categories of some kind. Nominal data Categorical data with no implied order. Eg. Eye colours, favourite TV show,
More informationBiostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras
Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 05 Normal Distribution So far we have looked at discrete distributions
More informationSTATISTICAL DISTRIBUTIONS AND THE CALCULATOR
STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either
More informationOverview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution
PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations
More informationDATA HANDLING Five-Number Summary
DATA HANDLING Five-Number Summary The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest
More informationMath 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment
Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class
More informationTi 83/84. Descriptive Statistics for a List of Numbers
Ti 83/84 Descriptive Statistics for a List of Numbers Quiz scores in a (fictitious) class were 10.5, 13.5, 8, 12, 11.3, 9, 9.5, 5, 15, 2.5, 10.5, 7, 11.5, 10, and 10.5. It s hard to get much of a sense
More informationDescriptive Statistics (Devore Chapter One)
Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf
More informationIOP 201-Q (Industrial Psychological Research) Tutorial 5
IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,
More informationData that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.
Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer
More informationPutting Things Together Part 2
Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in
More informationappstats5.notebook September 07, 2016 Chapter 5
Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.
More informationApplications of Data Dispersions
1 Applications of Data Dispersions Key Definitions Standard Deviation: The standard deviation shows how far away each value is from the mean on average. Z-Scores: The distance between the mean and a given
More information2 Exploring Univariate Data
2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting
More informationBasic Procedure for Histograms
Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that
More informationSince his score is positive, he s above average. Since his score is not close to zero, his score is unusual.
Chapter 06: The Standard Deviation as a Ruler and the Normal Model This is the worst chapter title ever! This chapter is about the most important random variable distribution of them all the normal distribution.
More informationA.REPRESENTATION OF DATA
A.REPRESENTATION OF DATA (a) GRAPHS : PART I Q: Why do we need a graph paper? Ans: You need graph paper to draw: (i) Histogram (ii) Cumulative Frequency Curve (iii) Frequency Polygon (iv) Box-and-Whisker
More informationStat 101 Exam 1 - Embers Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.
More informationECO155L19.doc 1 OKAY SO WHAT WE WANT TO DO IS WE WANT TO DISTINGUISH BETWEEN NOMINAL AND REAL GROSS DOMESTIC PRODUCT. WE SORT OF
ECO155L19.doc 1 OKAY SO WHAT WE WANT TO DO IS WE WANT TO DISTINGUISH BETWEEN NOMINAL AND REAL GROSS DOMESTIC PRODUCT. WE SORT OF GOT A LITTLE BIT OF A MATHEMATICAL CALCULATION TO GO THROUGH HERE. THESE
More informationWk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12)
Wk 2 Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics: - Measures of centrality (Mean, median, mode, trimmed mean) - Measures of spread (MAD, Standard deviation, variance) -
More informationStatistics (This summary is for chapters 17, 28, 29 and section G of chapter 19)
Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x
More informationDATA SUMMARIZATION AND VISUALIZATION
APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296
More informationClub Accounts - David Wilson Question 6.
Club Accounts - David Wilson. 2011 Question 6. Anyone familiar with Farm Accounts or Service Firms (notes for both topics are back on the webpage you found this on), will have no trouble with Club Accounts.
More informationNumerical Descriptive Measures. Measures of Center: Mean and Median
Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where
More informationThe Standard Deviation as a Ruler and the Normal Model. Copyright 2009 Pearson Education, Inc.
The Standard Deviation as a Ruler and the Normal Mol Copyright 2009 Pearson Education, Inc. The trick in comparing very different-looking values is to use standard viations as our rulers. The standard
More information1 Describing Distributions with numbers
1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write
3.1 Measures of Central Tendency
3.1 Measures of Central Tendency Summation Notation $\sum x_i$ or $\sum x$: Sum the observations on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable $x_i$ is used to represent
THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management
THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management BA 386T Tom Shively PROBABILITY CONCEPTS AND NORMAL DISTRIBUTIONS The fundamental idea underlying any statistical
Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1
Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 6 Normal Probability Distributions 6-1 Overview 6-2 The Standard Normal Distribution
MA 1125 Lecture 05 - Measures of Spread. Wednesday, September 6, Objectives: Introduce variance, standard deviation, range.
MA 1125 Lecture 05 - Measures of Spread Wednesday, September 6, 2017 Objectives: Introduce variance, standard deviation, range. 1. Measures of Spread In Lecture 04, we looked at several measures of central
Statistics (This summary is for chapters 18, 29 and section H of chapter 19)
Statistics (This summary is for chapters 18, 29 and section H of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x n =
NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS
NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows
2 2 In general, to find the median value of distribution, if there are n terms in the distribution the
THE MEDIAN TEMPERATURES MEDIAN AND CUMULATIVE FREQUENCY The median is the third type of statistical average you will use in this course. You met the other two, the mean and the mode, in pack MS4. THE MEDIAN
2 DESCRIPTIVE STATISTICS
Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled
Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.
Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.
6683/01 Edexcel GCE Statistics S1 Gold Level G2
Paper Reference(s) 6683/01 Edexcel GCE Statistics S1 Gold Level G2 Time: 1 hour 30 minutes Materials required for examination papers Mathematical Formulae (Green) Items included with question Nil Candidates
FINALS REVIEW BELL RINGER. Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/ /2 4
FINALS REVIEW BELL RINGER Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/3 + 7 + 1/2 4 4) 3 + 4 ( 7) + 3 + 4 ( 2) 1) 36/6 4/6 + 3/6 32/6 + 3/6 35/6
The Binomial Distribution
The Binomial Distribution January 31, 2019 Contents The Binomial Distribution The Normal Approximation to the Binomial The Binomial Hypothesis Test Computing Binomial Probabilities in R 30 Problems The
STAT 157 HW1 Solutions
STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill
STAT 113 Variability
STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviation Transformations 2
The Normal Distribution
Stat 6 Introduction to Business Statistics I Spring 009 Professor: Dr. Petrutza Caragea Section A Tuesdays and Thursdays 9:300:50 a.m. Chapter, Section.3 The Normal Distribution Density Curves So far we
Data Analysis. BCF106 Fundamentals of Cost Analysis
Data Analysis BCF106 Fundamentals of Cost Analysis June 2009 Chapter 5 Data Analysis 5.0 Introduction... 3 5.1 Terminology... 3 5.2 Measures of Central Tendency... 5 5.3 Measures of Dispersion... 7 5.4 Frequency
Chapter 5 The Standard Deviation as a Ruler and the Normal Model
Chapter 5 The Standard Deviation as a Ruler and the Normal Model 55 Chapter 5 The Standard Deviation as a Ruler and the Normal Model 1. Stats test. Nicole scored 65 points on the test. That is one standard
Expected Value of a Random Variable
Knowledge Article: Probability and Statistics Expected Value of a Random Variable Expected Value of a Discrete Random Variable You're familiar with a simple mean, or average, of a set. The mean value of
CH 5 Normal Probability Distributions Properties of the Normal Distribution
Properties of the Normal Distribution Example A friend that is always late. Let X represent the amount of minutes that pass from the moment you are supposed to meet your friend until the moment your friend
2CORE. Summarising numerical data: the median, range, IQR and box plots
C H A P T E R 2CORE Summarising numerical data: the median, range, IQR and box plots How can we describe a distribution with just one or two statistics? What is the median, how is it calculated and what
CHAPTER 2 Describing Data: Numerical
CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of
Some estimates of the height of the podium
Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40
The Binomial Distribution
The Binomial Distribution January 31, 2018 Contents The Binomial Distribution The Normal Approximation to the Binomial The Binomial Hypothesis Test Computing Binomial Probabilities in R 30 Problems The
Summarising Data. Summarising Data. Examples of Types of Data. Types of Data
Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017
Numerical Descriptions of Data
Numerical Descriptions of Data Measures of Center Mean $\bar{x} = \frac{\sum x_i}{n}$ Excel: =AVERAGE( ) Weighted mean $\bar{x} = \frac{\sum (x_i w_i)}{\sum w_i}$ x = data values, $x_i$ = i-th data value, $w_i$ = weight of the i-th data value Median =
Measures of Center. Mean. 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) Measure of Center. Notation. Mean
Measure of Center Measures of Center The value at the center or middle of a data set 1. Mean 2. Median 3. Mode 4. Midrange (rarely used) 1 2 Mean Notation The measure of center obtained by adding the values
CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) =
Solutions to End-of-Section and Chapter Review Problems 225 CHAPTER 6 6.1 (a) P(Z < 1.20) = 0.88493 P(Z > 1.25) = 1 - 0.89435 = 0.10565 P(1.25 < Z < 1.70) = 0.95543 - 0.89435 = 0.06108 (d) P(Z < 1.25) or Z
Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean)
Statistics 16_est_parameters.pdf Michael Hallstone, Ph.D. hallston@hawaii.edu Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean) Some Common Sense Assumptions for Interval Estimates
The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012
The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You're
Frequency Distribution and Summary Statistics
Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai'i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary
Measures of Dispersion (Range, standard deviation, standard error) Introduction
Measures of Dispersion (Range, standard deviation, standard error) Introduction We have already learnt that frequency distribution table gives a rough idea of the distribution of the variables in a sample
Example: Histogram for US household incomes from 2015 Table:
1 Example: Histogram for US household incomes from 2015 Table: Income level Relative frequency $0 - $14,999 11.6% $15,000 - $24,999 10.5% $25,000 - $34,999 10% $35,000 - $49,999 12.7% $50,000 - $74,999
Chapter 4. The Normal Distribution
Chapter 4 The Normal Distribution 1 Chapter 4 Overview Introduction 4-1 Normal Distributions 4-2 Applications of the Normal Distribution 4-3 The Central Limit Theorem 4-4 The Normal Approximation to the
MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment
MBEJ 1023 Planning Analytical Methods Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment Contents What is statistics? Population and Sample Descriptive Statistics Inferential
Multiple regression - a brief introduction
Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict
Math 227 Elementary Statistics. Bluman 5 th edition
Math 227 Elementary Statistics Bluman 5 th edition CHAPTER 6 The Normal Distribution 2 Objectives Identify distributions as symmetrical or skewed. Identify the properties of the normal distribution. Find
4.1 Probability Distributions
Probability and Statistics Mrs. Leahy Chapter 4: Discrete Probability Distribution ALWAYS KEEP IN MIND: The Probability of an event is ALWAYS between: and!!!! 4.1 Probability Distributions Random Variables
Edexcel past paper questions
Edexcel past paper questions Statistics 1 Chapters 2-4 (Discrete) Statistics 1 Chapters 2-4 (Discrete) Page 1 Stem and leaf diagram Stem-and-leaf diagrams are used to represent data in its original form.
ECON 214 Elements of Statistics for Economists 2016/2017
ECON 214 Elements of Statistics for Economists 2016/2017 Topic The Normal Distribution Lecturer: Dr. Bernardin Senadza, Dept. of Economics bsenadza@ug.edu.gh College of Education School of Continuing and
Symmetric Game. In animal behaviour a typical realization involves two parents balancing their individual investment in the common
Symmetric Game Consider the following 2-person game. Each player has a strategy which is a number x ($0 \le x \le 1$), thought of as the player's contribution to the common good. The net payoff to a player playing
IB Interview Guide: Case Study Exercises Three-Statement Modeling Case (30 Minutes)
IB Interview Guide: Case Study Exercises Three-Statement Modeling Case (30 Minutes) Hello, and welcome to our first sample case study. This is a three-statement modeling case study and we're using this
Lecture 1: Review and Exploratory Data Analysis (EDA)
Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I'll announce this tomorrow
Lecture 2 Describing Data
Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms
How Do You Calculate Cash Flow in Real Life for a Real Company?
How Do You Calculate Cash Flow in Real Life for a Real Company? Hello and welcome to our second lesson in our free tutorial series on how to calculate free cash flow and create a DCF analysis for Jazz
How Wealthy Are Europeans?
How Wealthy Are Europeans? Grades: 7, 8, 11, 12 (course specific) Description: Organization of data to examine measures of spread and measures of central tendency in examination of Gross Domestic Product
Edexcel past paper questions
Edexcel past paper questions Statistics 1 Chapters 2-4 (Continuous) S1 Chapters 2-4 Page 1 S1 Chapters 2-4 Page 2 S1 Chapters 2-4 Page 3 S1 Chapters 2-4 Page 4 Histograms When you are asked to draw a histogram
We use probability distributions to represent the distribution of a discrete random variable.
Now we focus on discrete random variables. We will look at these in general, including calculating the mean and standard deviation. Then we will look more in depth at binomial random variables which are
5.1 Mean, Median, & Mode
5.1 Mean, Median, & Mode definitions Mean: Median: Mode: Example 1 The Blue Jays score these amounts of runs in their last 9 games: 4, 7, 2, 4, 10, 5, 6, 7, 7 Find the mean, median, and mode: Example 2
Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.
Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,
ECON 214 Elements of Statistics for Economists
ECON 214 Elements of Statistics for Economists Session 7 The Normal Distribution Part 1 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh College of Education
Lecture Week 4 Inspecting Data: Distributions
Lecture Week 4 Inspecting Data: Distributions Introduction to Research Methods & Statistics 2013 2014 Hemmo Smit So next week No lecture & workgroups But Practice Test on-line (BB) Enter data for your
Misleading Graphs. Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret
Misleading Graphs Examples Compare unlike quantities Truncate the y-axis Improper scaling Chart Junk Impossible to interpret 1 Pretty Bleak Picture Reported AIDS cases 2 But Wait..! 3 Turk Incorporated
Chapter 5 Normal Probability Distributions
Chapter 5 Normal Probability Distributions Section 5-1 Introduction to Normal Distributions and the Standard Normal Distribution A The normal distribution is the most important of the continuous probability
1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:
1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11
Chapter 5: Summarizing Data: Measures of Variation
Chapter 5: Introduction One aspect of most sets of data is that the values are not all alike; indeed, the extent to which they are unalike, or vary among themselves, is of basic importance in statistics.
Chapter 6. The Normal Probability Distributions
Chapter 6 The Normal Probability Distributions 1 Chapter 6 Overview Introduction 6-1 Normal Probability Distributions 6-2 The Standard Normal Distribution 6-3 Applications of the Normal Distribution 6-5
Today's plan: Section 4.1.4: Dispersion: Five-Number summary and Standard Deviation.
1 Today's plan: Section 4.1.4: Dispersion: Five-Number summary and Standard Deviation. 2 Once we know the central location of a data set, we want to know how close things are to the center. 2 Once we know
Data Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous
5.1 Personal Probability
5. Probability Value Page 1 5.1 Personal Probability Although we think probability is something that is confined to math class, in the form of personal probability it is something we use to make decisions
Elementary Statistics
Chapter 7 Estimation Goal: To become familiar with how to use Excel 2010 for Estimation of Means. There is one Stat Tool in Excel that is used with estimation of means, T.INV.2T. Open Excel and click on
Link full download:
- Descriptive Statistics: Tabular and Graphical Method Chapter 02 Essentials of Business Statistics 5th Edition by Bruce L Bowerman Professor, Richard T O'Connell Professor, Emily S. Murphree and J. Burdeane
6.2.1 Linear Transformations
6.2.1 Linear Transformations In Chapter 2, we studied the effects of transformations on the shape, center, and spread of a distribution of data. Recall what we discovered: 1. Adding (or subtracting) a
Some Characteristics of Data
Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key
AP STATISTICS FALL SEMESTER FINAL EXAM STUDY GUIDE
AP STATISTICS Name: FALL SEMESTER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,
NOTES: Chapter 4 Describing Data
NOTES: Chapter 4 Describing Data Intro to Statistics COLYER Spring 2017 Student Name: Page 2 Section 4.1 ~ What is Average? Objective: In this section you will understand the difference between the three
Data Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Checking for
The probability of having a very tall person in our sample. We look to see how this random variable is distributed.
Distributions We're doing things a bit differently than in the text (it's very similar to BIOL 214/312 if you've had either of those courses). 1. What are distributions? When we look at a random variable,
9/17/2015. Basic Statistics for the Healthcare Professional. Relax... it won't be that bad! Purpose of Statistic. Objectives
Basic Statistics for the Healthcare Professional 1 Frank Cohen, MBB, MPA, Director of Analytics, Doctors Management, LLC Purpose of Statistic 2 Provide a numerical