Final/Exam #3 Form B - Statistics 211 (Fall 1999)

This test consists of nine numbered pages. Make sure you have all 9 pages. It is your responsibility to inform me if a page is missing!!! You have at least 100 minutes to complete this test.

Make sure you fill out and bubble in the following items on the scantron sheet: Last Name, First Name, MI. Dept (STAT), Course No (211), Section (504). Social Security Number (enter your Student ID). Test Form (A or B). Enter the exam number, sign and date the form. You will lose credit if you fail to fill out your scantron form correctly.

You may use crib sheets 1, 2 and 3, all appropriate tables, and a calculator. If I provide partial results, assume they are correct and use them even if they are not. If there is no correct answer or if multiple answers are correct, select the best answer. There is no penalty for wrong answers... so guess if you do not know an answer. It is your responsibility to look at the overhead or blackboard about every 15 minutes and to incorporate any relevant information into your test. You can keep this exam!

Fill out your scantron sheet properly. See the header page of this exam for instructions.

Chapter 8

[1] A type I error is:
a) Reject H0 when H0 is false.
b) Reject H0 when H0 is true.
c) Fail to reject H0 when H0 is false.
d) Fail to reject H0 when H0 is true.
e) not relevant to a test.

[2] A statistical hypothesis is a claim about the value of one or more...
a) statistics.
b) parameter estimates.
c) tests.
d) confidence intervals.
e) population parameters.

[3] For a test, α = .05 and β = .10. What is the probability we reject H0?
a) .05   b) .1   c) .9   d) .95   e) We cannot know this without the data!

[4] The desired percentage of SiO2 in a certain type of aluminous cement is 5.5. We want to test whether the true average percentage is 5.5 for a particular production facility. Suppose the percentage of SiO2 in a sample is normally distributed with σ = .3. What sample size n is required to satisfy α = .01 and β(5.6) = .01?
a) n = 15   b) n = 20   c) n = 194   d) n = 217   e) n = 721

[5] The one sample test for µ when data is Normal and σ is known is:
Z = (x̄ − µ0) / (s/√n)
When our sample size is large enough (say n ≥ 30) we say it is no longer necessary that the data is Normal. Why?
a) When we have large samples, x̄ is a better estimate of µ!
b) As the sample size increases, the variance of x̄ decreases!
c) When we have a large sample size, s is an accurate estimate of σ!
d) When n ≥ 30 the CLT no longer holds, allowing us to ignore the Normality assumption.
e) The CLT guarantees that x̄ is Normal, even if the data is not.

[6] Suppose a manufacturer claims their VHS tapes can hold 120 minutes of programming at SP mode. You believe they are shorter, so you test:
H0: µ = 120
Ha: µ < 120
In this example a type II error would be:
a) You claim the tapes are shorter than 120 minutes when in reality they are not.

b) You claim the tapes are longer than 120 minutes when in reality they are not.
c) We do not conclude the tapes are shorter than 120 minutes when in reality they are.
d) We do not conclude the tapes are longer than 120 minutes when in reality they are.
e) α.

[7] For the above example, we measure the length of 10 video tapes to find that the average length is 119 minutes. Why do we need to test if µ < 120 if we know that the average length of the 10 tapes is 119? Clearly 119 < 120... right?
a) It is not necessary to test since the average of 119 is less than 120!
b) It is not necessary to test because a test deals with statistics not parameters.
c) It is not necessary to test because a test deals with parameters not statistics.
d) Knowing x̄ = 119 does not guarantee that µ < 120.
e) Knowing µ = 119 does not guarantee that µ < 120.

[8] Many consumers are turning to generics as a way of reducing the cost of prescription medications. A paper gives the results of a survey of 102 doctors. Only 47 of those surveyed knew the generic name for the drug methadone. Does this provide strong evidence for concluding that fewer than half the physicians know the generic name of methadone? The appropriate hypothesis test is:
a) H0: p = .5 vs. Ha: p < .5
b) H0: p = .5 vs. Ha: p > .5
c) H0: µ = .5 vs. Ha: µ < .5
d) H0: µ = .5 vs. Ha: µ > .5
e) H0: µ = .5 vs. Ha: µ ≠ .5

[9] Statistical significance
a) does imply practical significance.
b) is the same as practical significance.
c) does not imply practical significance.
d) happens when α < p-value.
e) only matters if the variance is very small.

[10] Our hypothesis test results in a p-value of .01234. In this case we would:
a) Reject H0 at α = .05, but fail to reject H0 at α = .01.
b) Reject H0 at α = .01, but fail to reject H0 at α = .05.
c) Reject H0 at α = .05, and reject H0 at α = .01.
d) Fail to reject H0 at α = .05, and fail to reject H0 at α = .01.
e) Can't say; it depends on the actual test statistics.

[11] The city council in a large city has become concerned about the trend toward exclusion of renters with children in apartments within the city. The housing coordinator has decided to select a random sample of 125 apartments and determine for each whether children would be permitted. Let p = the true proportion of apartments that prohibit children. If, at α = .05, we conclude p > .75, the council will consider appropriate legislation. What is the probability of a type II error when p = .8?
a) .7642   b) .4129   c) .6480   d) .9515   e) .5596
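
The type II error probability asked for in question [11] can be checked numerically with the large-sample normal approximation for an upper-tailed test of a proportion. The following is a minimal sketch, not part of the exam: the function name is hypothetical, and it uses exact normal quantiles from Python's `statistics.NormalDist`, so the result can differ in the third decimal from a value read off a Z table.

```python
from math import sqrt
from statistics import NormalDist


def type2_error_upper_tail(p0, p_alt, n, alpha):
    """Approximate beta(p_alt) for H0: p = p0 vs. Ha: p > p0 (large-sample Z test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    # H0 is rejected when the sample proportion exceeds this cutoff.
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # When p = p_alt, the sample proportion is roughly Normal with this standard error;
    # beta is the chance that it still falls below the cutoff.
    se_alt = sqrt(p_alt * (1 - p_alt) / n)
    return z.cdf((cutoff - p_alt) / se_alt)


beta = type2_error_upper_tail(p0=0.75, p_alt=0.80, n=125, alpha=0.05)
print(round(beta, 3))
```

With the values from question [11] this comes out near .65, in line with the tabled option .6480.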

[12] In the above city council problem a formal test of H0: p = .75 vs. Ha: p > .75 results in a Z statistic of 1.70. The p-value of this test is:
a) .9554   b) .0446   c) .0892   d) .0500   e) .5000

Chapter 9

[13] Researchers concluded in 1979 that the ferritin concentration in the elderly has a smaller variance than in younger adults. For a sample of 28 elderly men, the sample standard deviation of serum ferritin (mg/l) was s1 = 52.6. For 26 young men, the sample standard deviation was s2 = 84.2. Does this data support at α = .01 the conclusion of the researchers in 1979 as applied to men?
a) We fail to reject H0 and we cannot conclude variability of ferritin to be greater in young men than in elderly men.
b) We reject H0 and we cannot conclude variability of ferritin to be greater in young men than in elderly men.
c) We fail to reject H0 and we conclude variability of ferritin to be greater in young men than in elderly men.
d) We reject H0 and conclude variability of ferritin to be greater in young men than in elderly men.
e) f = .394; therefore neither the null nor the alternative hypothesis applies.

[14] Consider the accompanying data on breaking load (kg/25 mm width) for various fabrics in both an unabraded condition and an abraded condition. Eight different fabric types were tested. We are interested in knowing whether the breaking load of fabrics in unabraded condition is larger than the breaking load of fabrics in abraded condition. That is, we are interested in:
H0: µd = 0 vs. Ha: µd > 0 where µd = µ1 − µ2.

Fabric           1     2     3     4     5     6     7     8
1) Unabraded   36.4  55.0  51.5  38.7  43.2  48.8  25.6  49.8
2) Abraded     28.5  20.0  46.0  34.5  36.5  52.5  26.5  46.5

Summary statistics: d̄ = 7.25, sd = 11.8628
At α = .01 what is our conclusion?
a) We reject H0 and conclude the data does not indicate differences in breaking load for the two fabric conditions.
b) We reject H0 and conclude the data does indicate differences in breaking load for the two fabric conditions.
c) We fail to reject H0 and conclude the data does not indicate differences in breaking load for the two fabric conditions.
d) We fail to reject H0 and conclude the data does indicate differences in breaking load for the two fabric conditions.
e) Irrelevant; the data is not practically significant.

[15] Tennis elbow is thought to be aggravated by the impact experienced when hitting the ball. A paper reports the force on the hand just after impact on a one-handed backhand drive for six advanced players and for eight intermediate players. Summary data appears below. We are interested in whether force after impact is greater for advanced players than it is for intermediate players.

1) Advanced       n = 6   x̄ = 40.3   s = 11.3
2) Intermediate   n = 8   x̄ = 21.4   s = 8.3

The appropriate hypothesis for the above test is:
a) H0: µ1 = µ2 vs. Ha: µ1 − µ2 < 0
b) H0: µ1 = µ2 vs. Ha: µ1 − µ2 > 0
c) H0: µ1 = µ2 vs. Ha: µ1 − µ2 ≠ 0
d) H0: µd = 0 vs. Ha: µd < 0 where µd = µ1 − µ2
e) H0: µd = 0 vs. Ha: µd > 0 where µd = µ1 − µ2

[16] The appropriate test for the above tennis elbow problem is:
a) The pooled t-test.
b) The Smith-Satterthwaite test.
c) The paired t-test.
d) Test for two population proportions.
e) Test for two population variances.

[17] At the beginning of the semester we talked about the Salk Polio vaccine. In a large 1954 experiment 401,974 children were studied. 201,229 were given the vaccine and 200,745 were given a placebo. 110 of the children who received a placebo got polio and 33 of those given the vaccine got polio. Was the vaccine effective?
H0: p1 − p2 = 0
Ha: p1 − p2 < 0

Vaccine:   m = 201229   x = 33    p̂1 = .000164
Placebo:   n = 200745   y = 110   p̂2 = .000548

The test statistic is Z = −6.46. What is the p-value of this test? What is the conclusion?
a) less than .0001; the vaccine is not effective!
b) less than .0001; the vaccine is effective!
c) greater than .9999; the vaccine is not effective.
d) p-value cannot be calculated, but the vaccine is effective.
e) p-value = .0228; the vaccine is effective!

Chapter 10/12

[18] A builder is interested in whether location has an effect on the selling price of a new three-bedroom home in the 2000 square foot range. To keep matters simple, let us suppose there are only three areas in which new homes are being built and thus only three treatment levels. The builder samples the price of 3 houses for each area (treatment level) and conducts the test:
H0: µ1 = µ2 = µ3
Ha: not H0
Here is the partial ANOVA. What is the f statistic for this test?
Source      DF   Sum of Squares   Mean Square   f
Treatment        235.67
Error
Total            469.67

a) f = 1.68   b) f = 0.33   c) f = 4.03   d) f = 3.02   e) f = 234.00

[19] In the above builder example the average house prices for the three different areas are:
x̄1 = 90, x̄2 = 100, x̄3 = 105
We conduct a Tukey multiple comparison test where w = 7. Which conclusion below is the correct one?
a) Areas 1 & 2 are the same, but they are significantly different from area 3.
b) Areas 1 & 3 are the same, but they are significantly different from area 2.
c) Areas 2 & 3 are the same, but they are significantly different from area 1.
d) Areas 1 & 2 & 3 are all significantly different from each other.
e) None of the areas 1 & 2 & 3 are significantly different from each other.

Chapter 3-6 (selected sections)

[20] Two desirable properties of a point estimate θ̂ are:
a) Small bias and large variance.
b) Large bias and large variance.
c) Small bias and small variance.
d) Large bias and small variance.
e) Only the bias matters! Not the variance.

[21] If GPA at Texas A&M is distributed X ~ N(µ = 2, σ = 0.8) then what GPA must you have to be in the 75th percentile (select the closest answer)?
a) 0.675   b) 1.28   c) 2.54   d) 3.00   e) 3.75

[22] The average number of hurricanes that hit the west coast is 0.5 per year. What is the probability more than 3 years pass without a hurricane?
a) P(X > 3) where X ~ Poisson(λ = 0.5).
b) P(X > 3) where X ~ Gamma(α = 1.5, β = 0.5).
c) P(X > 3) where X ~ Gamma(α = 1.0, β = 1.0).
d) P(X > 3) where X ~ Exponential(λ = 0.5).
e) P(X > 3) where X ~ N(µ = 0.5, σ = 1).

[23] If two variables X and Y are positively correlated this might mean:
a) X causes Y.
b) X and Y cause each other.
c) X and Y could be caused by a third lurking variable.
d) X and Y are related by chance.
e) All of the above.

[24] The Central Limit Theorem is important because it:

a) tells us the distribution of the data even if the distribution of the average is not known.
b) tells us the distribution of the data even if the mean and variance of the data are not known.
c) tells us the distribution of the average even if the distribution of the data is not known.
d) allows us to solve probability questions about X (an individual data point).
e) says that X has a binomial distribution if n is large.

[25] The average number of rabbits per acre in a 7 acre forest is estimated to be 4. Find the probability that 2 or fewer rabbits are found on any given acre.
a) P(X ≤ 2) where X ~ Negative Binomial(r = 4, p = 2/7).
b) P(X ≤ 1) where X ~ Binomial(n = 4, p = 2/7).
c) .238
d) .146
e) .092

[26] When a baseball player hits .300, he gets a hit 30% of the times at bat. Typical major leaguers bat about 500 times a season and hit about .260. Assuming a hitter's successive tries are independent, what is the probability that a typical .260 player bats .200 or better?
a) P(X ≥ .20) when X ~ Binomial(n = 500, p = .26).
b) P(X ≥ 100) when X ~ Binomial(n = 500, p = .26).
c) P(X ≥ .20) when X ~ Binomial(n = 500, p = .20).
d) P(X ≥ 100) when X ~ Binomial(n = 500, p = .20).
e) P(X ≥ 200) when X ~ Negative Binomial(n = 500, p = .30).

[27] The probability a student passes a class is 85%. The average class size is 90 students and the average GPA is 2.0. Assuming successive attempts are independent, and assuming 20% of the students hate the class, we want to know how many times a student needs to take the class before she passes. The relevant distribution to answer this question is?
a) X ~ Negative Binomial(r = 2, p = .85).
b) X ~ Negative Binomial(r = 2, p = .20).
c) X ~ Geometric(p = .85).
d) X ~ Geometric(p = .20).
e) X ~ Binomial(n = 90, p = .85).

Chapter 7

[28] You desire to construct a confidence interval of your estimate of average price of a six pack of Diet Coke in Bryan/College Station. How many store prices must you sample so that your 90% confidence interval is equal to or smaller than ±10 cents? You estimate that σ is 30 cents.
a) 24   b) 25   c) 35   d) 97   e) 98

[29] Which of the following will not affect the width of the confidence interval?
a) Alpha.
b) Standard Deviation.
c) Sample size.
d) All of the above will affect the width.
e) None of the above will affect the width.
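
The sample-size reasoning behind question [28] can be sketched in a few lines: the half-width of a z interval is z·σ/√n, so n is the smallest integer with (z·σ/E)² ≤ n. This is an illustrative sketch only; the function name is hypothetical and the normal quantile comes from Python's `statistics.NormalDist` rather than a printed table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_margin(sigma, margin, confidence):
    """Smallest n for which the z interval half-width z * sigma / sqrt(n) is <= margin."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


# Values from question [28]: sigma = 30 cents, margin = 10 cents, 90% confidence.
n = sample_size_for_margin(sigma=30, margin=10, confidence=0.90)
print(n)
```

The same formula shows why alpha, the standard deviation, and the sample size all enter the interval width in question [29].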

[30] Which of the following properties is not part of the t distribution:
a) Each t_v curve is bell-shaped and centered on 0.
b) Each t_v curve is more spread out than the standard normal (Z) curve.
c) As v increases, the spread of the corresponding t_v curve decreases.
d) As v → ∞, the sequence of t_v curves approaches the standard normal curve (i.e., t_∞ = Z).
e) All of the above hold for the t distribution.

[31] You need to know sigma to calculate the sample size required to construct the confidence interval of a given width. You do not know sigma, but you do know that the approximate minimum and maximum observations are 2 and 10 and that the data comes from an approximately Normal distribution. Your rough estimate of σ is:
a) 1   b) 2   c) 3   d) 4   e) 8

[32] The 1983 Tylenol poisoning episode and other similar incidents have focused attention on the desirability of packaging various commodities in a tamper-resistant manner. A 1983 article reports the results of a survey of consumer attitudes toward such packaging. Of the 270 consumers surveyed, 189 indicated that they would be willing to pay extra for the tamper-resistant packaging. Construct a 95% confidence interval on the proportion of consumers willing to pay extra for the tamper-resistant packaging.
a) .7 ± .055   b) .7 ± .065   c) .7 ± .075   d) .7 ± .085   e) .7 ± .095

[33] The number of registered voters that would need to be sampled for a poll that desires an accuracy (margin of error) of ±1% is:
a) 193   b) 1068   c) 4900   d) 9604   e) 38416

[34] Consider the following sample of fat content (in percentage) of n = 10 randomly selected hot dogs:
25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5
x̄ = 21.90, s = 4.134
Assuming Normality, construct a 95% prediction interval for the fat content of a single hot dog.
a) 21.90 ± 2.56   b) 21.90 ± 2.96   c) 21.90 ± 4.82   d) 21.90 ± 8.50   e) 21.90 ± 9.81
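
The prediction interval in question [34] has the form x̄ ± t·s·√(1 + 1/n). The sketch below recomputes it from the listed data; it is an illustration under stated assumptions, not part of the exam. The function name is hypothetical, and the t critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from a t table because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

# Fat content (%) of the n = 10 hot dogs listed in question [34].
fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]

# Assumption: tabled t critical value for 95% two-sided coverage with 9 df.
T_CRIT_95_DF9 = 2.262


def prediction_interval(data, t_crit):
    """Return (center, margin) of a prediction interval for one new observation."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    # The extra 1 under the root accounts for the variability of the new observation.
    margin = t_crit * s * sqrt(1 + 1 / n)
    return x_bar, margin


x_bar, margin = prediction_interval(fat, T_CRIT_95_DF9)
print(round(x_bar, 2), round(margin, 2))
```

Dropping the 1 under the square root gives the narrower confidence interval for µ, which is the distinction question [35] turns on.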

[35] Consider the hot dog example above. If you ask, "I wonder what the fat content of this hot dog I am eating is?", the interval you need to construct is:
a) a confidence interval on µ.
b) a confidence interval on p.
c) a prediction interval on µ.
d) a prediction interval on p.
e) a confidence interval on σ.

[36] In the hot dog example above, we want to construct a 90% confidence interval on σ. The appropriate distribution critical value(s) (e.g., Z, t, χ², F, etc.) are:
a) 4.168, 14.648   b) 3.325, 16.919   c) 1.833   d) 1.645   e) 1.812

[37] Given a Normal distribution N(µ, σ²), if µ̂ = 100 can we reasonably rule out µ = 100.1?
a) Yes.
b) No.
c) Probably yes.
d) Probably no.
e) Can't say; depends on the variance σ².

[38] In this course, σ most often referred to:
a) population standard deviation.
b) population variance.
c) sample variance.
d) sample standard deviation.
e) sample correlation.

[39] What is the confidence level for the following confidence interval for µ from a Normal distribution:
x̄ ± 2.576 σ/√n
a) 1%   b) 0.5%   c) 99%   d) 99.5%   e) 0.005

[40] A 95% confidence level for a given interval (for Normal data) means that
a) there is a 95% chance this interval contains µ.
b) there is an approximate 95% chance this interval contains µ.
c) on the average, 95 out of 100 times we cannot know if the interval contains µ.
d) on the average, 95% of all confidence intervals we construct will contain the true value of µ.
e) the interval is a random interval with unknown fixed endpoints.
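
For question [39], the confidence level implied by a two-sided z critical value c is P(−c < Z < c) = 2Φ(c) − 1. A minimal sketch of that conversion, assuming Python's `statistics.NormalDist` for Φ; the function name is hypothetical.

```python
from statistics import NormalDist


def confidence_level(z_crit):
    """Two-sided coverage P(-z_crit < Z < z_crit) for a standard normal Z."""
    return 2 * NormalDist().cdf(z_crit) - 1


level = confidence_level(2.576)
print(round(level, 2))
```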

Solutions to Exam #3B Fall 1999

1.) b, e, c, d, e
6.) c, d, a, c, a
11.) c, b, d, c, b
16.) a, b, d, c, c
21.) c, d, e, c, c
26.) b, c, b, d, e
31.) b, a, d, e, c
36.) b, e, a, c, d
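
The keyed answer to question [18] can be reproduced by completing the partial ANOVA table. The sketch below is illustrative only and rests on one assumption: that 235.67 is the treatment sum of squares and 469.67 the total sum of squares, which is the reading consistent with the answer key. The function name is hypothetical.

```python
def complete_one_way_anova(ss_treatment, ss_total, groups, n_per_group):
    """Fill in a balanced one-way ANOVA table from the treatment and total sums of squares."""
    df_treatment = groups - 1
    df_total = groups * n_per_group - 1
    df_error = df_total - df_treatment
    ss_error = ss_total - ss_treatment
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return {
        "df_treatment": df_treatment,
        "df_error": df_error,
        "df_total": df_total,
        "ss_error": ss_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "f": ms_treatment / ms_error,
    }


# Assumption: 235.67 = SS(Treatment), 469.67 = SS(Total); 3 areas, 3 houses per area.
table = complete_one_way_anova(235.67, 469.67, groups=3, n_per_group=3)
print(table["df_treatment"], table["df_error"], round(table["f"], 2))
```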