© 2017, Jeffrey S. Simonoff

Coverage of test 1

1. Sampling issues: definition of population and sample, purposes of sampling, types of bias in sampling
2. Data summary and presentation: frequency distribution, histogram, stem and leaf display, boxplot; sample mean, median, mode, midrange, range, interquartile range, midhinge, median absolute deviation, variance, standard deviation; appropriateness of statistics for different types of data
3. Probability: types of probability, simple probability calculations, basic properties of probabilities, conditional probability, independence
4. Random variables: definition, probability distribution, mean (expected value), variance, standard deviation
5. Specific random variables
   (a) Binomial: distribution, mean, variance
   (b) Normal: calculating probabilities
   (c) Normal approximation to Binomial
6. Central Limit Theorem

Note: You might find it useful to study for the test by taking one of the tests from previous years under actual test conditions.

Note: You should not view the list above as exhaustive. Everything that we have discussed in class, or has been in a handout, is fair game to appear on the test.

Practice problems

1. One of the most serious problems facing research libraries is the preservation of the materials that comprise their collections. This is a particularly critical problem in large libraries, where the age and size of the collections make evaluation and corrective action difficult. When faced with a book to be examined, a preservation librarian has four preservation strategies to consider: no repair, restoration, microfilming, and full repair (the latter three being different kinds of repair strategies). These strategy categories are ordered from least to most difficult technically. The following table gives the cost in dollars associated with each strategy per book:

   Strategy        Cost per book
   No repair             0
   Restoration         100
   Microfilming        200
   Full repair         400

The preservation librarian at a large university library system must decide which library building will undergo a preservation program in the upcoming academic year. Preliminary analysis has yielded the following probabilities of the need for each type of strategy for a randomly selected book in each library:

   Strategy        Main stacks    Business school library
   No repair           .6                  .7
   Restoration         .2                   0
   Microfilming        .1                  .1
   Full repair         .1                  .2

(a) Her budget advisor suggests choosing the library building with smaller mean preservation cost per book examined. Based on this recommendation, which library should she choose?

(b) The assistant preservation librarian suggests choosing the library with smaller probability of having to do any kind of repair. Based on this recommendation, which library should she choose?

(c) It occurs to her that she needn't focus on only one library; rather, she can choose a mixture of books from each library. She has a budget of $95 per book examined to spend on preservation. Let \( p \) be the proportion of the total books examined that come from the main stacks (so \( 1 - p \) is the proportion that come from the Business school library). What should \( p \) be so that the expected cost per book meets her spending budget exactly?

2. The story "Tough Times in Japan," which appeared in the February 9, 2003 issue of Parade magazine, contained various statistics purporting to illustrate tough economic times in that country. The article included the following sentence: "A recent survey by Japan's Ministry of Health, Labor and Welfare indicates that as many as 60% of all households now have annual incomes below the national average of $52,339." Explain why it would not be surprising to find that more than half of all households have annual incomes below the national average, even if a country was not having tough economic times.

3. In December 1993, the U.S. Census Bureau issued a report on college enrollments in the United States. The report stated that 56% of all college students in 1992 were women. Further, 17% of all college students were aged 35 or older. Overall, 11% of all college students were women aged 35 or older.

(a) You meet a person on campus who is a member of an organization composed of people who are all under the age of 35. What is the probability that this person is a woman?

(b) Are the events "being a woman" and "being aged 35 years or older" independent?

4. Consider two investments that you are looking at for your portfolio. The one-month return for each is normally distributed, with a mean \( \mu = .01 \) (that is, a 1% return). Investment A's return has standard deviation \( \sigma_A = .005 \), while investment B's return has standard deviation \( \sigma_B = .015 \). What is the probability that investment A's return next month will be positive? Is investment B's probability of positive return higher, lower, or the same? Which investment would a rational investor prefer?

5. During World War II many economists, mathematicians, and statisticians were members of Columbia University's Statistics Research Group, which did high level consulting for the armed services.
As part of the group's work, statistician Abraham Wald was approached to provide guidance on where to place armor on airplanes in order to protect them (the armor was heavy, so it couldn't be placed over the entire aircraft). The aircraft engineers had taken a large random sample of aircraft that had returned from military action, and had developed a mapping of where (and how often) bullet holes were found. If you were Wald, what would your advice be about where to put the armor? Why?

6. Say the SAT verbal exam is graded along a strict curve, such that the scores in the population of students have mean 500 and standard deviation 100. If a random sample of 20 high school seniors is taken, what is the probability that the mean of their test scores exceeds 530? What assumptions are you making here?

7. Alcohol-related problems on college campuses are widespread, even though most college students are under the national drinking age of 21. A study funded by the Harvard School of Public Health, released on November 4, 1995, examined some issues relating to college alcohol abuse on campuses in the northeastern United States. The study stated that 50% of college men are binge drinkers, where a binge drinking man is defined as one who consumed five drinks one after the other at least once in the previous two weeks.

(a) Consider 10 randomly selected northeastern U.S. college men. What is the exact probability that at least 9 of these men are binge drinkers?

(b) The applicants for a position include 100 northeastern U.S. college men. What is the probability that at least 35 of them are binge drinkers? An approximate answer is good enough here.

Answers to practice problems

1. Let \( C \) be the preservation cost of a book.

(a) The mean preservation cost per book examined equals

\[ E(C) = \sum_{\text{strategies}} (\text{Cost of strategy}) \, p(\text{strategy}). \]

Thus, for the main stacks,

\[ E(C) = (0)(.6) + (100)(.2) + (200)(.1) + (400)(.1) = 80, \]

while for the Business school library,

\[ E(C) = (0)(.7) + (100)(0) + (200)(.1) + (400)(.2) = 100. \]

Thus, the main stacks building has a lower mean preservation cost per book examined.

(b) Since P(any repair) = 1 - P(no repair), the values are \( 1 - .6 = .4 \) for the main stacks and \( 1 - .7 = .3 \) for the Business school library. Thus, the Business school library has lower probability of making any repair.

(c) By (a), the amount spent per book examined would be \( (80)(p) + (100)(1 - p) \). So, to match this to the spending budget, solve for \( p \):

\[
\begin{aligned}
80p + 100 - 100p &= 95 \\
-20p &= -5 \\
p &= .25
\end{aligned}
\]

So, to meet the budget, choose 25% of the books from the main stacks, and 75% from the Business school library.

For many years, very little was known about the actual condition of books in the world's research libraries, although a good deal of folklore was bandied about. The first large scale, statistically valid, survey of book condition was undertaken at Yale University in the late 1970s and early 1980s. For a discussion of the results of this survey, and the issues raised during the study, see the 1985 paper "The Yale Survey: A Large Scale Study of Book Deterioration in the Yale University Library," by Gay Walker, Jane Greenfield, John Fox and Jeffrey S. Simonoff, in volume 46 of the journal College and Research Libraries, pages 111-132.

2. Incomes are typically long right-tailed; for such data, the median is usually less than the mean, so more than half of the data will almost invariably fall below the mean.

3. Let \( W \) represent a randomly selected student being a woman, and \( T \) represent being aged 35 or over. We are given that \( P(W) = .56 \), \( P(T) = .17 \), and \( P(W \text{ and } T) = .11 \). We can use these to construct a hypothetical 100,000 table to lay out all of the relevant probabilities:

                 Age 35 and older (T)    Under age 35 (T')     Total
   Woman (W)            11,000                45,000           56,000
   Man (W')              6,000                38,000           44,000
   Total                17,000                83,000          100,000
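
The table above can be rebuilt directly from the three given probabilities. The following is a minimal Python sketch of that bookkeeping; the variable names are illustrative and not part of the handout. It fills in the four cells of the hypothetical 100,000-student table, then evaluates the conditional probability and the independence check that parts (a) and (b) ask about.

```python
# Hypothetical population of 100,000 students (illustrative names throughout).
N = 100_000

p_w = 0.56        # P(W): student is a woman
p_t = 0.17        # P(T): student is aged 35 or older
p_w_and_t = 0.11  # P(W and T): woman aged 35 or older

# Fill in the cells of the 2x2 table from the given probabilities.
women_35_plus = round(N * p_w_and_t)
women_under_35 = round(N * p_w) - women_35_plus
men_35_plus = round(N * p_t) - women_35_plus
men_under_35 = N - women_35_plus - women_under_35 - men_35_plus

under_35_total = women_under_35 + men_under_35

# Conditional probability of being a woman, given under age 35.
p_w_given_under_35 = women_under_35 / under_35_total

# Independence would require P(W and T) to equal P(W) * P(T).
independent = abs(p_w_and_t - p_w * p_t) < 1e-9

print(f"Women under 35: {women_under_35}, men under 35: {men_under_35}")
print(f"P(W | under 35) = {p_w_given_under_35:.3f}")
print(f"P(W)P(T) = {p_w * p_t:.4f}, P(W and T) = {p_w_and_t:.4f}")
print("Independent:", independent)
```

Working with counts rather than probabilities mirrors the hypothetical-table approach: every conditional probability becomes a ratio of two cells or margins.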

(a)

\[ P(W \mid T') = \frac{P(W \text{ and } T')}{P(T')} = \frac{45000}{83000} = .542 \]

(b) No, the two events are not independent, since \( P(W \mid T') = .542 \neq P(W) = .56 \); equivalently, \( P(W \text{ and } T) = .11 \neq P(W)P(T) = .095 \). Being under the age of 35 decreases the probability of a randomly selected student being a woman.

4. Let \( A \) be the return of investment A. We're told that \( A \sim N(.01, .005^2) \), so

\[ P(A > 0) = P\left( Z > \frac{0 - .01}{.005} \right) = P(Z > -2) = 1 - P(Z \le -2) = 1 - .0228 = .9772. \]

The corresponding probability for investment B will be lower, since the probability is more spread out in both directions. You could have calculated it (it's .7475), but that's not necessary; all you need to see it is a picture like this:

[Figure: density curves of the returns of investments A and B. Horizontal axis: Return; vertical axis: Density.]

Thus, a rational investor would prefer investment A. Since it has the same expected return as investment B, with less risk (smaller standard deviation), this is just what we would expect. Note also that this simple example demonstrates why modeling the volatility of investments well is a worthy goal: if we were smarter than other investors regarding volatility, we could construct options with positive expected return by going short or long as appropriate.

5. It's tempting to say that the armor should be put in the places with the most bullet holes, but Wald was smart enough to realize that that is not correct; he suggested that they put the armor in the places with the fewest bullet holes. Why? Because if we assume that bullet holes are roughly evenly distributed over the airplane (a not unreasonable assumption), then a lack of bullet holes means that airplanes that were hit in those places never made it back from the military action. That is, this is a biased sample, with airplanes that are more seriously damaged less likely to be in the sample.

6. By the Central Limit Theorem, \( \bar{X} \sim N(\mu, \sigma^2/n) \). So,

\[ P(\bar{X} > 530) = P\left( Z > \frac{530 - 500}{100/\sqrt{20}} \right) = P(Z > 1.34) = .0901. \]

We are assuming that the Central Limit Theorem is valid here. Since scores are probably reasonably symmetric, a sample of size 20 is probably large enough to make this a reasonable assumption.

7. The number of binge drinkers \( B \) among \( n \) men is distributed as a Binomial random variable with \( p = .5 \); the number of trials is \( n = 10 \) in part (a) and \( n = 100 \) in part (b).

(a) We want to calculate an exact Binomial probability, so we use the Binomial probability function,

\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \]

where \( n = 10 \) and \( p = .5 \). In this case, we want

\[ P(B \ge 9) = P(B = 9) + P(B = 10) = \sum_{k=9}^{10} \binom{10}{k} (.5)^k (.5)^{10 - k} = .009766 + .000977 = .01074. \]

(b) Now \( B \sim \text{Bin}(100, .5) \). We use a normal approximation to the binomial to estimate this probability. This requires calculating \( \mu = np = 50 \) and \( \sigma^2 = np(1 - p) = 25 \). Then, using the continuity correction, we have

\[ P(B \ge 35) \approx P\left( Z > \frac{34.5 - 50}{\sqrt{25}} \right) = P(Z > -3.1) = 1 - P(Z < -3.1) = 1 - .0009 = .9991. \]
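
The arithmetic in answers 1, 4, 6, and 7 can be re-derived with a short script. The following is a minimal Python sketch using only the standard library; the helper names `normal_cdf` and `binomial_pmf` and all variable names are illustrative assumptions, not part of the handout. Because the script does not round z to two decimals before evaluating the normal distribution, its results may differ slightly in the last digit or two from the table-based values quoted above.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), computed via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Answer 1: expected cost per book, then the mixing proportion for the budget.
costs = [0, 100, 200, 400]  # no repair, restoration, microfilming, full repair
main_probs = [0.6, 0.2, 0.1, 0.1]
business_probs = [0.7, 0.0, 0.1, 0.2]
mean_main = sum(c * q for c, q in zip(costs, main_probs))
mean_business = sum(c * q for c, q in zip(costs, business_probs))
budget = 95
# Solve mean_main * p + mean_business * (1 - p) = budget for p.
p_main = (mean_business - budget) / (mean_business - mean_main)

# Answer 4: probability of a positive return for each investment.
p_a_positive = 1 - normal_cdf(0, mu=0.01, sigma=0.005)
p_b_positive = 1 - normal_cdf(0, mu=0.01, sigma=0.015)

# Answer 6: sample mean of 20 scores exceeds 530 (Central Limit Theorem).
p_mean_above_530 = 1 - normal_cdf(530, mu=500, sigma=100 / sqrt(20))

# Answer 7(a): exact probability that at least 9 of 10 men are binge drinkers.
p_at_least_9 = sum(binomial_pmf(k, 10, 0.5) for k in (9, 10))

# Answer 7(b): normal approximation with continuity correction, n = 100.
n_men, p_binge = 100, 0.5
mu_b = n_men * p_binge
sigma_b = sqrt(n_men * p_binge * (1 - p_binge))
p_at_least_35_approx = 1 - normal_cdf(34.5, mu=mu_b, sigma=sigma_b)
# Exact binomial tail, included only for comparison with the approximation.
p_at_least_35_exact = sum(binomial_pmf(k, n_men, p_binge) for k in range(35, n_men + 1))

print(f"1: E(C) main = {mean_main:.0f}, business = {mean_business:.0f}, p = {p_main:.2f}")
print(f"4: P(A > 0) = {p_a_positive:.4f}, P(B > 0) = {p_b_positive:.4f}")
print(f"6: P(mean > 530) = {p_mean_above_530:.4f}")
print(f"7(a): P(B >= 9) = {p_at_least_9:.5f}")
print(f"7(b): approx = {p_at_least_35_approx:.4f}, exact = {p_at_least_35_exact:.4f}")
```

The error function is used here so that no third-party package is needed; a statistics library would work equally well for the normal and binomial calculations.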