Chapter 5 Basic Probability


Probability is the likelihood that a particular event will occur.

Probability of occurrence = X / T, where X = the number of ways in which a particular event occurs and T = the total number of possible outcomes.

Joint probability refers to the probability of an occurrence involving two or more events.
The marginal probability of an event consists of a set of joint probabilities.
Two events are mutually exclusive if they cannot both occur simultaneously.
Conditional probability refers to the probability of event A, given information about the occurrence of another event B.

Example 1 (joint probability): Find the probability that a randomly selected household that purchased a big-screen television also purchased a plasma-screen television and a DVR.

                          Purchased DVR
Purchased Plasma Screen   Yes    No     Total
Plasma Screen              38     42      80
Not Plasma Screen          70    150     220
Total                     108    192     300

P(Plasma Screen and DVR) = number of households that purchased a plasma screen and a DVR / total number of big-screen television purchasers. From the table, 38 households purchased both a plasma screen and a DVR.

P(Plasma Screen and DVR) = 38 / 300 = 0.127 or 12.7%.

Example 2 (marginal probabilities and the general addition rule): Referring to Example 1, find the probability that a household purchased a plasma screen or a DVR.

P(Plasma Screen or DVR) = P(Plasma Screen) + P(DVR) - P(Plasma Screen and DVR). The two marginal probabilities come from the Total row and column of the table, and we have to remember to subtract the probability of those who purchased both the plasma screen and the DVR, because they are counted twice.

P(Plasma Screen or DVR) = (80 / 300) + (108 / 300) - (38 / 300) = 150 / 300 = 0.5 or 50%.

Example 3 (conditional probability): Referring to Example 1, find the probability that a household purchased a DVR, given that the household purchased a plasma-screen television.

P(DVR | Plasma Screen) = number of households that purchased a plasma screen and a DVR / number of households that purchased a plasma screen

P(DVR | Plasma Screen) = 38 / 80 = 0.475 or 47.5%.

Another way of solving this question is to use the formula P(B | A) = P(A and B) / P(A), where A = the household purchased a plasma screen and B = the household purchased a DVR.

P(B | A) = (38 / 300) / (80 / 300) = 0.475 or 47.5%.

Example 4 (multiplication rule): Referring to Example 1, suppose three households are randomly selected, without replacement, from the 80 households that purchased a plasma-screen television. Find the probability that all three households purchased a DVR.

P(all three purchased a DVR) = (38 / 80) x (37 / 79) x (36 / 78) = 0.1027 or 10.27%.

Notice how the numerator and the denominator each decrease by one at every step: after we select the first household, there are only 37 households left that purchased a DVR and 79 households left in total.
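The four examples above can be cross-checked with a short Python sketch. It only re-uses the counts from the Example 1 table; the dictionary keys and variable names are illustrative choices, not part of the original tip sheet.

```python
from fractions import Fraction

# Counts from the Example 1 table (rows: plasma screen yes/no, columns: DVR yes/no).
counts = {
    ("plasma", "dvr"): 38,
    ("plasma", "no_dvr"): 42,
    ("no_plasma", "dvr"): 70,
    ("no_plasma", "no_dvr"): 150,
}
total = sum(counts.values())  # 300 households

# Marginal counts are sums of joint counts.
plasma = counts[("plasma", "dvr")] + counts[("plasma", "no_dvr")]  # 80
dvr = counts[("plasma", "dvr")] + counts[("no_plasma", "dvr")]     # 108
both = counts[("plasma", "dvr")]                                   # 38

# Example 1: joint probability.
p_joint = Fraction(both, total)
# Example 2: general addition rule, P(A or B) = P(A) + P(B) - P(A and B).
p_union = Fraction(plasma, total) + Fraction(dvr, total) - p_joint
# Example 3: conditional probability, P(B | A) = P(A and B) / P(A).
p_dvr_given_plasma = p_joint / Fraction(plasma, total)
# Example 4: three draws without replacement from the 80 plasma-screen households.
p_all_three = (
    Fraction(both, plasma)
    * Fraction(both - 1, plasma - 1)
    * Fraction(both - 2, plasma - 2)
)

for name, value in [
    ("joint", p_joint),
    ("union", p_union),
    ("conditional", p_dvr_given_plasma),
    ("all three", p_all_three),
]:
    print(f"{name}: {float(value):.4f}")
```

Using Fraction keeps the intermediate values exact, so the only rounding happens when the results are printed.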

Chapter 6 Discrete Probability Distributions

Random variable: a variable whose value is determined by the outcome of a random experiment.
Discrete random variable: a random variable whose possible values are discrete or individual values.
Discrete probability distribution: a table that shows the probability associated with each possible value of a discrete random variable.

For a binomial random variable with sample size n and probability of success p:
Mean (µ) = n x p
Variance (σ²) = n x p x (1 - p)
Standard deviation (σ) = square root of [n x p x (1 - p)], that is, the square root of the variance

Example 1: Find the probability distribution and the expected value of the number of tails when tossing a fair coin three times.

Let us first list all the possible outcomes (our sample space), remembering that the probabilities have to total 1 or 100%.

S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT} and let X = the number of tails.

X    P(X)
0    1 / 8 = 0.125
1    3 / 8 = 0.375
2    3 / 8 = 0.375
3    1 / 8 = 0.125

Now that we have the probability distribution, we can find the expected value in one of two ways. The first method is to multiply each X by its P(X) and add the products together:

µ = E(X) = (0)(0.125) + (1)(0.375) + (2)(0.375) + (3)(0.125) = 1.5

The second method is to enter the X values into List 1 on your calculator, enter the P(X) values into List 2, and then calculate the mean through the CALC and 1-variable functions.

Example 2: A company has determined that the probability of a unit being successful in Brampton is 3/4, with an annual profit of $150,000 if successful and a loss of $80,000 if unsuccessful. In Mississauga, the probability of attaining a profit of $240,000 is 1/2, but the potential loss amounts to $48,000. Where should the company locate to maximize expected profit, and which of the two locations is riskier?

Brampton                    Mississauga
Profit       Prob.          Profit       Prob.
$150,000     3/4 = 0.75     $240,000     2/4 = 0.50
-$80,000     1/4 = 0.25     -$48,000     2/4 = 0.50

Step 1) Enter Profit into List 1
Step 2) Enter Prob. into List 2
Step 3) CALC 1-VAR

Brampton: mean = $92,500 (expected profit) and standard deviation = $99,592.92
Mississauga: mean = $96,000 (expected profit) and standard deviation = $144,000

Now we calculate the coefficient of variation (CV) to determine which location is riskier.

CV Brampton = (σ / µ) x 100 = ($99,592.92 / $92,500) x 100 = 107.67%
CV Mississauga = (σ / µ) x 100 = ($144,000 / $96,000) x 100 = 150%
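The calculator steps above amount to a probability-weighted mean and standard deviation. The following is a minimal Python sketch of the same computation; the helper name summarize is hypothetical and simply stands in for the calculator's 1-variable statistics with List 2 used as the weights.

```python
from math import sqrt

def summarize(values, probs):
    """Return (mean, standard deviation, CV in percent) of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    sd = sqrt(variance)
    return mean, sd, sd / mean * 100

# Example 2: profit outcomes and their probabilities for each location.
brampton = summarize([150_000, -80_000], [0.75, 0.25])
mississauga = summarize([240_000, -48_000], [0.50, 0.50])

# Example 1: number of tails in three tosses of a fair coin.
coin = summarize([0, 1, 2, 3], [0.125, 0.375, 0.375, 0.125])

for name, (mean, sd, cv) in [("Brampton", brampton), ("Mississauga", mississauga)]:
    print(f"{name}: mean = {mean:,.2f}, sd = {sd:,.2f}, CV = {cv:.2f}%")
print(f"Expected number of tails: {coin[0]}")
```

The same function reproduces the expected value of 1.5 tails from Example 1, since that is also a discrete distribution given as values and probabilities.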

Therefore, the company should locate in Mississauga because it has the higher expected profit of $96,000, but Mississauga is also the riskier location because its coefficient of variation (150%) is higher than Brampton's (107.67%).

Example 3 (binomial distribution): If 100 customers are sampled and the probability that any one of them prefers to pay their telephone bill online is 40%, what is the probability that at least 30 customers pay their telephone bills online?

Since each trial is independent, we can use the binomial distribution to solve for the probability. The Bpd function would only give us the probability of an exact number of customers, so we have to use the Bcd function, which gives the cumulative probability up to a certain number of customers.

DIST BINM Bcd (x = 29, n = 100, p = 0.40) = P(X ≤ 29)

We want P(X ≥ 30), and the calculator always gives us the probability of less than or equal to a certain number, so we have to calculate 1 - P(X ≤ 29) = P(X ≥ 30).

P(X ≤ 29) = 0.01477531
P(X ≥ 30) = 1 - 0.01477531 = 0.98522469

Type of question               Function   What to enter
Exactly 10                     Bpd        x = 10
More than 10                   Bcd        x = 10, then subtract the result from 1
At least 10                    Bcd        x = 9, then subtract the result from 1
Less than 10                   Bcd        x = 9
At most or no more than 10     Bcd        x = 10

*The calculator only tells us the probability of the left side, that is, less than or equal to the number entered. If we are asked for the probability of more than 10, we calculate 1 - P(X ≤ 10). The probability of at least 10, P(X ≥ 10), is found by 1 - P(X ≤ 9).

Example 4 (Poisson distribution): A computer in Mark's hardware store breaks down an average of 2 times per month. What is the probability that the computer will break down 15 times this year?

This question gives us neither a sample size nor a probability of success, only an average rate of occurrence, so we use the Poisson distribution to solve it. We also have to use the Ppd function instead of the Pcd (cumulative distribution) function because we are looking for exactly 15 breakdowns, not more than 15, less than or equal to 15, or at least 15. The mean must cover the same period as the question, so we convert 2 breakdowns per month into 2 x 12 = 24 breakdowns per year.

DIST F6 POISN Ppd (x = 15, µ = 2 x 12 = 24) = 0.01457476

Chapter 7 The Normal Distribution

The Empirical Rule tells us approximately what percentage of bell-shaped data lies within a given number of standard deviations of the mean, as shown in the table below:

Within ±1 standard deviation     Approximately 68% of the data
Within ±2 standard deviations    Approximately 95% of the data
Within ±3 standard deviations    Approximately 99.7% of the data

The Chebyshev Rule: for any data set, regardless of shape or distribution, the percentage of values found within k standard deviations of the mean must be at least (1 - 1/k²) x 100%. This rule applies to any value of k greater than 1. For example, if k = 2, we get 75%, which means that at least 75% of the values in the data must be found within ±2 standard deviations of the mean.
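The calculator results for the binomial and Poisson examples in Chapter 6, together with the Chebyshev bound above, can be cross-checked with a short Python sketch. The function names binom_cdf, poisson_pmf and chebyshev_min_percent are hypothetical stand-ins for the calculator's Bcd and Ppd functions and for the Chebyshev formula; they are written from the textbook definitions, not taken from any calculator manual.

```python
from math import comb, exp, factorial

def binom_cdf(x, n, p):
    """P(X <= x) for a binomial random variable (the role Bcd plays on the calculator)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable (the role Ppd plays on the calculator)."""
    return exp(-mu) * mu**x / factorial(x)

def chebyshev_min_percent(k):
    """Minimum percentage of values within k standard deviations of the mean (k > 1)."""
    return (1 - 1 / k**2) * 100

# Binomial example: at least 30 of 100 customers, p = 0.40.
p_at_least_30 = 1 - binom_cdf(29, 100, 0.40)

# Poisson example: exactly 15 breakdowns in a year at 2 per month.
p_15_breakdowns = poisson_pmf(15, 2 * 12)

print(f"P(X >= 30) = {p_at_least_30:.6f}")
print(f"P(X = 15)  = {p_15_breakdowns:.8f}")
print(f"Chebyshev bound for k = 2: {chebyshev_min_percent(2)}%")
```

Note that the "at least" case subtracts the cumulative probability up to 29, not 30, which mirrors the rule in the function-selection table.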

Example 1 (normal distribution): Suppose that during any hour in a large department store, the average number of shoppers is 448, with a standard deviation of 21 shoppers. What is the probability that a random sample of 49 different shopping hours will yield a sample mean between 441 and 446 shoppers inclusive?

As we are provided with a mean (average), a standard deviation and a sample size, we can use the normal distribution to solve for the probability. Because the question is about the mean of a sample of n = 49, we cannot use the standard deviation directly; we need the standard error, which is the standard deviation divided by the square root of the sample size: 21 / (square root of 49) = 21 / 7 = 3.

DIST NORM Ncd (lower = 441, upper = 446, σ = 3, µ = 448) = 0.2426772

Example 2 (normal distribution): If the 25% of customers with the largest bank account balances receive a free entry into a contest where they can win a trip to Europe, and the balances are normally distributed with a mean balance of $4,300 and a standard deviation of $720, what is the minimum balance required to get a free entry into this contest?

This question is not asking us for a probability but for a minimum value, which we can find with the inverse normal function on the calculator because we are given an area. Since the largest 25% of balances lie in the right tail, we use the right tail with an area of 0.25 to find the minimum balance. If we were asked for the maximum balance of the lowest group, we would use the left tail.

DIST NORM InvN (Right, area = 0.25, σ = 720, µ = 4300) = $4,785.63

The chart below shows what information to look for in a question for each type of distribution.
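Examples 1 and 2 above can be cross-checked with Python's standard library. This is a sketch that assumes statistics.NormalDist (available from Python 3.8) as a stand-in for the calculator's Ncd and InvN functions; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 1: the sample mean of n = 49 hours has standard error sigma / sqrt(n).
standard_error = 21 / sqrt(49)  # 3.0
sample_mean_dist = NormalDist(mu=448, sigma=standard_error)
p_between = sample_mean_dist.cdf(446) - sample_mean_dist.cdf(441)

# Example 2: the top 25% of balances start where the left-side area is 1 - 0.25.
balances = NormalDist(mu=4300, sigma=720)
min_balance = balances.inv_cdf(1 - 0.25)

print(f"P(441 <= sample mean <= 446) = {p_between:.7f}")
print(f"Minimum balance for a free entry = ${min_balance:,.2f}")
```

inv_cdf always works with the area to the left of the cutoff, so a right-tail area of 0.25 is entered as 0.75.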
Distribution   Variable (X)   Sample (N)   Probability   Mean (Average)   Standard Deviation   Area
Binomial       yes            yes          yes
Poisson        yes                                       yes
Normal         yes            yes*                       yes              yes                  yes

*For a normal distribution, if we are given a sample size in the question, it may pertain to the standard deviation, in which case we have to convert the standard deviation into the standard error (standard deviation / square root of N) and then use that number to solve the question.

Chapter 8 Sampling & Sampling Distributions

Simple Random Sample: every item from a frame has the same chance of selection as every other item (example: picking apples out of a basket).

Sampling with Replacement: after you select an item, you return it to the frame, where it has the same probability of being selected again (example: picking an apple out of a basket, putting it back in and picking another apple out of the basket).

Sampling without Replacement: once you select an item, you cannot select it again.

Sampling Distribution: the distribution of the results if you actually selected all possible samples.

Sampling Distribution of the Mean: the distribution of all possible sample means if you select all possible samples of a given size.

Systematic Sample: to select a systematic sample, you choose the first item at random from the first k items in the frame, where k = N / n, N = the number of items in the frame and n = the sample size (the frame is partitioned into n groups of k items). You then select every k-th item after the first.

Stratified Sample: first you divide the N items in the frame into separate subpopulations, or strata, and then select a random sample from each stratum. This is more efficient than either simple random sampling or systematic sampling because you are ensured of the representation of items across the entire population.

Cluster Sample: you divide the N items in the frame into several clusters so that each cluster is representative of the entire population, and then sample from a random selection of clusters. This is often more cost-effective than simple random sampling, particularly if the population is spread over a wide geographic region.

The Central Limit Theorem states that as the sample size gets large enough, the sampling distribution of the mean is approximately normally distributed, regardless of the shape of the distribution of the individual values in the population. As a general rule, when the sample size is at least 30, the sampling distribution of the mean is approximately normal.

Population Proportion (π): the proportion of items in the entire population with the characteristic of interest. It is estimated by the sample proportion (p): for example, if 3 customers out of a sample of 10 preferred your hair product, then we divide 3 / 10 to get a sample proportion of 0.30.

Example 1 (determining the upper and lower limits of an interval): If the average weight of the cereal boxes is 368 grams and the standard deviation is 15 grams, find an interval symmetrically distributed around the population mean that will include 95% of the sample means based on a sample of 25 cereal boxes.

When looking for an interval, we have to find the lower boundary and the upper boundary between which a certain percentage (95% in this example) of our sample means lie. Our first step is to use the inverse normal function to find the z value.
Our tail will be central because our interval is symmetrically distributed, our area will be 0.95, and because we are solving for a z value we use the standard normal distribution, so our standard deviation is equal to 1 and our mean is equal to 0.

InvN (Center, 0.95, 1, 0) = ±1.96

The negative value is used for the lower limit and the positive value for the upper limit. Now we can solve for the limits of the interval:

L = mean + (negative z) x (standard error of the mean) = 368 + [(-1.96) x 15 / square root of 25] = 368 - 5.88
U = mean + (positive z) x (standard error of the mean) = 368 + [(1.96) x 15 / square root of 25] = 368 + 5.88

L = 362.12 and U = 373.88

Therefore, 95% of all the sample means based on samples of 25 cereal boxes are between 362.12 and 373.88 grams.

Example 2 (finding Z for the sampling distribution of the proportion): Suppose that a local branch manager of a bank determines that 40% of all depositors have multiple accounts with the bank. If you select a random sample of 200 depositors, the probability that the sample proportion of depositors with multiple accounts is less than 0.30 is calculated as follows:

Z = (p - π) / square root of [π x (1 - π) / n]
Z = (0.30 - 0.40) / square root of [(0.40 x 0.60) / 200]
Z = -0.10 / 0.0346 = -2.89

If we look at the table of z-score values under the normal curve, we can determine that the area under the curve to the left of -2.89 is equal to 0.0019. Therefore, if the true proportion of items in the population is 0.40, then only 0.19% of the samples of n = 200 would be expected to have sample proportions less than 0.30.
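Both Chapter 8 examples can be cross-checked with a short Python sketch. It assumes statistics.NormalDist (Python 3.8 or later) in place of the calculator's InvN function and the printed z table; the variable names are illustrative. Because the sketch does not round z to two decimal places, the final probability differs slightly from the table value of 0.0019.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Example 1: central 95% interval for the mean of 25 cereal boxes.
z = std_normal.inv_cdf(0.5 + 0.95 / 2)  # upper z value for a central area of 0.95
standard_error = 15 / sqrt(25)
lower = 368 - z * standard_error
upper = 368 + z * standard_error

# Example 2: z score and left-tail probability for a sample proportion.
pi_pop, n, p_sample = 0.40, 200, 0.30
z_prop = (p_sample - pi_pop) / sqrt(pi_pop * (1 - pi_pop) / n)
p_less_than = std_normal.cdf(z_prop)

print(f"z = {z:.2f}, L = {lower:.2f}, U = {upper:.2f}")
print(f"Z = {z_prop:.2f}, P(p < 0.30) = {p_less_than:.4f}")
```

A central area of 0.95 leaves 0.025 in each tail, which is why the upper z value is read at a left-side area of 0.975.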