Probability & Sampling: The Practice of Statistics 4e, Mostly Chpts 5-7


Probability & Sampling The Practice of Statistics 4e Mostly Chpts 5 7 Lew Davidson (Dr.D.) Mallard Creek High School Lewis.Davidson@cms.k12.nc.us 704-786-0470

Probability & Sampling: The Practice of Statistics 4e, Mostly Chpts 5-7
Classical Probability (Ch 5)
Random Variables (Ch 6)
Sampling Theory (Ch 7)
Sampling Distributions (Chpts. 8-12) (Basis for all of Inference)

Section 5.1-5.3 Classical Probability Learning Objectives Probabilities from Two-Way Tables, Venn and Tree diagrams

Probability Rules Summary - 1
1. Random process: P(event) = long-term relative frequency of occurrence.
2. $P(A) = \dfrac{\text{number of weighted outcomes corresponding to event } A}{\text{total number of weighted outcomes in the sample space}}$
3. A probability model lists the possible outcomes in the sample space S and the probability of each (discrete variables).
4. For any event A: $0 \le P(A) \le 1$, and $P(\text{sample space}) = P(S) = 1$.
5. $P(\text{complement}) = P(A^C) = 1 - P(A)$
6. A two-way table or a Venn diagram can be used to display the sample space for a chance process.

Probability Rules Continued - 2
7. Intersection $(A \cap B)$ of A and B = all outcomes in both A and B.
8. Union $(A \cup B)$ = all outcomes in event A, event B, or both.
9. $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$
10. Events A and B are mutually exclusive (disjoint) if they have no outcomes in common. If so: $P(A \cap B) = 0$ and $P(A \text{ or } B) = P(A) + P(B)$.
11. Two-way tables, tree diagrams, and Venn diagrams can be used to display the sample space for a chance process.

Probability Rules Continued - 3
12. Intersection $(A \cap B)$ of A and B = all outcomes in both A and B.
13. Union $(A \cup B)$ = all outcomes in event A, event B, or both.
14. $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$
15. Events A and B are mutually exclusive (disjoint) if they have no outcomes in common. If so: $P(A \cap B) = 0$ and $P(A \text{ or } B) = P(A) + P(B)$.
16. If A and B are independent: $P(A \cap B) = P(A \mid B)\,P(B) = P(A)\,P(B)$
17. Check: Are disjoint events independent?
a. Yes
b. No
c. Not necessarily
d. Not enough information to say

Example: Grade Distributions (Two-Way Table)
E: the grade comes from an Eng. & Phy Sci course
L: the grade is lower than a B
Totals by school: 6300, 1600 (E), 2100. Totals by grade: 3392, 2952, 3656 (L). Overall total: 10000.
Find $P(L)$, $P(E \mid L)$, and $P(L \mid E)$.
$P(L) = 3656 / 10000 = 0.3656$
$P(E \mid L) = 800 / 3656 = 0.2188$
$P(L \mid E) = 800 / 1600 = 0.5000$
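The three answers above can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the counts are the ones quoted in the example (the full table body is not reproduced here).

```python
# Counts quoted in the grade-distribution example
total = 10000          # all grades in the table
lower_than_b = 3656    # event L: grade is lower than a B
eng_phys = 1600        # event E: grade comes from an Eng. & Phy Sci course
both = 800             # E and L together

p_l = lower_than_b / total            # marginal probability P(L)
p_e_given_l = both / lower_than_b     # conditional probability P(E | L)
p_l_given_e = both / eng_phys         # conditional probability P(L | E)

print(round(p_l, 4), round(p_e_given_l, 4), round(p_l_given_e, 4))
```

Note that the two conditional probabilities share a numerator (the 800 grades in both E and L) and differ only in which total they divide by.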

Example: What percent of all adult Internet users visit video-sharing sites?
$P(\text{video yes} \cap \text{18 to 29}) = 0.27 \times 0.70 = 0.1890$
$P(\text{video yes} \cap \text{30 to 49}) = 0.45 \times 0.51 = 0.2295$
$P(\text{video yes} \cap \text{50+}) = 0.28 \times 0.26 = 0.0728$
$P(\text{video yes}) = 0.1890 + 0.2295 + 0.0728 = 0.4913$
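The tree-diagram calculation can be sketched in Python as a sum over branches. The dictionary layout and names below are illustrative assumptions. In particular, the first number in each pair is read as the share of adult Internet users in that age group and the second as the proportion of that group who visit video-sharing sites.

```python
# (share of users in age group, P(video yes | age group)); this reading is an assumption
branches = {
    "18 to 29": (0.27, 0.70),
    "30 to 49": (0.45, 0.51),
    "50+": (0.28, 0.26),
}

# Multiply along each branch, then add across branches
joint = {age: share * rate for age, (share, rate) in branches.items()}
p_video = sum(joint.values())

for age, p in joint.items():
    print(age, round(p, 4))
print("P(video yes) =", round(p_video, 4))
```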

General Multiplication Rule
$P(A \cap B) = P(A \text{ and } B) = P(A \text{ intersect } B)$
The probability that events A and B both occur can be found using the general multiplication rule
$P(A \cap B) = P(A) \cdot P(B \mid A)$
where $P(B \mid A)$ is the conditional probability that event B occurs given that event A has already occurred.

Example: Teens with Online Profiles
93% of teenagers use the Internet, and 55% of online teens have posted a profile on a social-networking site.
a) Draw a tree diagram for this situation.
b) What percent of teens are online and have posted a profile?
$P(\text{online}) = 0.93$
$P(\text{profile} \mid \text{online}) = 0.55$
$P(\text{online and have profile}) = P(\text{online}) \cdot P(\text{profile} \mid \text{online}) = (0.93)(0.55) = 0.5115$
51.15% of teens are online and have posted a profile.

Conditional Probability Formula
General Multiplication Rule: $P(A \cap B) = P(A) \cdot P(B \mid A)$
To find the conditional probability $P(B \mid A)$, use the formula
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$

What is the probability that a randomly selected resident who reads USA Today also reads the New York Times? (Here A = reads USA Today and B = reads the New York Times.)
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$, with $P(A \cap B) = 0.05$ and $P(A) = 0.40$
$P(B \mid A) = \dfrac{0.05}{0.40} = 0.125$
There is a 12.5% chance that a randomly selected resident who reads USA Today also reads the New York Times.

Summary
18. If one event has happened, the chance that another event will happen is a conditional probability. $P(B \mid A)$ represents the probability that event B occurs given that event A has occurred.
19. Events A and B are independent if the chance that event B occurs is not affected by whether event A occurs. If two events are mutually exclusive (disjoint), they cannot be independent.
20. When chance behavior involves a sequence of outcomes, a tree diagram can be used to describe the sample space.
21. The general multiplication rule states that the probability of events A and B occurring together is $P(A \cap B) = P(A) \cdot P(B \mid A)$.
22. In the special case of independent events: $P(A \cap B) = P(A) \cdot P(B)$.
23. The conditional probability formula is $P(B \mid A) = P(A \cap B) / P(A)$.
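To see why disjoint events with nonzero probability cannot be independent, here is a small Python sketch. The fair-die setup and the events A and B are a hypothetical illustration, not part of the slides.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {1, 2}                          # hypothetical event A
B = {3, 4}                          # hypothetical event B, disjoint from A

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

p_a_and_b = prob(A & B)             # no shared outcomes, so this is 0
p_a_times_p_b = prob(A) * prob(B)   # what independence would require
independent = p_a_and_b == p_a_times_p_b

print(p_a_and_b, p_a_times_p_b, independent)
```

Knowing that A happened tells us B did not, which is the opposite of independence.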

Review
http://tinyurl.com/drd-inference
http://stattrek.com/ap-statistics/test-preparation.aspx
http://tinyurl.com/drd-mc

Discrete and Continuous Random Variables
Example: Random Variable and Probability Distribution
A probability model describes the possible outcomes of a chance process and the likelihood that those outcomes will occur. A numerical variable that describes the outcomes of a chance process is called a random variable. The probability model for a random variable is its probability distribution.
Definition: A random variable takes numerical values that describe the outcomes of some chance process. The probability distribution of a random variable gives its possible values and their probabilities.
Consider tossing a fair coin 3 times. Define X = the number of heads obtained.
X = 0: TTT
X = 1: HTT THT TTH
X = 2: HHT HTH THH
X = 3: HHH
Value:       0    1    2    3
Probability: 1/8  3/8  3/8  1/8
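The coin-toss distribution can be rebuilt by listing every outcome, as in this short Python sketch (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely outcomes of tossing a fair coin 3 times
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution of X as exact fractions
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```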

There are two main types of random variables: discrete and continuous. If we can list all possible outcomes for a random variable and assign probabilities to each one, we have a discrete random variable.
Discrete Random Variables and Their Probability Distributions
A discrete random variable X takes a fixed set of possible values with gaps between them. The probability distribution of a discrete random variable X lists the values $x_i$ and their probabilities $p_i$:
Value:       $x_1$  $x_2$  $x_3$  ...
Probability: $p_1$  $p_2$  $p_3$  ...
The probabilities $p_i$ must satisfy two requirements:
1. Every probability $p_i$ is a number between 0 and 1.
2. The sum of the probabilities is 1.
To find the probability of any event, add the probabilities $p_i$ of the particular values $x_i$ that make up the event.

Discrete random variables usually arise from counting. Continuous random variables usually arise from measuring.
A continuous random variable X takes on all values in an interval of numbers, so it has infinitely many possible values. The probability distribution of X is a density curve (total area = 1, curve never negative). The probability of any event is the area under the density curve over the corresponding interval.

Define Y as the height of a randomly chosen young woman. Y is a continuous random variable whose probability distribution is N(64, 2.7). What is the probability that a randomly chosen young woman has height between 68 and 70 inches?
$P(68 \le Y \le 70) = ?$
$z = \dfrac{68 - 64}{2.7} = 1.48 \qquad z = \dfrac{70 - 64}{2.7} = 2.22$
$P(1.48 \le Z \le 2.22) = P(Z \le 2.22) - P(Z \le 1.48) = 0.9868 - 0.9306 = 0.0562$
There is about a 5.6% chance that a randomly chosen young woman has a height between 68 and 70 inches.
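As a check on the table lookup, the same area can be computed with Python's standard-library `statistics.NormalDist`. This is a sketch rather than the method used on the slide. Because it does not round the z-scores to two decimals, its answer may differ slightly from 0.0562 in the last digit or two.

```python
from statistics import NormalDist

# Y ~ N(64, 2.7): height of a randomly chosen young woman, in inches
height = NormalDist(mu=64, sigma=2.7)

# Area under the density curve between 68 and 70 inches
p_between = height.cdf(70) - height.cdf(68)

print(round(p_between, 3))  # roughly 0.056
```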

Probability of an Exact value in a continuous distribution? P ( x = 3) =?

Summary: Section 6.1 Discrete and Continuous Random Variables
The mean of a random variable is the long-run average value of the variable after many repetitions of the chance process. It is also known as the expected value of the random variable. The expected value of a discrete random variable X is
$\mu_X = E(X) = \sum x_i p_i = x_1 p_1 + x_2 p_2 + x_3 p_3 + \cdots$
The variance of a random variable is the average squared deviation of the values of the variable from their mean. The standard deviation is the square root of the variance. For a discrete random variable X,
$\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + (x_3 - \mu_X)^2 p_3 + \cdots$
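The two formulas translate directly into code. The sketch below applies them to X = the number of heads in 3 tosses of a fair coin (values 0 to 3 with probabilities 1/8, 3/8, 3/8, 1/8); the function names are illustrative.

```python
from math import sqrt

def mean_discrete(values, probs):
    """Expected value: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def variance_discrete(values, probs):
    """Variance: sum of (x_i - mu)^2 * p_i."""
    mu = mean_discrete(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

mu_x = mean_discrete(values, probs)
var_x = variance_discrete(values, probs)
sd_x = sqrt(var_x)

print(mu_x, var_x, round(sd_x, 3))
```

For this distribution the mean is 1.5 heads and the variance is 0.75, so the standard deviation is about 0.866.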

Linear Transformations
1. Adding (or subtracting) a constant, a, to each observation:
Adds a to measures of center and location.
Does not change the shape or measures of spread.
2. Multiplying (or dividing) each observation by a constant, b:
Multiplies (divides) measures of center and location by b.
Multiplies (divides) measures of spread by |b|.
Does not change the shape of the distribution.

Transforming and Combining Random Variables
Linear Transformations
How does adding or subtracting a constant affect a random variable?
Effect on a Random Variable of Adding (or Subtracting) a Constant
Adding the same number a (which could be negative) to each value of a random variable:
Adds a to measures of center and location (mean, median, quartiles, percentiles).
Does not change measures of spread (range, IQR, standard deviation).
Does not change the shape of the distribution.

Linear Transformations
How does multiplying or dividing by a constant affect a random variable?
Effect on a Random Variable of Multiplying (Dividing) by a Constant
Multiplying (or dividing) each value of a random variable by a number b:
Multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b.
Multiplies (divides) measures of spread (range, IQR, standard deviation) by |b|.
Does not change the shape of the distribution.
Note: Multiplying a random variable by a constant b multiplies the variance by $b^2$.

Linear Transformations
Whether we are dealing with data or random variables, the effects of a linear transformation are the same.
Effect of a Linear Transformation on the Mean and Standard Deviation
If Y = a + bX is a linear transformation of the random variable X, then:
The probability distribution of Y has the same shape as the probability distribution of X.
$\mu_Y = a + b\mu_X$
$\sigma_Y = |b|\,\sigma_X$ (since b could be a negative number).

Combining Random Variables
As the preceding example illustrates, when we add two independent random variables, their variances add. Standard deviations do not add.
Variance of the Sum of Random Variables
For any two independent random variables X and Y, if T = X + Y, then the variance of T is
$\sigma_T^2 = \sigma_X^2 + \sigma_Y^2$
In general, the variance of the sum of several independent random variables is the sum of their variances.
Remember that you can add variances only if the two random variables are independent, and that you can NEVER add standard deviations!

Combining Random Variables
We can perform a similar investigation to determine what happens when we define a random variable as the difference of two random variables. In summary, we find the following:
Mean of the Difference of Random Variables
For any two random variables X and Y, if D = X - Y, then the expected value of D is
$E(D) = \mu_D = \mu_X - \mu_Y$
In general, the mean of the difference of several random variables is the difference of their means. The order of subtraction is important!
Variance of the Difference of Random Variables
For any two independent random variables X and Y, if D = X - Y, then the variance of D is
$\sigma_D^2 = \sigma_X^2 + \sigma_Y^2$
In general, the variance of the difference of two independent random variables is the sum of their variances.

Combining Normal Random Variables
So far, we have concentrated on finding rules for means and variances of random variables. If a random variable is Normally distributed, we can use its mean and standard deviation to compute probabilities. An important fact about Normal random variables is that any sum or difference of independent Normal random variables is also Normally distributed.
Example
Mr. Starnes likes between 8.5 and 9 grams of sugar in his hot tea. Suppose the amount of sugar in a randomly selected packet follows a Normal distribution with mean 2.17 g and standard deviation 0.08 g. If Mr. Starnes selects 4 packets at random, what is the probability his tea will taste right?
Let X = the amount of sugar in a randomly selected packet. Then $T = X_1 + X_2 + X_3 + X_4$. We want to find $P(8.5 \le T \le 9)$.
$\mu_T = \mu_{X_1} + \mu_{X_2} + \mu_{X_3} + \mu_{X_4} = 2.17 + 2.17 + 2.17 + 2.17 = 8.68$
$\sigma_T^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 + \sigma_{X_3}^2 + \sigma_{X_4}^2 = (0.08)^2 + (0.08)^2 + (0.08)^2 + (0.08)^2 = 0.0256$
$\sigma_T = \sqrt{0.0256} = 0.16$
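The slide stops once the mean and standard deviation of T are found. One way to finish the calculation is sketched below with Python's `statistics.NormalDist`, under the assumption stated in the example that the four packets are independent and each follows N(2.17, 0.08). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_x, sigma_x, n_packets = 2.17, 0.08, 4

# Means add; variances add because the packets are independent
mu_t = n_packets * mu_x
sigma_t = sqrt(n_packets * sigma_x ** 2)

# A sum of independent Normal variables is Normal
total_sugar = NormalDist(mu=mu_t, sigma=sigma_t)
p_tastes_right = total_sugar.cdf(9) - total_sugar.cdf(8.5)

print(round(mu_t, 2), round(sigma_t, 2), round(p_tastes_right, 2))
```

Under these assumptions the tea tastes right roughly 85% of the time.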

Could we have solved this problem by saying that T = 4X, since there were 4 packets of sugar?
$\mu_T = \mu_{X_1} + \mu_{X_2} + \mu_{X_3} + \mu_{X_4} = 2.17 + 2.17 + 2.17 + 2.17 = 8.68$
$\mu_T = \mu_{X+X+X+X} = 4\mu_X = 4(2.17) = 8.68$
So this is OK for the expected values E(X)!

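Could we have solved this problem by considering T = 4X, since there were 4 packets of sugar?
$\sigma_T^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 + \sigma_{X_3}^2 + \sigma_{X_4}^2 = 4(0.08)^2 = 0.0256$
So $\sigma_T = \sqrt{0.0256} = 0.16$, but $\sigma_{4X} = 4\sigma_X = 4(0.08) = 0.32$.
Therefore $\sigma_{X_1 + X_2 + X_3 + X_4} \ne \sigma_{4X}$, so T = 4X does not work for the standard deviation.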
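A quick simulation makes the difference between adding four independent packets and quadrupling one packet concrete. This is a rough sketch with an arbitrary seed and sample size, so the simulated standard deviations will only be close to the theoretical 0.16 and 0.32.

```python
import random
from statistics import pstdev

rng = random.Random(0)               # fixed seed so the run is repeatable
mu, sigma, reps = 2.17, 0.08, 50_000

# T = X1 + X2 + X3 + X4: four independent packets per cup
sum_of_four = [sum(rng.gauss(mu, sigma) for _ in range(4)) for _ in range(reps)]

# 4X: one packet's amount multiplied by 4
four_times_one = [4 * rng.gauss(mu, sigma) for _ in range(reps)]

sd_sum = pstdev(sum_of_four)
sd_4x = pstdev(four_times_one)

print(round(sd_sum, 2), round(sd_4x, 2))
```

Independent packets partly cancel each other's deviations, which is why the sum varies less than four copies of the same packet.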

Summary: Transforming and Combining Random Variables
Adding a constant a (which could be negative) to a random variable increases (or decreases) the mean of the random variable by a but does not affect its standard deviation or the shape of its probability distribution.
Multiplying a random variable by a constant b (which could be negative) multiplies the mean of the random variable by b and the standard deviation by |b| but does not change the shape of its probability distribution.
A linear transformation of a random variable involves adding a constant a, multiplying by a constant b, or both. If we write the linear transformation of X in the form Y = a + bX, the following are true about Y:
Shape: same as the probability distribution of X.
Center: $\mu_Y = a + b\mu_X$
Spread: $\sigma_Y = |b|\,\sigma_X$

Summary: Section 6.2 Transforming and Combining Random Variables
If X and Y are any two random variables,
$\mu_{X \pm Y} = \mu_X \pm \mu_Y$
If X and Y are independent random variables,
$\sigma_{X \pm Y}^2 = \sigma_X^2 + \sigma_Y^2$
The sum or difference of independent Normal random variables follows a Normal distribution.

Binomial and Geometric Random Variables
Binomial Settings
When the same chance process is repeated several times, we are often interested in whether a particular outcome does or doesn't happen on each repetition. In some cases, the number of repeated trials is fixed in advance and we are interested in the number of times a particular event (called a "success") occurs. If the trials in these cases are independent and each success has an equal chance of occurring, we have a binomial setting.
Definition: A binomial setting arises when we perform several independent trials of the same chance process and record the number of times that a particular outcome occurs. The four conditions for a binomial setting are (BINS):
Binary? The possible outcomes of each trial can be classified as success or failure.
Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
Number? The number of trials n of the chance process must be fixed in advance.
Success? On each trial, the probability p of success must be the same.

Binomial Random Variable
Consider tossing a coin n times. Each toss gives either heads or tails. Knowing the outcome of one toss does not change the probability of an outcome on any other toss. If we define heads as a success, then p is the probability of a head and is 0.5 on any toss. The number of heads in n tosses is a binomial random variable X. The probability distribution of X is called a binomial distribution.
Definition: The count X of successes in a binomial setting is a binomial random variable. The probability distribution of X is a binomial distribution with parameters n and p, where n is the number of trials of the chance process and p is the probability of a success on any one trial. The possible values of X are the whole numbers from 0 to n.
Note: When checking the binomial condition, be sure to check the BINS and make sure you're being asked to count the number of successes in a certain number of trials!

Example: Binomial Probabilities
In a binomial setting, we can define a random variable (say, X) as the number of successes in n independent trials. We are interested in finding the probability distribution of X.
Each child of a particular pair of parents has probability 0.25 of having type O blood. Genetics says that children receive genes from each of their parents independently. If these parents have 5 children, the count X of children with type O blood is a binomial random variable with n = 5 trials and probability p = 0.25 of a success on each trial. In this setting, a child with type O blood is a success (S) and a child with another blood type is a failure (F). What's P(X = 2)?
$P(SSFFF) = (0.25)(0.25)(0.75)(0.75)(0.75) = (0.25)^2 (0.75)^3 = 0.02637$
However, there are a number of different arrangements in which 2 out of the 5 children have type O blood:
SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS
Verify that each arrangement has probability $(0.25)^2 (0.75)^3 = 0.02637$.
Therefore, $P(X = 2) = 10(0.25)^2 (0.75)^3 = 0.2637$.

Binomial Coefficient
Note that, in the previous example, any one arrangement of 2 S's and 3 F's had the same probability. This is true because no matter what arrangement, we'd multiply together 0.25 twice and 0.75 three times. We can generalize this for any setting in which we are interested in k successes in n trials. That is,
$P(X = k) = P(\text{exactly } k \text{ successes in } n \text{ trials}) = (\text{number of arrangements}) \cdot p^k (1 - p)^{n - k}$
Definition: The number of ways of arranging k successes among n observations is given by the binomial coefficient
$\dbinom{n}{k} = \dfrac{n!}{k!\,(n - k)!}$
for k = 0, 1, 2, ..., n, where $n! = n(n - 1)(n - 2) \cdots (3)(2)(1)$ and $0! = 1$.

Binomial Probability
The binomial coefficient counts the number of different ways in which k successes can be arranged among n trials. The binomial probability P(X = k) is this count multiplied by the probability of any one specific arrangement of the k successes.
Binomial Probability: If X has the binomial distribution with n trials and probability p of success on each trial, the possible values of X are 0, 1, 2, ..., n. If k is any one of these values,
$P(X = k) = \dbinom{n}{k} p^k (1 - p)^{n - k}$
Here $\binom{n}{k}$ is the number of arrangements of k successes, $p^k$ is the probability of k successes, and $(1 - p)^{n - k}$ is the probability of n - k failures.
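The binomial probability formula can be sketched in a few lines of Python using `math.comb` for the binomial coefficient. The function name `binomial_pmf` is an illustrative choice; the check uses the blood-type setting with n = 5 and p = 0.25.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Type O blood example: 5 children, each with probability 0.25
arrangements = comb(5, 2)             # number of ways to place 2 successes among 5 trials
p_two = binomial_pmf(2, 5, 0.25)

# The probabilities over k = 0..n should add to 1
total = sum(binomial_pmf(k, 5, 0.25) for k in range(6))

print(arrangements, round(p_two, 4), round(total, 4))
```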

Mean and Standard Deviation of a Binomial Random Variable
If a count X has the binomial distribution with number of trials n and probability of success p, the mean and standard deviation of X are
$\mu_X = np \qquad \sigma_X = \sqrt{np(1 - p)}$
As a rule of thumb, we will use the Normal approximation when n is so large that $np \ge 10$ and $n(1 - p) \ge 10$. That is, the expected numbers of successes and failures are both at least 10. (Normality condition)
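A small Python sketch of these two formulas and the rule of thumb follows. The helper names are hypothetical, and the sample inputs (n = 5, p = 0.25 and n = 1500, p = 0.35) are borrowed from examples elsewhere in these notes.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial count."""
    return n * p, sqrt(n * p * (1 - p))

def normal_condition_met(n, p):
    """Rule of thumb: expected successes and failures are both at least 10."""
    return n * p >= 10 and n * (1 - p) >= 10

mean_small, sd_small = binomial_mean_sd(5, 0.25)
ok_small = normal_condition_met(5, 0.25)
ok_large = normal_condition_met(1500, 0.35)

print(mean_small, round(sd_small, 3), ok_small, ok_large)
```

With only 5 trials the Normal approximation is not appropriate, while a sample of 1500 passes the check easily.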

Geometric Random Variable
Definition: The number of trials Y that it takes to get a success in a geometric setting is a geometric random variable. The probability distribution of Y is a geometric distribution with parameter p, the probability of a success on any trial. The possible values of Y are 1, 2, 3, ....
Note: As with binomial random variables, it is important to be able to distinguish situations in which the geometric distribution does and doesn't apply!

Geometric Settings
In a binomial setting, the number of trials n is fixed and the binomial random variable X counts the number of successes. In other situations, the goal is to repeat a chance behavior until a success occurs. These situations are called geometric settings.
Definition: A geometric setting arises when we perform independent trials of the same chance process and record the number of trials until a particular outcome occurs. The four conditions for a geometric setting are (BITS):
Binary? The possible outcomes of each trial can be classified as success or failure.
Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
Trials? The goal is to count the number of trials until the first success occurs.
Success? On each trial, the probability p of success must be the same.

Example: The Birthday Game
Read the activity on page 398. The random variable of interest in this game is Y = the number of guesses it takes to correctly identify the birth day of one of your teacher's friends. What is the probability the first student guesses correctly? The second? Third? What is the probability the kth student guesses correctly?
Verify that Y is a geometric random variable.
B: Success = correct guess, Failure = incorrect guess.
I: The result of one student's guess has no effect on the result of any other guess.
T: We're counting the number of guesses up to and including the first correct guess.
S: On each trial, the probability of a correct guess is 1/7.
Calculate P(Y = 1), P(Y = 2), P(Y = 3), and P(Y = k).
$P(Y = 1) = 1/7 = 0.1429$
$P(Y = 2) = (6/7)(1/7) = 0.1224$
$P(Y = 3) = (6/7)(6/7)(1/7) = 0.1050$
Notice the pattern? Think of a tree diagram!
Geometric Probability: If Y has the geometric distribution with probability p of success on each trial, the possible values of Y are 1, 2, 3, .... If k is any one of these values,
$P(Y = k) = (1 - p)^{k - 1} p$

Mean of a Geometric Distribution
The table below shows part of the probability distribution of Y. We can't show the entire distribution because the number of trials it takes to get the first success could be an incredibly large number.
$y_i$: 1      2      3      4      5      6      ...
$p_i$: 0.143  0.122  0.105  0.090  0.077  0.066  ...
Shape: The heavily right-skewed shape is characteristic of any geometric distribution. That's because the most likely value is 1.
Center: The mean of Y is $\mu_Y = 7$. We'd expect it to take 7 guesses to get our first success.
Spread: The standard deviation of Y is $\sigma_Y = 6.48$. If the class played the Birth Day game many times, the number of homework problems the students receive would differ from 7 by an average of 6.48.
Mean (Expected Value) of a Geometric Random Variable: If Y is a geometric random variable with probability p of success on each trial, then its mean (expected value) is $E(Y) = \mu_Y = 1/p$.
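The table and the summary numbers for the Birth Day game can be reproduced with a short Python sketch. The function name is illustrative. The standard deviation uses the standard result $\sigma_Y = \sqrt{1 - p}/p$, which is not stated on the slide but agrees with the 6.48 quoted there.

```python
from math import sqrt

def geometric_pmf(k, p):
    """P(Y = k): k - 1 failures followed by the first success."""
    return (1 - p) ** (k - 1) * p

p = 1 / 7   # chance of guessing the correct day of the week

# First six rows of the probability distribution table
table = [round(geometric_pmf(k, p), 3) for k in range(1, 7)]

mean_y = 1 / p               # expected number of guesses
sd_y = sqrt(1 - p) / p       # standard formula, an addition to the slide

print(table)
print(round(mean_y, 2), round(sd_y, 2))
```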

Binomial and Geometric Random Variables Summary
A binomial setting consists of n independent trials of the same chance process, each resulting in a success or a failure, with probability of success p on each trial. The count X of successes is a binomial random variable. Its probability distribution is a binomial distribution. (BINS)
The binomial coefficient counts the number of ways k successes can be arranged among n trials. Think "n choose k".
If X has the binomial distribution with parameters n and p, the possible values of X are the whole numbers 0, 1, 2, ..., n. The binomial probability of observing k successes in n trials is (USE CALC)
$P(X = k) = \dbinom{n}{k} p^k (1 - p)^{n - k}$ = binomialpdf(n, p, k)

Summary: Section 6.3 Binomial and Geometric Random Variables
In this section, we learned that:
The mean and standard deviation of a binomial random variable X are
$\mu_X = np \qquad \sigma_X = \sqrt{np(1 - p)}$
The Normal approximation to the binomial distribution says that if X is a count having the binomial distribution with parameters n and p, then when n is large, X is approximately Normally distributed. We will use this approximation when $np \ge 10$ and $n(1 - p) \ge 10$.

Summary: Section 6.3 Binomial and Geometric Random Variables
In this section, we learned that:
A geometric setting consists of repeated trials of the same chance process in which each trial results in a success or a failure; trials are independent; each trial has the same probability p of success; and the goal is to count the number of trials until the first success occurs. If Y = the number of trials required to obtain the first success, then Y is a geometric random variable. Its probability distribution is called a geometric distribution.
If Y has the geometric distribution with probability of success p, the possible values of Y are the positive integers 1, 2, 3, .... The geometric probability that Y takes any value k is (USE CALC)
$P(Y = k) = (1 - p)^{k - 1} p$
The mean (expected value) of a geometric random variable Y is 1/p.

SAMPLING DISTRIBUTIONS

Population and Sample: Collect data from a representative sample, then make an inference about the population.

SAMPLING DISTRIBUTIONS
Different random samples yield different statistics. Thus a statistic is a random variable. The sampling distribution shows all possible statistic values.

Sampling variability: the value of a statistic varies in repeated random sampling. To make sense of sampling variability, we ask, "What would happen if we took many samples?"
[Diagram: one population with many samples drawn from it]

Population Distributions vs. Sampling Distributions There are actually three distinct distributions involved when we sample repeatedly and measure a variable of interest. 1) The population distribution gives the values of the variable for all the individuals in the population. 2) The distribution of sample data shows the values of the variable for all the individuals in the sample. 3) The sampling distribution shows the statistic values from all the possible samples of the same size from the population.

Sample Proportions
The Sampling Distribution of $\hat{p}$
We can summarize the facts about the sampling distribution of $\hat{p}$ as follows:
Sampling Distribution of a Sample Proportion
Choose an SRS of size n from a population of size N with proportion p of successes. Let $\hat{p}$ be the sample proportion of successes. Then:
The mean of the sampling distribution of $\hat{p}$ is $\mu_{\hat{p}} = p$.
The standard deviation of the sampling distribution of $\hat{p}$ is $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$, as long as the 10% condition is satisfied: $n \le (1/10)N$.
As n increases, the sampling distribution becomes approximately Normal. Before you perform Normal calculations, check that the Normal condition is satisfied: $np \ge 10$ and $n(1 - p) \ge 10$.

The Sampling Distribution of $\hat{p}$
In Chapter 6, we learned that the mean and standard deviation of a binomial random variable X are
$\mu_X = np \qquad \sigma_X = \sqrt{np(1 - p)}$
Since $\hat{p} = X/n = (1/n)X$, we are just multiplying the random variable X by a constant (1/n) to get the random variable $\hat{p}$. Therefore,
$\mu_{\hat{p}} = \dfrac{1}{n}(np) = p$, so $\hat{p}$ is an unbiased estimator of p.
$\sigma_{\hat{p}} = \dfrac{1}{n}\sqrt{np(1 - p)} = \sqrt{\dfrac{np(1 - p)}{n^2}} = \sqrt{\dfrac{p(1 - p)}{n}}$
As sample size increases, the spread decreases.

Using the Normal Approximation for $\hat{p}$
Inference about a population proportion p is based on the sampling distribution of $\hat{p}$. When the sample size is large enough for np and n(1 - p) to both be at least 10 (the Normal condition), the sampling distribution of $\hat{p}$ is approximately Normal.
A polling organization asks an SRS of 1500 first-year college students how far away their home is. Suppose that 35% of all first-year students actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?
STATE: We want to find the probability that the sample proportion falls between 0.33 and 0.37 (within 2 percentage points, or 0.02, of 0.35).
PLAN: We have an SRS of size n = 1500 drawn from a population in which the proportion p = 0.35 attend college within 50 miles of home.
$\mu_{\hat{p}} = 0.35 \qquad \sigma_{\hat{p}} = \sqrt{\dfrac{(0.35)(0.65)}{1500}} = 0.0123$
DO: Since np = 1500(0.35) = 525 and n(1 - p) = 1500(0.65) = 975 are both greater than 10, we'll standardize and then use Table A to find the desired probability.
$z = \dfrac{0.33 - 0.35}{0.0123} = -1.63 \qquad z = \dfrac{0.37 - 0.35}{0.0123} = 1.63$
$P(0.33 \le \hat{p} \le 0.37) = P(-1.63 \le Z \le 1.63) = 0.9484 - 0.0516 = 0.8968$
CONCLUDE: About 90% of all SRSs of size 1500 will give a result within 2 percentage points of the truth about the population.
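The STATE/PLAN/DO steps of the polling example can be mirrored in Python with `statistics.NormalDist`. This sketch uses illustrative variable names. Because it skips the rounding of the standard deviation and z-scores that the Table A method requires, it lands slightly below 0.8968, but still at about 90%.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1500, 0.35

# Normal condition: expected successes and failures both at least 10
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# Mean and standard deviation of the sampling distribution of p-hat
mu_phat = p
sigma_phat = sqrt(p * (1 - p) / n)

phat = NormalDist(mu=mu_phat, sigma=sigma_phat)
p_within_two_points = phat.cdf(0.37) - phat.cdf(0.33)

print(normal_ok, round(sigma_phat, 4), round(p_within_two_points, 2))
```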

The Sampling Distribution of $\bar{x}$
When we choose many SRSs from a population, the sampling distribution of the sample mean is centered at the population mean $\mu$ and is less spread out than the population distribution. Here are the facts.
Mean and Standard Deviation of the Sampling Distribution of Sample Means: Suppose that $\bar{x}$ is the mean of an SRS of size $n$ drawn from a large population with mean $\mu$ and standard deviation $\sigma$. Then:
The mean of the sampling distribution of $\bar{x}$ is $\mu_{\bar{x}} = \mu$.
The standard deviation of the sampling distribution of $\bar{x}$ is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, as long as the 10% condition is satisfied: $n \le (1/10)N$.
Note: These facts about the mean and standard deviation of $\bar{x}$ are true no matter what shape the population distribution has.

Sampling from a Normal Population
We have described the mean and standard deviation of the sampling distribution of the sample mean $\bar{x}$ but not its shape. That's because the shape of the distribution of $\bar{x}$ depends on the shape of the population distribution. In one important case, there is a simple relationship between the two distributions: if the population distribution is Normal, then so is the sampling distribution of $\bar{x}$. This is true no matter what the sample size is.
Sampling Distribution of a Sample Mean from a Normal Population: Suppose that a population is Normally distributed with mean $\mu$ and standard deviation $\sigma$. Then the sampling distribution of $\bar{x}$ has the Normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, provided that the 10% condition is met.

Example: Young Women's Heights
The height of young women follows a Normal distribution with mean $\mu = 64.5$ inches and standard deviation $\sigma = 2.5$ inches.
Find the probability that a randomly selected young woman is taller than 66.5 inches.
Let $X$ = the height of a randomly selected young woman. $X$ is $N(64.5, 2.5)$.
$z = \frac{66.5 - 64.5}{2.5} = 0.80$
$P(X > 66.5) = P(Z > 0.80) = 1 - 0.7881 = 0.2119$
The probability of choosing a young woman at random whose height exceeds 66.5 inches is about 0.21.
Find the probability that the mean height of an SRS of 10 young women exceeds 66.5 inches.
For an SRS of 10 young women, the sampling distribution of their sample mean height will have mean $\mu_{\bar{x}} = \mu = 64.5$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = 0.79$.
Since the population distribution is Normal, the sampling distribution will follow an $N(64.5, 0.79)$ distribution.
$z = \frac{66.5 - 64.5}{0.79} = 2.53$
$P(\bar{x} > 66.5) = P(Z > 2.53) = 1 - 0.9943 = 0.0057$
It is very unlikely (less than a 1% chance) that we would choose an SRS of 10 young women whose average height exceeds 66.5 inches.
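The contrast between one woman and the mean of 10 women can be reproduced in a few lines. This is a sketch that computes Normal probabilities with `math.erf` instead of Table A; the helper `normal_cdf` and the variable names are assumptions introduced for illustration.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 64.5, 2.5, 10   # values from the example
cutoff = 66.5

# One randomly selected woman: X is N(mu, sigma).
p_individual = 1 - normal_cdf((cutoff - mu) / sigma)

# Mean of an SRS of n women: x-bar is N(mu, sigma / sqrt(n)).
sd_mean = sigma / math.sqrt(n)
p_mean = 1 - normal_cdf((cutoff - mu) / sd_mean)

print("P(X > 66.5)     is about", round(p_individual, 2))
print("sd of x-bar     is about", round(sd_mean, 2))
print("P(x-bar > 66.5) is about", round(p_mean, 3))
```

The only difference between the two probabilities is the standard deviation used to standardize: averaging 10 observations shrinks it by a factor of $\sqrt{10}$, which is why the second probability is so much smaller.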

The Central Limit Theorem
Draw an SRS of size $n$ from any population with mean $\mu$ and finite standard deviation $\sigma$. The central limit theorem (CLT) says that when $n$ is large, the sampling distribution of the sample mean $\bar{x}$ is approximately Normal.
Note: How large a sample size $n$ is needed for the sampling distribution to be close to Normal depends on the shape of the population distribution. More observations are required if the population distribution is far from Normal.

The Central Limit Theorem
Consider the strange population distribution from the Rice University sampling distribution applet. Describe the shape of the sampling distributions as $n$ increases. What do you notice?
Normal Condition for Sample Means:
If the population distribution is Normal, then so is the sampling distribution of $\bar{x}$. This is true no matter what the sample size $n$ is.
If the population distribution is not Normal, the central limit theorem tells us that the sampling distribution of $\bar{x}$ will be approximately Normal in most cases if $n \ge 30$.
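For readers without access to the applet, a small simulation shows the same effect. The sketch below is not the Rice applet: it assumes a strongly right-skewed exponential population (mean 1) as a stand-in for a "strange" population, and it measures how far from symmetric the sampling distribution of $\bar{x}$ is by its skewness, which is 0 for a Normal distribution. The function names, seed, and sample sizes are illustrative choices.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, num_samples=5000):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

means_by_n = {n: sample_means(n) for n in (2, 10, 30)}
skew_by_n = {n: skewness(means) for n, means in means_by_n.items()}

for n, skew in skew_by_n.items():
    print("n =", n, "skewness of sampling distribution:", round(skew, 2))
```

The skewness should shrink toward 0 as $n$ grows, which is the CLT at work; the center of the sampling distribution stays near the population mean throughout.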

SAMPLING DISTRIBUTIONS
Different random samples yield different statistics. Thus a statistic is a random variable. The sampling distribution shows all possible statistic values.

Review
http://tinyurl.com/drd-inference
http://stattrek.com/ap-statistics/test-preparation.aspx
http://tinyurl.com/drd-mc