Unit 2: Probability and distributions. 3. Normal and binomial distributions


Unit 2: Probability and distributions
3. Normal and binomial distributions
Sta 101 - Summer 2017
Duke University, Department of Statistical Science
Prof. Burris
Slides posted at http://www2.stat.duke.edu/courses/summer17/sta101.001-1/

Announcements

- PS: Explain your reasoning and show your work; not doing so will result in point deductions next time.
- Formatting of problem set submissions:
  - Bad: scanned PDF with handwriting that is unreadable or too dark, haphazard orientation of pages, pages out of order. Will result in point deductions if not legible.
  - Better: scanned PDF with neat handwriting, good scan quality, one file containing all pages in order. Will result in point deductions if not legible.
  - Best: typed using a word processor and converted to PDF.
- PS 2 and PA 2 due Friday
- RA 3 on Tuesday

1. Two types of probability distributions: discrete and continuous

A discrete probability distribution lists all possible events and the probabilities with which they occur.
- The events listed must be disjoint.
- Each probability must be between 0 and 1.
- The probabilities must total 1.
- Example: Binomial distribution

A continuous probability distribution differs from a discrete probability distribution in several ways:
- The probability that a continuous random variable will equal any specific value is zero.
- As such, a continuous distribution cannot be expressed in tabular form. Instead, we use an equation or a formula to describe it via a probability density function (pdf).
- We can calculate the probability for ranges of values the random variable takes (area under the curve).
- Example: Normal distribution

Speeds of cars on a highway are normally distributed with mean 65 miles / hour. The minimum speed recorded is 48 miles / hour and the maximum speed recorded is 83 miles / hour. Which of the following is most likely to be the standard deviation of the distribution?
(a) -5
(b) 5
(c) 10
(d)
(e) 30
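One way to reason about the car-speed question is the 68-95-99.7 rule named in the summary of main ideas: in a normal distribution nearly all observations fall within 3 standard deviations of the mean, so the recorded minimum and maximum should sit roughly 3 SDs from 65. The sketch below is only an illustration of that reasoning. It checks the three positive candidate values that survive in the transcript; the variable names are made up for this example.

```r
# Values taken from the car-speed question
mu      <- 65
obs_min <- 48
obs_max <- 83

# Candidate SDs from the answer choices; -5 is left out because an SD
# cannot be negative, and choice (d) is missing from the transcript
candidates <- c(5, 10, 30)

# Z scores of the recorded extremes under each candidate SD
z_min <- (obs_min - mu) / candidates
z_max <- (obs_max - mu) / candidates
data.frame(sd = candidates, z_min = round(z_min, 2), z_max = round(z_max, 2))

# Rough SD implied by "range is about 6 SDs wide"
implied_sd <- (obs_max - obs_min) / 6

# Candidate whose extremes land closest to 3 SDs from the mean
best <- candidates[which.min(abs(z_max - 3))]
```

With an SD of 5 the extremes land about 3.4 to 3.6 SDs from the mean, while 10 and 30 would put them well inside 2 SDs, which is hard to reconcile with them being the most extreme speeds recorded.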

3. Z scores serve as a ruler for any distribution

A Z score creates a common scale so you can assess data without worrying about the specific units in which it was measured.

How can we determine if it would be unusual for an adult woman in North Carolina to be 96" (8 ft) tall?

How can we determine if it would be unusual for an adult alien woman(?) to be 103 metreloots tall, assuming the distribution of heights of adult alien women is approximately normal?

$Z = \frac{\text{obs} - \text{mean}}{\text{SD}}$

- Z score: number of standard deviations the observation falls above or below the mean.
- Z distribution (also called the standardized normal distribution) is a special case of the normal distribution where $\mu = 0$ and $\sigma = 1$: $Z \sim N(\mu = 0, \sigma = 1)$
- Z scores are defined for distributions of any shape, but only when the distribution is normal can we use Z scores to calculate percentiles.
- Observations with $|Z| > 2$ are usually considered unusual.

Using z-scores to calculate probabilities

Scores on a standardized test are normally distributed with a mean of 100 and a standard deviation of 20. If these scores are converted to standard normal Z scores, which of the following statements will be correct?
(a) The mean will equal 0, but the median cannot be determined.
(b) The mean of the standardized Z-scores will equal 100.
(c) The mean of the standardized Z-scores will equal 5.
(d) Both the mean and median score will equal 0.
(e) A score of 70 is considered unusually low on this test.

Suppose SAT scores are normally distributed with mean 1000 and standard deviation 200. What is the probability that a randomly chosen student will have an SAT score greater than 1300?

$z = \frac{1300 - 1000}{200} = 1.5$
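The SAT calculation stops at z = 1.5; the remaining step is to find the area under the standard normal curve above that value. A minimal sketch in R, assuming base R's `pnorm()` is used for the tail area (the helper `z_score` and the variable names are invented for this illustration):

```r
# Standardize an observation: number of SDs above or below the mean
z_score <- function(obs, mu, sigma) (obs - mu) / sigma

# SAT example: mean 1000, SD 200, observed score 1300
z_sat <- z_score(1300, mu = 1000, sigma = 200)   # 1.5

# Upper-tail area under the standard normal curve, P(Z > 1.5)
p_above <- pnorm(z_sat, lower.tail = FALSE)
round(p_above, 4)                                # 0.0668

# Standardized test example: mean 100, SD 20, score of 70
z_70 <- z_score(70, mu = 100, sigma = 20)        # -1.5
abs(z_70) > 2                                    # FALSE
```

So roughly 6.7% of students would score above 1300. A test score of 70 sits 1.5 SDs below the mean, which does not meet the |Z| > 2 guideline for an unusual observation.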

High-speed broadband connection at home in the US

- Each person in the poll can be thought of as a trial.
- A person is labeled a success if s/he has high-speed broadband connection at home, failure if not.
- Since 70% have high-speed broadband connection at home, the probability of success is p = 0.70.

Considering many scenarios

Suppose we randomly select three individuals from the US. What is the probability that exactly 1 has high-speed broadband connection at home?

Let's call these people Anthony (A), Barry (B), and Cam (C). Each of the three scenarios below satisfies the condition that exactly 1 of them says "Yes":

Scenario 1: 0.70 (A: yes) × 0.30 (B: no) × 0.30 (C: no) = 0.063
Scenario 2: 0.30 (A: no) × 0.70 (B: yes) × 0.30 (C: no) = 0.063
Scenario 3: 0.30 (A: no) × 0.30 (B: no) × 0.70 (C: yes) = 0.063

The probability of exactly 1 of 3 people saying "Yes" is the sum of all of these probabilities:

0.063 + 0.063 + 0.063 = 3 × 0.063 = 0.189

Binomial distribution

The question from the prior slide asked for the probability of a given number of successes, k, in a given number of trials, n (k = 1 success in n = 3 trials), and we calculated this probability as

# of scenarios × P(single scenario)

- P(single scenario) = $p^k (1 - p)^{(n - k)}$: probability of success to the power of number of successes, times probability of failure to the power of number of failures.
- Number of scenarios: $\binom{n}{k} = \frac{n!}{k!(n - k)!}$

Binomial distribution (cont.)

The Binomial distribution describes the probability of having exactly k successes in n independent trials with probability of success p.

$P(k \text{ successes in } n \text{ trials}) = \binom{n}{k} p^k (1 - p)^{(n - k)}$

Note: You can also use R to calculate the number of scenarios:
> choose(5,3)
[1] 10

Note: And to compute probabilities:
> dbinom(1, size = 3, prob = 0.7)
[1] 0.189
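To connect the scenario-by-scenario arithmetic with the binomial formula, the following sketch lists every yes/no outcome for the three people, adds up the outcomes with exactly one "yes", and compares the total with the formula and with `dbinom()`. It is one possible illustration only; the object names are chosen for this example.

```r
p <- 0.70  # probability of success from the poll

# All 2^3 outcomes for Anthony (A), Barry (B), Cam (C): 1 = yes, 0 = no
outcomes <- expand.grid(A = 0:1, B = 0:1, C = 0:1)

# Probability of each outcome: multiply p for every yes and (1 - p) for every no
prob <- apply(outcomes, 1, function(x) prod(p^x * (1 - p)^(1 - x)))

# Keep the scenarios with exactly one success
exactly_one <- rowSums(outcomes) == 1
n_scenarios <- sum(exactly_one)           # 3 scenarios
p_by_hand   <- sum(prob[exactly_one])     # 3 * 0.063 = 0.189

# Same quantity from the binomial formula and from dbinom()
p_formula <- choose(3, 1) * p^1 * (1 - p)^(3 - 1)
p_dbinom  <- dbinom(1, size = 3, prob = p)
```

All three routes give 0.189, and `choose(3, 1)` returns the same count of scenarios as the enumeration.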

Which of the following is not a condition that needs to be met for the binomial distribution to be applicable?
(a) the trials must be independent
(b) the number of trials, n, must be fixed
(c) each trial outcome must be classified as a success or a failure
(d) the number of desired successes, k, must be greater than the number of trials
(e) the probability of success, p, must be the same for each trial

According to the results of the Pew poll suggesting that 70% of Americans have high-speed broadband connection at home, is the probability of exactly 2 out of 15 randomly sampled Americans having such connection at home pretty high or pretty low?
(a) pretty high
(b) pretty low

According to the results of the Pew poll suggesting that 70% of Americans have high-speed broadband connection at home, what is the probability that exactly 2 out of 15 randomly sampled Americans have such connection at home?
(a) $0.70^2 \times 0.30^{13}$
(b) $\binom{2}{15} \times 0.70^2 \times 0.30^{13}$
(c) $\binom{15}{2} \times 0.70^2 \times 0.30^{13}$
(d) $\binom{15}{2} \times 0.70^{13} \times 0.30^2$

Expected value and standard deviation of binomial

According to the results of the Pew poll suggesting that 70% of Americans have high-speed broadband connection at home, among a random sample of 100 Americans, how many would you expect to have such connection at home?

100 × 0.70 = 70

Or more formally, $\mu = np = 100 \times 0.7 = 70$

But this doesn't mean that in every random sample of 100 Americans exactly 70 will have high-speed broadband connection at home. In some samples there will be fewer, and in others more. How much would we expect this value to vary?

$\sigma = \sqrt{np(1 - p)} = \sqrt{100 \times 0.70 \times 0.30} \approx 4.58$

Note: The mean and standard deviation of a binomial might not always be whole numbers, and that is alright; these values represent what we would expect to see on average.
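The expected value and standard deviation above follow directly from n and p, and the "exactly 2" question can be checked with `dbinom()`. The sketch below is illustrative only. It assumes the sample size in that question is 15, which is inferred from the exponents 2 and 13 in the answer choices rather than stated outright in the transcript.

```r
p <- 0.70

# Expected value and standard deviation for a sample of n = 100
n     <- 100
mu    <- n * p                    # 70
sigma <- sqrt(n * p * (1 - p))    # about 4.58

# Probability of exactly 2 successes, assuming 15 trials
# (15 is inferred from 2 successes + 13 failures in the answer choices)
p_2_of_15 <- dbinom(2, size = 15, prob = p)

# The same value written out with the binomial formula
p_formula <- choose(15, 2) * p^2 * (1 - p)^13
```

Under that assumption the probability is on the order of 0.000008, so "pretty low" is the sensible answer to the earlier question: with p = 0.70, only 2 successes in 15 trials is far below the expected 10.5.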

https://gallery.shinyapps.io/dist_calc/

Shape of the binomial distribution

You can use the normal distribution to approximate binomial probabilities when the sample size is large enough.

S-F rule: The sample size is considered large enough if the expected number of successes and failures are both at least 10:

$np \geq 10$ and $n(1 - p) \geq 10$

What is the probability that among a random sample of 1,000 Americans at least three-fourths have high-speed broadband connection at home?

Binom(n = 1000, p = 0.7)

$P(K \geq 750) = P(K = 750) + P(K = 751) + P(K = 752) + \cdots + P(K = 1000)$

1. Using R:
> sum(dbinom(750:1000, size = 1000, prob = 0.7))
[1] 0.00026

2. Using the normal approximation to the binomial: Since we have at least 10 expected successes (1000 × 0.7 = 700) and 10 expected failures (1000 × 0.3 = 300),

$\text{Binom}(n = 1000, p = 0.7) \approx N(\mu = 1000 \times 0.7, \sigma = \sqrt{1000 \times 0.7 \times 0.3})$

Summary of main ideas

1. Two types of probability distributions: discrete and continuous
2. Normal distribution is unimodal, symmetric, and follows the 68-95-99.7 rule
3. Z scores serve as a ruler for any distribution
4. Binomial distribution is used for calculating the probability of exact number of successes for a given number of trials
5. Expected value and standard deviation of the binomial can be calculated using its parameters n and p
6. Shape of the binomial distribution approaches normal when the S-F rule is met
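The normal approximation in the broadband example (n = 1000, p = 0.7, at least 750 successes) is set up above but not carried through to a number. The sketch below shows one way to finish it and compare it with the exact binomial sum. It uses no continuity correction, which is a simplifying assumption of this illustration and not something the slides specify.

```r
n <- 1000
p <- 0.7

# S-F rule: expected successes and failures must both be at least 10
exp_success <- n * p          # 700
exp_failure <- n * (1 - p)    # 300
sf_rule_met <- exp_success >= 10 && exp_failure >= 10

# Parameters of the approximating normal distribution
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Normal approximation to P(K >= 750), without continuity correction
z        <- (750 - mu) / sigma
p_approx <- pnorm(z, lower.tail = FALSE)

# Exact binomial probability for comparison
p_exact <- sum(dbinom(750:1000, size = n, prob = p))
```

Both values are a few ten-thousandths, so the approximation agrees with the exact calculation on the conclusion: seeing at least 750 successes in such a sample would be very rare.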