Review. Binomial random variable


Review

Discrete RVs:

Probability function: $p(x) = \Pr(X = x)$

cdf: $F(x) = \Pr(X \le x)$

$E(X) = \sum_x x \, p(x)$

$SD(X) = \sqrt{E\{(X - E X)^2\}}$

Binomial(n, p): number of successes in n independent trials where Pr(success) = p in each trial.

If $X \sim \text{binomial}(n, p)$, then:

$\Pr(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$

$E(X) = n p$

$SD(X) = \sqrt{n p (1 - p)}$

Binomial random variable

Number of successes in n trials where the trials are independent and p = Pr(success) is constant.

The number of successes in n trials does not necessarily follow a binomial distribution.

Deviations from the binomial: varying p; clumping or repulsion (non-independence).
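
As a quick check on the binomial formulas, the following R sketch writes the probability function out by hand, compares it with R's built-in dbinom, and recovers E(X) and SD(X) from their definitions. The values n = 10 and p = 0.3 are arbitrary illustrative choices, not taken from the slides.

```r
# Illustrative values (assumed; not from the slides)
n <- 10
p <- 0.3
x <- 0:n

# Binomial probability function written out from the formula
pmf_manual <- choose(n, x) * p^x * (1 - p)^(n - x)

# R's built-in binomial probability function
pmf_builtin <- dbinom(x, size = n, prob = p)

# Mean and SD computed directly from the definitions
mean_x <- sum(x * pmf_manual)
sd_x <- sqrt(sum((x - mean_x)^2 * pmf_manual))

print(all.equal(pmf_manual, pmf_builtin))
print(c(mean = mean_x, formula = n * p))
print(c(sd = sd_x, formula = sqrt(n * p * (1 - p))))
```

The definitions and the closed-form expressions should agree up to floating-point rounding.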

Examples

Consider Mendel's pea experiments. (Purple or white flowers; purple dominant to white.)

Pick a random F2. Self it and acquire 10 progeny. The number of progeny with purple flowers is not binomial (unless we condition on the genotype of the F2 plant).

Pick 10 random F2s. Self each and take one child from each. The number of progeny with purple flowers is binomial, with $p = (1/4)(1) + (1/2)(3/4) + (1/4)(0) = 5/8$.

Suppose Pr(survive | male) = 10% but Pr(survive | female) = 80%.

Pick 4 male mice and 6 female mice. The number of survivors is not binomial.

Pick 10 random mice (with Pr(mouse is male) = 40%). The number of survivors is binomial.

[Figure: probability distributions of the number of survivors (0 to 10), in two panels: "4 males; 6 females" and "Random mice (40% males)".]
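
The two mouse scenarios can be compared numerically. The sketch below builds the distribution of survivors for exactly 4 males and 6 females as a sum of two independent binomials, and the distribution for 10 random mice as a single binomial with p = 0.4(0.1) + 0.6(0.8) = 0.52. The variable names are illustrative, and independence between mice is assumed.

```r
p_male <- 0.10    # Pr(survive | male), from the example
p_female <- 0.80  # Pr(survive | female), from the example

# Case 1: exactly 4 males and 6 females.
# Survivors = Binomial(4, 0.1) + Binomial(6, 0.8), assumed independent.
fixed_mix <- numeric(11)  # probabilities for 0..10 survivors
for (m in 0:4) {
  for (f in 0:6) {
    k <- m + f
    fixed_mix[k + 1] <- fixed_mix[k + 1] +
      dbinom(m, 4, p_male) * dbinom(f, 6, p_female)
  }
}

# Case 2: 10 random mice, each male with probability 0.4.
p_any <- 0.4 * p_male + 0.6 * p_female
random_mix <- dbinom(0:10, 10, p_any)

# Compare means and SDs of the two distributions
mean_fixed <- sum(0:10 * fixed_mix)
mean_random <- sum(0:10 * random_mix)
sd_fixed <- sqrt(sum((0:10 - mean_fixed)^2 * fixed_mix))
sd_random <- sqrt(sum((0:10 - mean_random)^2 * random_mix))

print(c(mean_fixed = mean_fixed, mean_random = mean_random))
print(c(sd_fixed = sd_fixed, sd_random = sd_random))
```

Both cases have the same mean, but the fixed 4/6 split has a smaller SD than a binomial with the same mean, which is one way to see that it is not binomial.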

Y = a + b X

Suppose X is a discrete random variable with probability function p, so that $p(x) = \Pr(X = x)$.

Expected value (mean): $E(X) = \sum_x x \, p(x)$

Standard deviation (SD): $SD(X) = \sqrt{\sum_x [x - E(X)]^2 \, p(x)}$

Let Y = a + b X where a and b are numbers. Then Y is a random variable (like X), and

$E(Y) = a + b \, E(X)$

$SD(Y) = |b| \, SD(X)$

In particular, if $\mu = E(X)$, $\sigma = SD(X)$, and $Z = (X - \mu)/\sigma$, then $E(Z) = 0$ and $SD(Z) = 1$.

Example

Suppose $X \sim \text{binomial}(n, p)$. (The number of successes in n independent trials where p = Pr(success).)

Then $E(X) = n p$ and $SD(X) = \sqrt{n p (1 - p)}$.

Let P = X / n = proportion of successes.

$E(P) = E(X / n) = E(X) / n = p$

$SD(P) = SD(X / n) = SD(X) / n = \sqrt{n p (1 - p)} / n = \sqrt{p (1 - p) / n}$

Toss a fair coin n times and count X = number of heads and P = X/n.

For n = 50: $E(X) = 25$, $SD(X) \approx 3.5$, $E(P) = 0.5$, $SD(P) \approx 0.07$, and $\Pr(X = 25) = \Pr(P = 0.5) \approx 0.11$.

For n = 5000: $E(X) = 2500$, $SD(X) \approx 35$, $E(P) = 0.5$, $SD(P) \approx 0.007$, and $\Pr(X = 2500) = \Pr(P = 0.5) \approx 0.011$.
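
The coin-tossing numbers can be reproduced in R. This is a minimal sketch; the helper name summarize_coin is hypothetical and introduced only for illustration.

```r
# Hypothetical helper: summaries for X ~ binomial(n, p) and P = X / n
summarize_coin <- function(n, p = 0.5) {
  c(
    E_X = n * p,
    SD_X = sqrt(n * p * (1 - p)),
    E_P = p,
    SD_P = sqrt(p * (1 - p) / n),
    Pr_at_mean = dbinom(n * p, size = n, prob = p)  # Pr(X = np)
  )
}

s50 <- summarize_coin(50)
s5000 <- summarize_coin(5000)

print(signif(s50, 3))
print(signif(s5000, 3))
```

Multiplying n by 100 multiplies SD(X) by 10 and divides SD(P) by 10, which matches the two cases on the slide.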

[Figure: probability functions of Binomial(n=50, p=0.5) and Binomial(n=5000, p=0.5) (probability against x), each shown next to the corresponding proportion X/n.]

Poisson distribution

Consider a binomial(n, p) where n is really large and p is really small.

For example, suppose each well in a microtiter plate contains 50,000 T cells, and that 1/100,000 cells respond to a particular antigen. Let X be the number of responding cells in a well.

In this case, X approximately follows a Poisson distribution.

Let $\lambda = n p = E(X)$. Then $p(x) = \Pr(X = x) = e^{-\lambda} \lambda^x / x!$

Note that $SD(X) = \sqrt{\lambda}$.
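
To see how close the Poisson is to the binomial in the microtiter example (n = 50,000, p = 1/100,000, so λ = 0.5), the following sketch compares the two probability functions for small counts. The range 0 to 5 is an arbitrary choice for display.

```r
n <- 50000          # T cells per well, from the example
p <- 1 / 100000     # Pr(a cell responds), from the example
lambda <- n * p     # 0.5

x <- 0:5            # counts to compare (arbitrary display range)
binom_probs <- dbinom(x, size = n, prob = p)
pois_probs <- dpois(x, lambda)

# Largest absolute difference between the two probability functions
max_diff <- max(abs(binom_probs - pois_probs))

print(round(cbind(x, binomial = binom_probs, poisson = pois_probs), 6))
print(max_diff)
```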

[Figure: Poisson probability functions for λ = 1/2, 1, 2, and 4, plotted for x = 0 to 12.]

Example

Suppose there are 100,000 T cells in each well of a microtiter plate. Suppose that 1/80,000 T cells respond to a particular antigen.

Let X = number of responding T cells in a well. Then $X \sim \text{Poisson}(\lambda = 1.25)$.

$E(X) = 1.25$; $SD(X) = \sqrt{1.25} \approx 1.12$.

$\Pr(X = 0) = \exp(-1.25) \approx 29\%$.

$\Pr(X > 0) = 1 - \exp(-1.25) \approx 71\%$.

$\Pr(X = 2) = \exp(-1.25) \, (1.25)^2 / 2 \approx 22\%$.
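
The same numbers can be obtained with R's Poisson functions dpois and ppois. This is a short sketch; the variable names are illustrative.

```r
n_cells <- 100000        # T cells per well, from the example
p_respond <- 1 / 80000   # Pr(a cell responds), from the example
lambda <- n_cells * p_respond   # 1.25

sd_x <- sqrt(lambda)                  # SD of a Poisson is sqrt(lambda)
pr_zero <- dpois(0, lambda)           # Pr(X = 0)
pr_positive <- 1 - ppois(0, lambda)   # Pr(X > 0) = 1 - Pr(X <= 0)
pr_two <- dpois(2, lambda)            # Pr(X = 2)

print(round(c(sd = sd_x, pr_zero = pr_zero,
              pr_positive = pr_positive, pr_two = pr_two), 3))
```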

In R

The following functions act just like rbinom, dbinom, etc., for the binomial distribution:

rpois(m, lambda)

dpois(x, lambda)

ppois(q, lambda)

qpois(p, lambda)

Continuous random variables

Suppose X is a continuous random variable. Instead of a probability function, X has a probability density function (pdf), sometimes called just the density of X.

$f(x) \ge 0$

$\int_{-\infty}^{\infty} f(x) \, dx = 1$

Areas under the curve = probabilities.

Cumulative distribution function (cdf): $F(x) = \Pr(X \le x) = \int_{-\infty}^{x} f(t) \, dt$, the area under the density to the left of x.
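
The idea that areas under a density are probabilities can be checked numerically with R's integrate. The density used here, f(x) = 2x on (0, 1), is a hypothetical example chosen only because its areas are easy to verify by hand; it does not appear in the slides.

```r
# Hypothetical density for illustration: f(x) = 2x on (0, 1), 0 elsewhere
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)

# The total area under a density should be 1
total_area <- integrate(f, lower = 0, upper = 1)$value

# cdf as the area to the left of x (the density is 0 below 0,
# so integrating from 0 is enough for x in [0, 1])
cdf <- function(x) integrate(f, lower = 0, upper = x)$value

# Pr(0.25 < X < 0.5) as a difference of cdf values
pr_interval <- cdf(0.5) - cdf(0.25)

print(total_area)
print(cdf(0.5))
print(pr_interval)
```

For this density F(x) = x^2 on (0, 1), so the printed values can be compared with 1, 0.25, and 0.1875.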

Means and SDs

Expected value (mean):

Discrete RV: $E(X) = \sum_x x \, p(x)$

Continuous RV: $E(X) = \int x \, f(x) \, dx$

Standard deviation (SD):

Discrete RV: $SD(X) = \sqrt{\sum_x [x - E(X)]^2 \, p(x)}$

Continuous RV: $SD(X) = \sqrt{\int [x - E(X)]^2 \, f(x) \, dx}$

Example: Uniform distribution

$X \sim \text{Uniform}(a, b)$, i.e., draw a number at random from the interval (a, b).

Density function: $f(x) = 1/(b - a)$ if $a < x < b$, and $f(x) = 0$ otherwise.

[Figure: the density is flat at height 1/(b − a) between a and b.]

$E(X) = (b + a)/2$

$SD(X) = (b - a)/\sqrt{12} \approx 0.29 \, (b - a)$

Cumulative distribution function (cdf):

[Figure: the cdf rises from 0 at a to height 1 at b.]
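
The uniform mean and SD can be recovered from the integral definitions. The sketch below uses R's dunif and integrate; the endpoints a = 2 and b = 5 are arbitrary illustrative values.

```r
# Illustrative endpoints (assumed; not from the slides)
a <- 2
b <- 5

# Uniform(a, b) density
f <- function(x) dunif(x, min = a, max = b)

# Mean and SD from the integral definitions
mean_x <- integrate(function(x) x * f(x), lower = a, upper = b)$value
var_x <- integrate(function(x) (x - mean_x)^2 * f(x), lower = a, upper = b)$value
sd_x <- sqrt(var_x)

print(c(mean = mean_x, formula = (a + b) / 2))
print(c(sd = sd_x, formula = (b - a) / sqrt(12)))

# The cdf is available directly as punif
print(punif(3, min = a, max = b))  # (3 - a) / (b - a)
```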

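The normal distribution

By far the most important distribution: the Normal distribution (also called the Gaussian distribution).

If $X \sim N(\mu, \sigma)$, then the pdf of X is

$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$

Also $E(X) = \mu$ and $SD(X) = \sigma$.

Of great importance: if $X \sim N(\mu, \sigma)$ and $Z = (X - \mu)/\sigma$, then $Z \sim N(0, 1)$. This is the standard normal distribution.

The normal distribution

[Figure: normal density with the x-axis marked at µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ.]

$\Pr(\mu - \sigma \le X \le \mu + \sigma) \approx 68\%$

$\Pr(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 95\%$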
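
As a check on the density formula and the 68% and 95% rules, the following sketch compares a hand-written normal pdf with R's dnorm and computes the two interval probabilities with pnorm. The values µ = 5 and σ = 2 are arbitrary; the interval probabilities do not depend on them.

```r
# Illustrative parameters (assumed; not from the slides)
mu <- 5
sigma <- 2
x <- seq(-1, 11, by = 0.5)

# Normal pdf written out from the formula
pdf_manual <- 1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)

# R's built-in normal density
pdf_builtin <- dnorm(x, mean = mu, sd = sigma)

# Probability of falling within 1 and 2 SDs of the mean
within_1sd <- pnorm(mu + sigma, mu, sigma) - pnorm(mu - sigma, mu, sigma)
within_2sd <- pnorm(mu + 2 * sigma, mu, sigma) - pnorm(mu - 2 * sigma, mu, sigma)

print(all.equal(pdf_manual, pdf_builtin))
print(round(c(within_1sd = within_1sd, within_2sd = within_2sd), 4))
```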

The normal CDF

[Figure: the normal density and the normal cdf, each plotted with the x-axis marked at µ − σ, µ, µ + σ.]

Calculations with the normal curve

In R: Convert to a statement involving the cdf. Use the function pnorm. (See also rnorm, dnorm, and qnorm.)

With a table: Convert to a statement involving the standard normal. Convert to a statement involving the tabulated areas. Look up the values in the table.

Draw a picture!
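
The "convert to the standard normal" step used with tables gives the same answer as calling pnorm with the mean and SD directly. A minimal sketch, with µ = 10, σ = 2, and x = 13 as arbitrary illustrative values:

```r
# Illustrative values (assumed; not from the slides)
mu <- 10
sigma <- 2
x <- 13

# Direct route: cdf of N(mu, sigma) at x
p_direct <- pnorm(x, mean = mu, sd = sigma)

# Table route: standardize first, then use the standard normal cdf
z <- (x - mu) / sigma
p_standard <- pnorm(z)

# qnorm inverts pnorm: it recovers x from the probability
x_back <- qnorm(p_direct, mean = mu, sd = sigma)

print(c(p_direct = p_direct, p_standard = p_standard, x_back = x_back))
```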

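Examples

Suppose the heights of adult males in the U.S. are approximately normally distributed, with mean = 69 in and SD = 3 in.

What proportion of men are taller than 5'7" (67 in)?

$X \sim N(\mu = 69, \sigma = 3)$, so $Z = (X - 69)/3 \sim N(0, 1)$.

$\Pr(X \ge 67) = \Pr(Z \ge (67 - 69)/3) = \Pr(Z \ge -2/3)$

[Figure: the area to the right of 67 under the N(69, 3) curve equals the area to the right of −2/3 under the standard normal curve, which by symmetry equals the area to the left of 2/3.]

In R (or with a table), use either pnorm(2/3) or 1 - pnorm(67, 69, 3) or pnorm(67, 69, 3, lower.tail=FALSE).

The answer: 75%.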
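
The three equivalent calls for the height example can be run side by side. Note that R is case-sensitive, so the logical value must be written FALSE. The variable names are illustrative.

```r
mu <- 69      # mean height in inches, from the example
sigma <- 3    # SD in inches, from the example
cutoff <- 67  # 5 ft 7 in

# Three equivalent ways to get Pr(X >= 67)
p_symmetry <- pnorm(2 / 3)                                    # area left of 2/3
p_complement <- 1 - pnorm(cutoff, mean = mu, sd = sigma)      # 1 - cdf
p_upper <- pnorm(cutoff, mean = mu, sd = sigma, lower.tail = FALSE)

print(round(c(p_symmetry, p_complement, p_upper), 4))
```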

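Another calculation

What proportion of men are between 5'3" (63 in) and 6' (72 in)?

$\Pr(63 \le X \le 72) = \Pr(-2 \le Z \le 1)$

[Figure: the area between 63 and 72 under the N(69, 3) curve equals the area between −2 and 1 under the standard normal curve, which is the area to the left of 1 minus the area to the left of −2.]

In R (or with a table), use pnorm(72, 69, 3) - pnorm(63, 69, 3) or pnorm(1) - pnorm(-2).

The answer: 82%.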
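
One way to sanity-check an interval probability is to simulate heights with rnorm and count the fraction that fall in the interval. This is only a rough check under the same normal model; the seed and the sample size of 100,000 are arbitrary choices.

```r
# Exact answer from the cdf
p_exact <- pnorm(72, mean = 69, sd = 3) - pnorm(63, mean = 69, sd = 3)

# Simulation check (seed and sample size are arbitrary)
set.seed(1)
heights <- rnorm(100000, mean = 69, sd = 3)
p_sim <- mean(heights >= 63 & heights <= 72)

print(round(c(exact = p_exact, simulated = p_sim), 3))
```

The simulated proportion should be close to the exact value, with the gap shrinking as the sample size grows.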

One last example

Suppose that the measurement error in a laboratory scale follows a normal distribution with mean = 0 mg and SD = 0.1 mg.

What is the chance that the absolute error in a single measurement will be greater than 0.15 mg?

$\Pr(|X| \ge 0.15) = \Pr(|Z| \ge 1.5)$

[Figure: the two tails beyond −0.15 and 0.15 under the N(0, 0.1) curve, equivalently beyond −1.5 and 1.5 under the standard normal curve; by symmetry the total is twice the area to the left of −1.5.]

In R (or with a table), use 2 * pnorm(-0.15, 0, 0.1) or 2 * pnorm(-1.5).

The answer: 13%.
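
The two-tailed calculation can be written a few equivalent ways. The sketch below uses the symmetry of the normal curve and, as a cross-check, one minus the central area. The variable names are illustrative.

```r
sd_err <- 0.1      # SD of the measurement error in mg, from the example
threshold <- 0.15  # absolute error of interest in mg

# Two tails by symmetry, on the original scale and the standard scale
p_two_sided <- 2 * pnorm(-threshold, mean = 0, sd = sd_err)
p_standard <- 2 * pnorm(-threshold / sd_err)

# Cross-check: 1 minus the central area between -0.15 and 0.15
p_complement <- 1 - (pnorm(threshold, 0, sd_err) - pnorm(-threshold, 0, sd_err))

print(round(c(p_two_sided, p_standard, p_complement), 4))
```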