Binomial Distribution and Discrete Random Variables


3.1-3.3 Binomial Distribution and Discrete Random Variables
Prof. Tesler, Math 186, Winter 2017

Random variables

A random variable $X$ is a function assigning a real number to each outcome in a sample space.

A biased coin has probability $p$ of heads, $q = 1 - p$ of tails. Flip the coin 3 times and let $X$ denote the number of heads:

$X(HHH) = 3$
$X(HHT) = X(HTH) = X(THH) = 2$
$X(HTT) = X(THT) = X(TTH) = 1$
$X(TTT) = 0$

The range of $X$ is $\{0, 1, 2, 3\}$.

The discrete probability density function (pdf) is $p_X(k) = P(X = k)$:

$p_X(0) = q^3$, $p_X(1) = 3pq^2$, $p_X(2) = 3p^2 q$, $p_X(3) = p^3$

$p_X(k)$ is defined for all real numbers $k$. In this case, $p_X(k) = 0$ for $k \ne 0, 1, 2, 3$:

$p_X(4) = 0$, $p_X(2.5) = 0$, $p_X(-3) = 0$, $p_X(\pi) = 0$, ...
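The pdf of the three-flip example can be checked by brute force. The sketch below enumerates every heads/tails sequence and accumulates its probability under the value of $X$ it produces. The function name `heads_pdf` and the bias `p = 1/3` are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pdf(n_flips, p):
    """Return {k: P(X = k)} by enumerating every H/T sequence of length n_flips."""
    q = 1 - p
    pdf = Counter()
    for outcome in product("HT", repeat=n_flips):
        k = outcome.count("H")                 # X(outcome) = number of heads
        pdf[k] += p**k * q**(n_flips - k)      # probability of this one sequence
    return dict(pdf)


p = Fraction(1, 3)        # hypothetical bias, chosen only for illustration
q = 1 - p
pdf = heads_pdf(3, p)

# Compare with the closed forms q^3, 3pq^2, 3p^2q, p^3.
closed_form = {0: q**3, 1: 3 * p * q**2, 2: 3 * p**2 * q, 3: p**3}
print(pdf == closed_form)
```

Exact fractions are used so that the comparison with the closed forms is not affected by floating-point rounding.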

Discrete random variables

In the preceding example, the range of $X$ is a discrete set, not a continuum (such as the real number interval $[0, 3]$). So $X$ is a discrete random variable.

Sometimes $p_X$ is called a probability mass function (pmf) in the discrete case, vs. a probability density function (pdf) in the continuous case. We'll use "probability density function" for both.

Notation $p_X(k) = P(X = k)$: Use capital letters ($X$) for random variables and lowercase ($k$) to stand for numeric values.

A discrete probability density function requires $p_X(k) \ge 0$ for all $k$, and that the total probability is $\sum_k p_X(k) = 1$.

On the previous slide:

$$\sum_k p_X(k) = p_X(0) + p_X(1) + p_X(2) + p_X(3) = q^3 + 3pq^2 + 3p^2 q + p^3 = (q + p)^3 = 1^3 = 1$$

Binomial distribution

A biased coin has probability $p$ of heads, $q = 1 - p$ of tails. Flip the coin 7 times.

$P(HHTHTTH) = ppqpqqp = p^4 q^3 = p^{\#\text{heads}} q^{\#\text{tails}}$

$P(\text{4 heads in 7 flips}) = \binom{7}{4} p^4 q^3$

Flip the coin $n$ times ($n = 0, 1, 2, 3, \ldots$). Let $X$ be the number of heads. The probability density function (pdf) of $X$ is

$$p_X(k) = P(X = k) = \begin{cases} \binom{n}{k} p^k q^{n-k} & \text{if } k = 0, 1, \ldots, n; \\ 0 & \text{otherwise.} \end{cases}$$

Interpretation: Repeat this experiment (flipping a coin $n$ times and counting the heads) a huge number of times. The fraction of experiments with $X = k$ will be approximately $p_X(k)$.
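The binomial pdf translates directly into code. This is a minimal sketch using only the Python standard library; the function name `binomial_pdf` and the test values are assumptions for illustration. It returns 0 for any real $k$ that is not an integer in $0, \ldots, n$, matching the "0 otherwise" branch.

```python
from math import comb


def binomial_pdf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); defined for every real number k."""
    # Outside {0, 1, ..., n} (including non-integers) the probability is 0.
    if k != int(k) or not 0 <= k <= n:
        return 0.0
    k = int(k)
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)


# Probability of exactly 4 heads in 7 flips of a fair coin (p = 1/2 is an assumed value).
four_of_seven = binomial_pdf(4, 7, 0.5)
print(four_of_seven)

# The pdf sums to 1 over k = 0..n, as the binomial theorem predicts.
total = sum(binomial_pdf(k, 7, 0.5) for k in range(8))
print(total)
```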

Binomial distribution

$$p_X(k) = P(X = k) = \begin{cases} \binom{n}{k} p^k q^{n-k} & \text{if } k = 0, 1, \ldots, n; \\ 0 & \text{otherwise.} \end{cases}$$

The range of $X$ is $\{0, 1, 2, \ldots, n\}$.

$p_X(k) \ge 0$ for all values $k$.

The sum of all probability densities is 1:

$$\sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = (p + q)^n = 1^n = 1$$

The relationship to the binomial formula is why it's named the binomial distribution.

Genetics example

Consider pea plants from a Tt × Tt cross. The offspring have

Genotype   Probability   Phenotype
TT         1/4           tall
Tt         1/2           tall
tt         1/4           short

so the phenotypes have $P(\text{tall}) = 3/4$, $P(\text{short}) = 1/4$.

If there are 10 offspring, the number $X$ of tall offspring has a binomial distribution with $n = 10$, $p = 3/4$:

$$p_X(k) = P(X = k) = \begin{cases} \binom{10}{k} (3/4)^k (1/4)^{10-k} & \text{if } k = 0, 1, \ldots, 10; \\ 0 & \text{otherwise.} \end{cases}$$

Later: We will see other bioinformatics applications that use the binomial distribution, including genome assembly and Haldane's model of recombination.

Binomial distribution for n = 10, p = 3/4

k       pdf p_X(k)
0       0.00000095
1       0.00002861
2       0.00038624
3       0.00308990
4       0.01622200
5       0.05839920
6       0.14599800
7       0.25028229
8       0.28156757
9       0.18771172
10      0.05631351
other   0

[Figure: "Discrete probability density function", a plot of p_X(k) against k for k from 0 to 10.]
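The table values for $n = 10$, $p = 3/4$ can be regenerated from the binomial formula. The following is a sketch rather than the method used to produce the slide; it keeps exact fractions and converts to decimals only for printing.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(3, 4)

# Exact binomial probabilities for k = 0..10.
pdf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

for k, prob in pdf.items():
    print(f"{k:>2}  {float(prob):.8f}")

total = sum(pdf.values())   # exact sum of all eleven probabilities
print("total =", total)
```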

Cumulative Distribution Function (cdf)

The Cumulative Distribution Function (cdf) of random variable $X$ is $F_X(k) = P(X \le k)$, defined over all real numbers $k$.

In our example,

$F_X(1) = P(X \le 1) = p_X(0) + p_X(1) = 0.00000095 + 0.00002861 = 0.00002956$

$F_X(2) = P(X \le 2) = p_X(0) + p_X(1) + p_X(2) = 0.00000095 + 0.00002861 + 0.00038624 = 0.00041580$

Alternately: $F_X(2) = F_X(1) + p_X(2) = 0.00002956 + 0.00038624 = 0.00041580$

CDF in-between points with nonzero probability

Note that $F_X(1.5) = P(X \le 1.5) = p_X(0) + p_X(1) = F_X(1)$.

The binomial distribution has nonzero probability only at integers. In-between integers,

PDF: $p_X(k) = 0$
CDF: $F_X(k) = F_X(\lfloor k \rfloor)$, where $\lfloor k \rfloor$ is the floor of $k$ (largest integer $\le k$): $\lfloor 3 \rfloor = 3$, $\lfloor -3 \rfloor = -3$, $\lfloor 3.2 \rfloor = 3$, $\lfloor -3.2 \rfloor = -4$.

Warning: Be careful, this is just our first example. If the range of a random variable includes non-integer locations, go down to the largest value $\le k$ with nonzero probability instead of to $\lfloor k \rfloor$.
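The floor rule gives a cdf that accepts any real argument. The sketch below assumes an integer-valued binomial variable; the function name `binomial_cdf` is illustrative. It also covers arguments below 0 and above $n$.

```python
from math import comb, floor


def binomial_cdf(x, n, p):
    """F_X(x) = P(X <= x) for X ~ Binomial(n, p), for any real x."""
    k = floor(x)          # F_X(x) = F_X(floor(x)) because X only takes integer values
    if k < 0:
        return 0.0        # below the range: no probability accumulated yet
    k = min(k, n)         # above the range: all probability accumulated
    q = 1 - p
    return sum(comb(n, j) * p**j * q**(n - j) for j in range(k + 1))


n, p = 10, 0.75
print(binomial_cdf(1.5, n, p) == binomial_cdf(1, n, p))
print(f"{binomial_cdf(1.5, n, p):.8f}")
```

Note that `math.floor` rounds toward negative infinity, so `floor(-3.2)` is `-4`, in agreement with the definition above; `int(-3.2)` would truncate to `-3` and should not be used here.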

CDF outside of the range

In this example, the range of $X$ is $\{0, 1, \ldots, 10\}$.

$F_X(-3.2) = P(X \le -3.2) = 0$ since the minimum $X$ in the range is 0.

$F_X(12.8) = P(X \le 12.8) = 1$ since the whole range is $\le 12.8$.

This example has a bounded range: $F_X(k) = 0$ below the range and $F_X(k) = 1$ above the range. But not all random variables have a bounded range. Instead, for any random variable, we have asymptotic results:

$$\lim_{k \to -\infty} F_X(k) = 0 \qquad \lim_{k \to +\infty} F_X(k) = 1$$

As $k$ goes from $-\infty$ to $\infty$, the cdf weakly increases. For a discrete random variable, the cdf jumps where the pdf is nonzero.

Binomial distribution for n = 10, p = 3/4

k       pdf p_X(k)    interval       cdf F_X(k)
                      k < 0          0
0       0.00000095    0 <= k < 1     0.00000095
1       0.00002861    1 <= k < 2     0.00002956
2       0.00038624    2 <= k < 3     0.00041580
3       0.00308990    3 <= k < 4     0.00350571
4       0.01622200    4 <= k < 5     0.01972771
5       0.05839920    5 <= k < 6     0.07812691
6       0.14599800    6 <= k < 7     0.22412491
7       0.25028229    7 <= k < 8     0.47440720
8       0.28156757    8 <= k < 9     0.75597477
9       0.18771172    9 <= k < 10    0.94368649
10      0.05631351    10 <= k        1.00000000
other   0

[Figures: "Discrete probability density function" (p_X(k) against k) and "Cumulative distribution function" (F_X(k) against k), each for k from 0 to 10.]

Using pdf and cdf table (binomial n = 10, p = 3/4)

Different inequality symbols: $\le$, $>$, $<$, $\ge$

$P(X \le 2) = F_X(2) = 0.00041580$

$P(X > 2) = 1 - P(X \le 2) = 1 - 0.00041580 = 0.99958420$

$P(X < 2) = P(X \le 2^-) = F_X(2^-) = 0.00002956$, using infinitesimal notation from Calculus: $2^-$ is just below 2.

$P(X \ge 2) = 1 - P(X < 2) = 1 - F_X(2^-) = 0.99997044$
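The four one-sided inequalities can be evaluated from a single cdf helper. This sketch assumes the integer-valued binomial example with $n = 10$, $p = 3/4$, so that "just below 2" can be implemented as the cdf at 1; the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.75
pdf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def cdf(k):
    """F_X(k) = P(X <= k) for an integer k."""
    if k < 0:
        return 0.0
    return sum(pdf[: k + 1])   # slicing past the end simply takes the whole list


p_le_2 = cdf(2)        # P(X <= 2) = F_X(2)
p_gt_2 = 1 - cdf(2)    # P(X >  2) = 1 - F_X(2)
p_lt_2 = cdf(1)        # P(X <  2) = F_X(2^-), which equals F_X(1) for integer-valued X
p_ge_2 = 1 - cdf(1)    # P(X >= 2) = 1 - F_X(2^-)

for name, value in [("P(X<=2)", p_le_2), ("P(X>2)", p_gt_2),
                    ("P(X<2)", p_lt_2), ("P(X>=2)", p_ge_2)]:
    print(f"{name:8s} {value:.8f}")
```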

Using pdf and cdf table (binomial n = 10, p = 3/4)

Probability of an interval

$F_X(4) = P(X \le 4) = p_X(0) + p_X(1) + p_X(2) + p_X(3) + p_X(4)$

$F_X(2) = P(X \le 2) = p_X(0) + p_X(1) + p_X(2)$

$P(2 < X \le 4) = p_X(3) + p_X(4) = P(X \le 4) - P(X \le 2) = F_X(4) - F_X(2) = 0.01972771 - 0.00041580 = 0.01931191$

Using pdf and cdf table (binomial n = 10, p = 3/4)

Converting other inequalities to the form $P(a < X \le b)$

The formula $P(a < X \le b) = F_X(b) - F_X(a)$ uses $a < X$ (not $a \le X$) and $X \le b$ (not $X < b$). Other formats must be converted to this.

$P(2 < X \le 4) = P(X \le 4) - P(X \le 2) = F_X(4) - F_X(2) = 0.01972771 - 0.00041580 = 0.01931191$

$P(2 \le X \le 4) = P(2^- < X \le 4) = F_X(4) - F_X(2^-) = 0.01972771 - 0.00002956 = 0.01969815$

$P(2 < X < 4) = P(2 < X \le 4^-) = F_X(4^-) - F_X(2) = 0.00350571 - 0.00041580 = 0.00308991$

$P(2 \le X < 4) = P(2^- < X \le 4^-) = F_X(4^-) - F_X(2^-) = 0.00350571 - 0.00002956 = 0.00347615$

Using pdf and cdf table

Probability of an interval for integer random variables

Summary: To compute the probability of an interval, convert one-sided inequalities to $P(X \le b) = F_X(b)$ and two-sided inequalities to $P(a < X \le b) = F_X(b) - F_X(a)$.

We did the conversion with infinitesimals: $P(X < 2) = P(X \le 2^-) = F_X(2^-) = 0.00002956$.

Another method: The binomial distribution $X$ only has integer values, so $P(X < b) = P(X \le b - 1)$ for any integer $b$. Don't use this method when non-integer values are possible.

$P(X < 2) = P(X \le 1) = F_X(1) = 0.00002956$

$P(2 \le X \le 4) = P(1 < X \le 4) = F_X(4) - F_X(1) = 0.01972771 - 0.00002956 = 0.01969815$

$P(2 < X < 4) = P(2 < X \le 3) = F_X(3) - F_X(2) = 0.00350571 - 0.00041580 = 0.00308991$
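The integer-shift method can be wrapped in one helper that handles all four kinds of two-sided interval. This is a sketch for integer-valued random variables only; the function `interval_prob` and its keyword arguments are hypothetical names introduced for illustration.

```python
from math import comb

n, p = 10, 0.75
pdf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def cdf(k):
    """F_X(k) = P(X <= k) for an integer k."""
    return 0.0 if k < 0 else sum(pdf[: k + 1])


def interval_prob(a, b, left_closed=False, right_closed=True):
    """Probability that X lies between integers a and b, with each end open or closed.

    Every case is converted to the form P(a' < X <= b') = F_X(b') - F_X(a').
    """
    lo = a - 1 if left_closed else a     # a <= X  is the same as  a - 1 < X
    hi = b if right_closed else b - 1    # X < b   is the same as  X <= b - 1
    return cdf(hi) - cdf(lo)


open_closed = interval_prob(2, 4)                                          # P(2 <  X <= 4)
closed_closed = interval_prob(2, 4, left_closed=True)                      # P(2 <= X <= 4)
open_open = interval_prob(2, 4, right_closed=False)                        # P(2 <  X <  4)
closed_open = interval_prob(2, 4, left_closed=True, right_closed=False)    # P(2 <= X <  4)
print(open_closed, closed_closed, open_open, closed_open)
```

Because the slide subtracts table entries that were already rounded to 8 decimals, its results may differ from these in the last printed digit.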

Discrete is not equivalent to integer!

New example, not the same as the previous example: Suppose the range of $Y$ is $\{0.0, 0.1, 0.2, \ldots, 9.9, 10.0\}$. This range is not integers, but is discrete.

Don't convert $P(Y < a)$ into $P(Y \le a - 1)$. Instead, convert it to $P(Y \le b)$, where $b$ is the largest element below $a$ that's in the range.

$P(Y < 2) = P(Y \le 1.9)$

$P(2 \le Y \le 4) = P(1.9 < Y \le 4) = F_Y(4) - F_Y(1.9)$
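The rule "go down to the largest element of the range below $a$" can be sketched as follows. The slides do not give a distribution for $Y$, so a uniform distribution over the 101 values is assumed here purely for illustration; the helper names are also hypothetical. Exact fractions avoid floating-point trouble with values such as 1.9.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

# Range of Y: 0.0, 0.1, ..., 10.0, stored exactly and in increasing order.
support = [Fraction(i, 10) for i in range(101)]
# ASSUMPTION: Y is uniform on its range (the slides do not specify a distribution).
pdf = {y: Fraction(1, 101) for y in support}


def cdf(x):
    """F_Y(x) = P(Y <= x) for any real x."""
    idx = bisect_right(support, x)       # number of range elements <= x
    return sum(pdf[y] for y in support[:idx])


def largest_below(a):
    """Largest element of the range strictly below a, or None if there is none."""
    idx = bisect_left(support, a)        # number of range elements < a
    return support[idx - 1] if idx > 0 else None


b = largest_below(2)                     # 19/10, i.e. 1.9 (not 2 - 1 = 1)
p_lt_2 = cdf(b)                          # P(Y < 2) = P(Y <= 1.9)
p_2_to_4 = cdf(4) - cdf(b)               # P(2 <= Y <= 4) = F_Y(4) - F_Y(1.9)
print(b, p_lt_2, p_2_to_4)
```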