The Binomial Distribution


Patrick Breheny, University of Iowa
Biostatistical Methods I (BIOS 5710)
September 13

Outcomes and summary statistics

So far, we have discussed the probability of events. In most studies, however, it is usually easier to work with a summary statistic than with the actual sample space. For example, in the polio study, the relevant information can be summarized by the number of people who contracted polio; this is vastly easier to think about than all possible outcomes of all possible samples that could be drawn from the population.

Random variables

A numerical summary $X$ of an outcome is called a random variable. More formally, a random variable is a function mapping the sample space $S$ to the real numbers $\mathbb{R}$.

Random variable                                    Possible outcomes
# of copies of a genetic mutation                  0, 1, 2
# of children a woman will have in her lifetime    0, 1, 2, ...
# of people in a sample who contract polio         0, 1, 2, ..., n

Distributions

Once the random process is complete, we observe a certain value of a random variable. In order to make inferences, we need to know the chances that our random variable could have taken on different values, depending on the true values of the population parameters. This is called a distribution. A distribution describes the probability that a random variable will take on a specific value or fall within a specific range of values.

Distribution: technical definition

Definition: Given a random variable $X$ and probability function $P$ defined on a sample space, the distribution (or law) of $X$ is a function that, for a given interval $B$, gives $P(X \in B)$.

To be clear, $B$ can be any open or closed interval: $[4, 5]$, $(4, 5]$, $[4, \infty)$, etc. $B$ can also be a single point, such as $[5, 5]$.

Cumulative distribution function

Definition: The cumulative distribution function (CDF) of a random variable $X$ is $F(x) = P(X \le x)$.

Note that $X$ is the random variable and $x$ is the (constant) argument of the function. The distribution uniquely defines the CDF by setting $B = (-\infty, x]$. Less obviously, the CDF uniquely defines the distribution; for example, $P(X \in [L, U]) = F(U) - \lim_{x \to L^-} F(x)$.

Probability mass function

If $X$ is discrete (i.e., takes on only a finite or countable number of values), we can also describe point probabilities:

Definition: The probability mass function (PMF) of a random variable $X$ is given by $f(x) = P(X = x)$.

Again, $X$ is the random variable and $x$ is the argument. It is easy to see in this case that there is a one-to-one relationship between PMFs and CDFs; for example, $F(x) = \sum_{s \le x} f(s)$. A common convention is to use an uppercase letter for a CDF and the corresponding lowercase letter for its PMF.
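The PMF/CDF relationship can be checked numerically. The following is a minimal Python sketch, assuming a made-up PMF for the number of copies of a genetic mutation (the values 0.49, 0.42, 0.09 are illustrative and do not come from the lecture); the helper names `cdf` and `prob_interval` are hypothetical.

```python
from fractions import Fraction

# Hypothetical PMF for X = number of copies of a mutation (illustrative values only).
pmf = {0: Fraction(49, 100), 1: Fraction(42, 100), 2: Fraction(9, 100)}

def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all support points s <= x."""
    return sum((p for s, p in pmf.items() if s <= x), Fraction(0))

def prob_interval(lower, upper):
    """P(X in [lower, upper]) = F(upper) - P(X < lower) for a discrete X."""
    below = sum((p for s, p in pmf.items() if s < lower), Fraction(0))
    return cdf(upper) - below

# Recover the PMF from the CDF: f(x) is the size of the jump of F at x.
recovered = {}
previous = Fraction(0)
for s in sorted(pmf):
    recovered[s] = cdf(s) - previous
    previous = cdf(s)
```

Exact fractions are used so that the recovered PMF matches the original without floating-point rounding.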

Probability density functions

If $X$ is continuous (in the sense that $F(x)$ is a continuous function), the PMF is not useful since $f(x) = 0$ for all $x$; in this case, we need to introduce the concept of probability density:

Definition: The probability density function (PDF) of a random variable $X$ is given by $f(x) = \frac{d}{dx} F(x)$.

Although $P(X = x) = 0$, we can still talk about the density (probability per infinitesimally small area) at a point $x$. A similar relationship again holds between PDF and CDF: $F(x) = \int_{-\infty}^{x} f(s)\,ds$.
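As a rough numerical illustration of the derivative and integral relationships, the sketch below uses an exponential distribution with rate 1. That choice of distribution is an assumption made only for this example and is not part of the lecture; the helper names are hypothetical.

```python
import math

def F(x):
    """CDF of an exponential distribution with rate 1 (an assumed example)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def f(x):
    """The corresponding PDF, f(x) = dF/dx."""
    return math.exp(-x) if x >= 0 else 0.0

def numeric_derivative(func, x, h=1e-5):
    """Central-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

x = 1.3
density_from_cdf = numeric_derivative(F, x)  # should be close to f(x)
cdf_from_density = integrate(f, 0.0, x)      # f is 0 below 0, so start at 0
```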

Listing the ways

The most straightforward way of figuring out the probability of something is to list all the elements of the sample space. If all the ways are equally likely, then each one has probability $1/n$, where $n$ is the total number of ways. Thus, the probability of the event is the number of ways it can happen divided by $n$.

Coin example

For example, suppose we flip a coin three times; what is the probability that exactly one of the flips was heads?

Possible outcomes: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT

Exactly one head occurs in three of the eight equally likely outcomes (HTT, THT, TTH), so the probability is 3/8.
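The listing approach for the coin example can be sketched in a few lines of Python; this assumes a fair coin, so that all eight sequences are equally likely.

```python
from itertools import product

# All 2**3 = 8 equally likely sequences of three coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Keep only the sequences with exactly one head.
one_head = [seq for seq in outcomes if seq.count("H") == 1]

# Probability = (number of ways the event can happen) / (total number of ways).
probability = len(one_head) / len(outcomes)
print(outcomes)
print(one_head, probability)
```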

The binomial coefficients

Listing all the elements of the sample space is often impractical, however (imagine listing the outcomes involved in flipping a coin 100 times). Luckily, when there are only two possible outcomes, we can apply the following theorem:

Binomial theorem: For a binary process repeated $n$ times, the number of sequences in which one outcome occurs $k$ times is
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!};$$
these numbers are known as the binomial coefficients.
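A short Python sketch can confirm that the factorial formula agrees with brute-force listing for small $n$, and show why the formula matters for large $n$. The function names `binomial_coefficient` and `count_by_listing` are hypothetical; `math.comb` is the standard-library equivalent (Python 3.8+).

```python
from itertools import product
from math import comb, factorial

def binomial_coefficient(n, k):
    """n! / (k! (n - k)!), computed directly from the factorial formula."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def count_by_listing(n, k):
    """Count length-n binary sequences with exactly k ones by brute-force listing."""
    return sum(1 for seq in product((0, 1), repeat=n) if sum(seq) == k)

# The formula, the standard-library function, and brute force agree for small n.
for n in range(0, 9):
    for k in range(0, n + 1):
        assert binomial_coefficient(n, k) == comb(n, k) == count_by_listing(n, k)

# For n = 100, listing 2**100 sequences is hopeless, but the formula is immediate.
ways_50_heads = binomial_coefficient(100, 50)
```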

When sequences are not equally likely

Suppose we draw 3 balls, with replacement, from an urn that contains 10 balls: 2 red balls and 8 green balls. What is the probability that we will draw two red balls? As before, there are three possible sequences: RRG, RGR, and GRR, but the sequences no longer have probability $1/8$.

When sequences are not equally likely (cont'd)

Instead, the probability of each sequence is
$$\frac{2}{10}\cdot\frac{2}{10}\cdot\frac{8}{10} = \frac{2}{10}\cdot\frac{8}{10}\cdot\frac{2}{10} = \frac{8}{10}\cdot\frac{2}{10}\cdot\frac{2}{10} \approx .03$$
Thus, the probability of drawing two red balls is
$$3\cdot\frac{2}{10}\cdot\frac{2}{10}\cdot\frac{8}{10} = 9.6\%$$
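The urn calculation can be reproduced by enumerating all sequences of three draws and multiplying the per-draw probabilities. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Per-draw probabilities: 2 red and 8 green balls out of 10.
prob = {"R": Fraction(2, 10), "G": Fraction(8, 10)}

# Draws are with replacement, so they are independent and probabilities multiply.
sequence_probs = {}
for seq in product("RG", repeat=3):
    p = Fraction(1)
    for ball in seq:
        p *= prob[ball]
    sequence_probs["".join(seq)] = p

# Sequences with exactly two red balls, and their total probability.
two_red = {s: p for s, p in sequence_probs.items() if s.count("R") == 2}
p_two_red = sum(two_red.values())
```

Each of the three qualifying sequences has probability 4/125 = 0.032, and their sum is 12/125 = 0.096.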

The binomial formula

We can summarize this result into the following formula:

Theorem: Given a sequence of $n$ independent events that occur with probability $\theta$, the probability that an event will occur $x$ times is
$$\frac{n!}{x!\,(n-x)!}\,\theta^x (1-\theta)^{n-x}$$

Letting $X$ denote the number of times the event occurs, the above yields the PMF of $X$ and therefore defines a specific distribution. This distribution is called the binomial distribution, and $X$ is said to follow a binomial distribution or to be binomially distributed.
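The formula translates directly into code. The sketch below is one possible implementation (the name `binomial_pmf` is hypothetical); it reproduces the urn result and checks that the probabilities sum to 1 over the support.

```python
from math import comb

def binomial_pmf(x, n, theta):
    """P(X = x) for X ~ Binomial(n, theta)."""
    if not 0 <= x <= n:
        return 0.0  # values outside 0, 1, ..., n have probability zero
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

# Urn example: 3 draws, P(red) = 0.2, exactly 2 reds.
p_two_red = binomial_pmf(2, 3, 0.2)

# A PMF must sum to 1 over the support 0, 1, ..., n (n = 10, theta = 0.3 here
# are arbitrary illustrative values).
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
```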

Example

According to the CDC, 18% of the adults in the United States smoke.

Example: Suppose we sample 10 people; what is the probability that 5 of them will smoke?

Example: Suppose we sample 10 people; what is the probability that 2 or fewer will smoke?
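One way to work these two examples is sketched below in Python. It assumes the 10 people are sampled independently, each with smoking probability 0.18, and reads "5 of them" as exactly 5; the helper name `pmf` is hypothetical.

```python
from math import comb

n, theta = 10, 0.18  # sample size and smoking prevalence from the example

def pmf(x):
    """Binomial PMF for the fixed n and theta above."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

p_exactly_5 = pmf(5)
p_at_most_2 = sum(pmf(x) for x in range(3))  # F(2) = f(0) + f(1) + f(2)

print(f"P(X = 5)  = {p_exactly_5:.4f}")
print(f"P(X <= 2) = {p_at_most_2:.4f}")
```

Under these assumptions the first probability comes out at roughly 0.018 and the second at roughly 0.737.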

Definitions: random variable, distribution, cumulative distribution function, probability mass function, probability density function

Binomial coefficients:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

Binomial distribution:
$$P(X = x) = \binom{n}{x}\,\theta^x (1-\theta)^{n-x}$$