The Bernoulli distribution

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2006, The Johns Hopkins University and Brian Caffo. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided AS IS ; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.

Outline
1. Define the Bernoulli distribution
2. Define Bernoulli likelihoods
3. Define the binomial distribution
4. Define binomial likelihoods
5. Define the normal distribution
6. Define normal likelihoods

The Bernoulli distribution
The Bernoulli distribution arises as the result of a binary outcome.
Bernoulli random variables take (only) the values 1 and 0, with probabilities of (say) $p$ and $1 - p$ respectively.
The PMF for a Bernoulli random variable $X$ is $P(X = x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$.
If we let $X$ be a Bernoulli random variable, it is typical to call $X = 1$ a success and $X = 0$ a failure.
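
As a quick check of the PMF, mean and variance stated above, here is a minimal Python sketch. The function names and the value p = 0.3 are illustrative assumptions, not part of the original slides.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) = p**x * (1 - p)**(1 - x) for x in {0, 1}; zero elsewhere."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)


def bernoulli_mean_var(p: float) -> tuple[float, float]:
    """Compute the mean and variance directly from the PMF."""
    mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
    var = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))
    return mean, var


p = 0.3  # assumed success probability, chosen only for illustration
mean, var = bernoulli_mean_var(p)
print(mean, var)  # should agree with p and p * (1 - p) up to rounding
```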

iid Bernoulli trials
If several iid Bernoulli observations, say $x_1, \ldots, x_n$, are observed, the likelihood is
$$\prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum_i x_i} (1 - p)^{n - \sum_i x_i}$$
Notice that the likelihood depends only on the sum of the $x_i$.
Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$.
We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat{p} = \sum_i x_i / n$ is the maximum likelihood estimator for $p$.
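
The following sketch illustrates the two claims above numerically: the likelihood depends on the data only through the sum, and it is maximized at the sample proportion. The data vector is made up for illustration, and the grid search is just one simple way to locate the maximum, not a method prescribed by the slides.

```python
def bernoulli_likelihood(p: float, xs: list[int]) -> float:
    """Likelihood p**sum(x) * (1 - p)**(n - sum(x)) for iid Bernoulli data."""
    s, n = sum(xs), len(xs)
    return p**s * (1 - p) ** (n - s)


# Hypothetical data: 7 successes in 10 trials.
xs = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p_hat = sum(xs) / len(xs)  # closed-form MLE: the sample proportion

# Crude grid search over p in [0, 1] to confirm where the maximum lies.
grid = [i / 1000 for i in range(1001)]
p_grid = max(grid, key=lambda p: bernoulli_likelihood(p, xs))

# Reordering the data leaves the likelihood unchanged (only the sum matters).
same = bernoulli_likelihood(0.4, xs) == bernoulli_likelihood(0.4, sorted(xs))
print(p_hat, p_grid, same)
```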

[Figure: Bernoulli likelihood plotted against p; both axes run from 0.0 to 1.0]

Binomial trials
Binomial random variables are obtained as the sum of iid Bernoulli trials.
Specifically, let $X_1, \ldots, X_n$ be iid Bernoulli($p$); then $X = \sum_{i=1}^n X_i$ is a binomial random variable.
The binomial mass function is
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$$
for $x = 0, \ldots, n$.

Recall that the notation
$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$
(read "n choose x") counts the number of ways of selecting $x$ items out of $n$ without replacement, disregarding the order of the items.
$$\binom{n}{0} = \binom{n}{n} = 1$$
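
A short sketch of the binomial mass function using the standard-library `math.comb` for "n choose x". The values n = 10 and p = 0.3 are arbitrary choices for illustration.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = C(n, x) * p**x * (1 - p)**(n - x) for x = 0, ..., n."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # assumed values, for illustration only
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
total = sum(probs)  # a valid PMF sums to 1 (up to rounding)

# The boundary cases of "n choose x" noted above.
edge_cases = (math.comb(n, 0), math.comb(n, n))
print(total, edge_cases)
```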

Justification of the binomial likelihood
Consider the probability of getting 6 heads out of 10 coin flips from a coin with success probability $p$.
The probability of getting 6 heads and 4 tails in any specific order is $p^6 (1 - p)^4$.
There are $\binom{10}{6}$ possible orders of 6 heads and 4 tails.
Since these orders are mutually exclusive, the probability of 6 heads is $\binom{10}{6} p^6 (1 - p)^4$.
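
The counting argument can be checked by brute force: enumerate all 2^10 head/tail sequences and count those with exactly 6 heads. This is only a sanity check on the combinatorics, written with standard-library tools.

```python
import itertools
import math

n_flips, n_heads = 10, 6

# Enumerate every sequence of 10 flips (1 = head, 0 = tail) and count
# the sequences containing exactly 6 heads.
orders = sum(
    1
    for seq in itertools.product((0, 1), repeat=n_flips)
    if sum(seq) == n_heads
)

print(orders, math.comb(n_flips, n_heads))
```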

Example
Suppose a friend has 8 children, 7 of whom are girls and none of whom are twins.
If each gender has a 50% probability in each birth, what's the probability of getting 7 or more girls out of 8 births?
$$\binom{8}{7} .5^7 (1 - .5)^1 + \binom{8}{8} .5^8 (1 - .5)^0 = \frac{9}{256} \approx 0.035$$
This calculation is an example of a P-value: the probability under a null hypothesis of getting a result as extreme or more extreme than the one actually obtained.
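
The tail probability in this example can be reproduced in a few lines. This sketch simply sums the binomial mass function over 7 and 8 girls under the stated 50% assumption.

```python
import math

n_births, p_girl = 8, 0.5

# P(X >= 7) for X ~ Binomial(8, 0.5): add the mass at 7 and at 8.
p_value = sum(
    math.comb(n_births, k) * p_girl**k * (1 - p_girl) ** (n_births - k)
    for k in (7, 8)
)
print(p_value)
```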

[Figure: binomial likelihood plotted against p; both axes run from 0.0 to 1.0]

The normal distribution
A random variable is said to follow a normal or Gaussian distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is
$$(2\pi\sigma^2)^{-1/2} e^{-(x - \mu)^2 / 2\sigma^2}$$
If $X$ is a RV with this density then $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$.
We write $X \sim N(\mu, \sigma^2)$.
When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called the standard normal distribution.
The standard normal density function is labeled $\phi$.
Standard normal RVs are often labeled $Z$.

[Figure: standard normal density; horizontal axis z from -3 to 3, vertical axis density from 0.0 to 0.4]

Facts about the normal density
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X - \mu}{\sigma}$ is standard normal.
If $Z$ is standard normal then $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$.
The non-standard normal density is $\phi\{(x - \mu)/\sigma\}/\sigma$.

More facts about the normal density
1. Approximately 68%, 95% and 99.7% of the normal density lies within 1, 2 and 3 standard deviations of the mean, respectively
2. $-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the 10th, 5th, 2.5th and 1st percentiles of the standard normal distribution, respectively
3. By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the 90th, 95th, 97.5th and 99th percentiles of the standard normal distribution, respectively
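
These figures can be recomputed with `statistics.NormalDist` from the Python standard library (available since Python 3.8). The sketch below is only a numerical check of the coverage and percentile values listed above.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability mass within 1, 2 and 3 standard deviations of the mean.
within = [Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)]

# Lower-tail percentiles (10th, 5th, 2.5th, 1st) and, by symmetry,
# the matching upper-tail percentiles (90th, 95th, 97.5th, 99th).
tail_probs = (0.10, 0.05, 0.025, 0.01)
lower = [Z.inv_cdf(q) for q in tail_probs]
upper = [Z.inv_cdf(1 - q) for q in tail_probs]

print([round(w, 4) for w in within])
print([round(v, 3) for v in lower])
print([round(v, 3) for v in upper])
```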

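Question
What is the 95th percentile of a $N(\mu, \sigma^2)$ distribution?
We want the point $x_0$ so that $P(X \le x_0) = .95$.
$$P(X \le x_0) = P\left(\frac{X - \mu}{\sigma} \le \frac{x_0 - \mu}{\sigma}\right) = P\left(Z \le \frac{x_0 - \mu}{\sigma}\right) = .95$$
Therefore
$$\frac{x_0 - \mu}{\sigma} = 1.645$$
or $x_0 = \mu + 1.645\sigma$.
In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile.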
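
The general rule $x_0 = \mu + \sigma z_0$ can be checked against a direct quantile computation. The values of `mu` and `sigma` below are hypothetical; any positive `sigma` would do.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0  # hypothetical mean and standard deviation

# Standard normal quantile z_0 for the 95th percentile.
z0 = NormalDist().inv_cdf(0.95)

# Percentile of N(mu, sigma^2) two ways: via the location-scale rule
# and directly from the non-standard distribution.
# Note: NormalDist takes the standard deviation, not the variance.
x0_rule = mu + sigma * z0
x0_direct = NormalDist(mu, sigma).inv_cdf(0.95)

print(z0, x0_rule, x0_direct)
```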

Question
What is the probability that a $N(\mu, \sigma^2)$ RV is more than 2 standard deviations above the mean?
We want to know
$$P(X > \mu + 2\sigma) = P\left(\frac{X - \mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right) = P(Z > 2) \approx 2.5\%$$
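
For reference, the exact tail area is slightly smaller than the 2.5% obtained from the rule of thumb that 95% of the mass lies within 2 standard deviations. The sketch below computes it and confirms that the answer does not depend on the (hypothetical) values chosen for `mu` and `sigma`.

```python
from statistics import NormalDist

# Exact upper-tail probability for the standard normal.
tail_standard = 1 - NormalDist().cdf(2)

# Same probability for an arbitrary normal distribution: the mean and
# standard deviation below are assumptions for illustration only.
mu, sigma = 50.0, 7.0
tail_general = 1 - NormalDist(mu, sigma).cdf(mu + 2 * sigma)

print(round(tail_standard, 5), round(tail_general, 5))
```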

Other properties
1. The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)
2. A constant times a normally distributed random variable is also a normally distributed random variable (what are the mean and variance?)
3. Sums of normally distributed random variables are again normally distributed, even if the variables are dependent, provided they are jointly normal (what are the mean and variance?)
4. Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)

5. The square of a standard normal random variable follows what is called the chi-squared distribution (with 1 degree of freedom)
6. The exponential of a normally distributed random variable follows what is called the log-normal distribution
7. As we will see later, many random variables, properly normalized, converge in distribution to a normal distribution

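Question
If $X_i$ are iid $N(\mu, \sigma^2)$ with a known variance, what is the likelihood for $\mu$?
$$L(\mu) = \prod_{i=1}^n (2\pi\sigma^2)^{-1/2} \exp\left\{-(x_i - \mu)^2 / 2\sigma^2\right\}$$
$$\propto \exp\left\{-\sum_{i=1}^n (x_i - \mu)^2 / 2\sigma^2\right\}$$
$$= \exp\left\{-\sum_{i=1}^n x_i^2 / 2\sigma^2 + \mu \sum_{i=1}^n x_i / \sigma^2 - n\mu^2 / 2\sigma^2\right\}$$
$$\propto \exp\left\{\mu n \bar{x} / \sigma^2 - n\mu^2 / 2\sigma^2\right\}$$
Later we will discuss methods for handling the unknown variance.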

Question
If $X_i$ are iid $N(\mu, \sigma^2)$ with known variance, what's the ML estimate of $\mu$?
We calculated the likelihood for $\mu$ on the previous page; the log likelihood is, up to an additive constant,
$$\mu n \bar{x} / \sigma^2 - n\mu^2 / 2\sigma^2$$
Setting the derivative with respect to $\mu$ to zero gives
$$n \bar{x} / \sigma^2 - n\mu / \sigma^2 = 0$$
This yields that $\bar{x}$ is the ML estimate of $\mu$.
Since this doesn't depend on $\sigma$, it is also the ML estimate with $\sigma$ unknown.
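
A numerical sketch of this result: evaluate the full normal log likelihood on a grid of candidate means and confirm that the maximizer is the sample mean, whatever value is assumed for the variance. The data and the grid are made up for illustration; the grid search stands in for the calculus above.

```python
import math


def normal_loglik_mu(mu: float, xs: list[float], sigma2: float) -> float:
    """Log likelihood of mu for iid N(mu, sigma2) data with sigma2 known."""
    n = len(xs)
    const = -0.5 * n * math.log(2 * math.pi * sigma2)
    return const - sum((x - mu) ** 2 for x in xs) / (2 * sigma2)


xs = [4.1, 5.3, 4.8, 6.0, 5.2]  # hypothetical observations
x_bar = sum(xs) / len(xs)

# Candidate means 0.00, 0.01, ..., 10.00.
grid = [i / 100 for i in range(1001)]

# The maximizer should not change with the assumed variance.
mu_hat_var1 = max(grid, key=lambda m: normal_loglik_mu(m, xs, 1.0))
mu_hat_var4 = max(grid, key=lambda m: normal_loglik_mu(m, xs, 4.0))

print(x_bar, mu_hat_var1, mu_hat_var4)
```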

Final thoughts on normal likelihoods
The maximum likelihood estimate for $\sigma^2$ is
$$\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}$$
which is the biased version of the sample variance.
The ML estimate of $\sigma$ is simply the square root of this estimate.
To do likelihood inference, the bivariate likelihood of $(\mu, \sigma)$ is difficult to visualize.
Later, we will discuss methods for constructing likelihoods for one parameter at a time.
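
To make the "biased version" remark concrete, the sketch below computes the ML estimate of the variance (divide by n) next to the usual sample variance (divide by n - 1), using the standard-library `statistics` module as a cross-check. The data are hypothetical.

```python
import math
import statistics

xs = [4.1, 5.3, 4.8, 6.0, 5.2]  # hypothetical observations
n = len(xs)
x_bar = statistics.fmean(xs)

# ML estimate of sigma^2: sum of squared deviations divided by n.
sigma2_mle = sum((x - x_bar) ** 2 for x in xs) / n
sigma_mle = math.sqrt(sigma2_mle)  # ML estimate of sigma

# The unbiased sample variance divides by n - 1 instead.
sigma2_unbiased = statistics.variance(xs)

print(sigma2_mle, sigma2_unbiased, sigma_mle)
```

The two estimates differ by the factor (n - 1)/n, so the gap shrinks as the sample size grows.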