Intro to Likelihood. Gov 2001 Section. February 2, 2012.

Outline
1 Replication Paper
2 An R Note on the Homework
3 Probability Distributions (Discrete Distributions; Continuous Distributions)
4 Basic Likelihood
5 Transforming Distributions

Replication Paper
Read "How to Write a Publishable Paper" on Gary's website, along with "Publication, Publication." Find a partner. Find a set of papers you would be interested in replicating:
1 Recently published (in the last two years).
2 From a good journal.
3 Using methods at least as sophisticated as those in this class.
E-mail us (Gary, Jen, and Molly) to get our opinion. Find the data.

An R Note on the Homework

An R Note on the Homework
How would we find the expected value of a distribution analytically in R? For example, Y ~ Normal(µ, σ²), where µ = 6, σ² = 3. In math, we want to integrate
E(Y) = ∫ x · (1/√(2πσ²)) · exp(−(x − µ)²/(2σ²)) dx
Plugging in for µ and σ²:
E(Y) = ∫ x · (1/√(6π)) · exp(−(x − 6)²/6) dx

An R Note on the Homework, cont.
1 First, write a function for the integrand:
ex.normal <- function(x){ x * 1/(sqrt(6*pi)) * exp(-(x-6)^2/6) }
2 Then use integrate() to get the expected value:
integrate(ex.normal, lower = -Inf, upper = Inf)
6 with absolute error < 0.00016
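The same integral can be sanity-checked outside R with a plain midpoint rule. This is just an illustrative Python sketch (the course itself uses R); the finite window [−50, 60] covers essentially all of the Normal(6, 3) mass.

```python
import math

def integrand(x, mu=6.0, sigma2=3.0):
    # x times the Normal(mu, sigma2) density
    return x * math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def midpoint(f, lo, hi, n=200_000):
    # crude midpoint-rule integration over a wide finite window
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

expected = midpoint(integrand, -50.0, 60.0)
print(expected)  # close to the mean, 6
```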

Probability Distributions

Probability Distributions
Why become familiar with probability distributions? So you can fit models to a variety of data. What do you have to do to use them? Recognize what kind of data you are working with. What's the best way to learn the distributions? Learn the stories behind them.

Discrete Distributions

The Bernoulli Distribution
Takes value 1 with success probability π and value 0 with failure probability 1 − π. Ideal for modeling one-time yes/no (or success/failure) events. The best example is a single coin flip: if your data resemble one coin flip, you have a Bernoulli distribution.
ex) one voter voting yes/no
ex) one person being either a man or a woman
ex) the Patriots winning/losing the Super Bowl

The Bernoulli Distribution
Y ~ Bernoulli(π)
y = 0, 1
probability of success: π ∈ [0, 1]
p(y|π) = π^y (1 − π)^(1−y)
E(Y) = π
Var(Y) = π(1 − π)

The Binomial Distribution
The Binomial distribution is the total of a set of independent Bernoulli trials:
You flip a coin three times and count the total number of heads. (The order doesn't matter.)
The number of women in a group of 10 Harvard students.
The number of rainy days in a seven-day week.

The Binomial Distribution
Y ~ Binomial(n, π)
y = 0, 1, ..., n
number of trials: n ∈ {1, 2, ...}
probability of success: π ∈ [0, 1]
p(y|π) = (n choose y) π^y (1 − π)^(n−y)
E(Y) = nπ
Var(Y) = nπ(1 − π)
[Figure: histogram of draws from Binomial(20, .3)]
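A quick way to internalize these formulas is to compute the pmf directly and confirm the moments numerically. A small Python sketch (illustration only; in R you would use dbinom) using the n = 20, π = .3 from the histogram:

```python
import math

def binom_pmf(y, n, p):
    # p(y) = C(n, y) * p^y * (1 - p)^(n - y)
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)

n, p = 20, 0.3
mean = sum(y * binom_pmf(y, n, p) for y in range(n + 1))
var = sum((y - mean) ** 2 * binom_pmf(y, n, p) for y in range(n + 1))
print(mean, var)  # n*pi = 6 and n*pi*(1 - pi) = 4.2
```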

The Multinomial Distribution
Suppose you had more than just two outcomes, e.g., vote for Republican, Democrat, or Independent. Can you use a binomial? No, because a binomial requires exactly two outcomes (yes/no, 1/0, etc.). Instead, we use the multinomial, which lets you work with several mutually exclusive outcomes. For example:
you toss a die 15 times and get outcomes 1-6
ten undergraduate students are classified as freshmen, sophomores, juniors, or seniors
Gov graduate students are divided into American, Comparative, Theory, or IR

The Multinomial Distribution
Y ~ Multinomial(n, π_1, ..., π_k)
y_j = 0, 1, ..., n; Σ_{j=1}^k y_j = n
number of trials: n ∈ {1, 2, ...}
probability of success for outcome j: π_j ∈ [0, 1]; Σ_{j=1}^k π_j = 1
p(y|n, π) = n!/(y_1! y_2! ... y_k!) · π_1^{y_1} π_2^{y_2} ... π_k^{y_k}
E(Y_j) = nπ_j
Var(Y_j) = nπ_j(1 − π_j)
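The pmf above is short enough to code up directly. A minimal Python sketch (illustration only; in R you would use dmultinom), with a built-in check: a k = 2 multinomial collapses to a binomial.

```python
import math

def multinom_pmf(counts, probs):
    # p(y) = n! / (y_1! ... y_k!) * prod(pi_j ^ y_j)
    n = sum(counts)
    coef = math.factorial(n)
    for y in counts:
        coef //= math.factorial(y)
    out = float(coef)
    for y, p in zip(counts, probs):
        out *= p ** y
    return out

# k = 2 reduces to Binomial(n, p): 2 heads in 3 fair coin flips
val = multinom_pmf([2, 1], [0.5, 0.5])
print(val)  # C(3, 2) * 0.5^3 = 0.375
```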

The Poisson Distribution
Represents the number of events occurring in a fixed period of time. Can also be used for the number of events in other specified intervals such as distance, area, or volume. Counts can never be negative, so it is good for modeling events. For example:
The number of Prussian soldiers who died each year by being kicked in the head by a horse (Bortkiewicz, 1898)
The number of shark attacks in Australia per month
The number of search warrant requests a federal judge hears in one year

The Poisson Distribution
Y ~ Poisson(λ)
y = 0, 1, ...
expected number of occurrences: λ > 0
p(y|λ) = e^{−λ} λ^y / y!
E(Y) = λ
Var(Y) = λ
[Figure: histogram of draws from Poisson(5)]
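The Poisson is the one distribution here where the mean and variance are the same parameter, which is easy to verify numerically from the pmf. A Python sketch (illustration only), truncating the sum at y = 60, far into the negligible tail for λ = 5:

```python
import math

def pois_pmf(y, lam):
    # p(y) = exp(-lam) * lam^y / y!
    return math.exp(-lam) * lam**y / math.factorial(y)

lam = 5.0
ys = range(60)  # tail beyond 60 is numerically negligible for lam = 5
total = sum(pois_pmf(y, lam) for y in ys)
mean = sum(y * pois_pmf(y, lam) for y in ys)
var = sum((y - mean) ** 2 * pois_pmf(y, lam) for y in ys)
print(total, mean, var)  # pmf sums to 1; mean and variance both equal lam
```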

Continuous Distributions

The Univariate Normal Distribution
Describes data that cluster in a bell curve around the mean. A lot of naturally occurring processes are approximately normally distributed. For example:
the weights of male students in our class
high school students' SAT scores

The Univariate Normal Distribution
Y ~ Normal(µ, σ²)
y ∈ R
mean: µ ∈ R
variance: σ² > 0
p(y|µ, σ²) = 1/(σ√(2π)) · exp(−(y − µ)²/(2σ²))
E(Y) = µ
Var(Y) = σ²
[Figure: standard normal density curve]

The Uniform Distribution
Any number in the interval you chose is equally probable. Intuitively easy to understand, but hard to come up with examples. (It is easier to think of discrete uniform examples.) For example:
the numbers that come out of random number generators
the number worn by the person who comes in first in a race (discrete)
the lottery tumblers out of which a person draws one ball with a number on it (also discrete)

The Uniform Distribution
Y ~ Uniform(α, β)
y ∈ [α, β]
interval: [α, β]; β > α
p(y|α, β) = 1/(β − α)
E(Y) = (α + β)/2
Var(Y) = (β − α)²/12
[Figure: Uniform(0, 1) density]

Quiz: Test Your Knowledge of Distributions
Are the following Bernoulli (coin flip), Binomial (several coin flips), Multinomial (Rep, Dem, Indep), Poisson (Prussian soldier deaths), Normal (SAT scores), or Uniform (race numbers)?
The heights of trees on campus?
The number of airplane crashes in one year?
A yes or no vote cast by Senator Brown?
The number of parking tickets Cambridge PD gives out in one month?
The poll your Facebook friends took to choose their favorite sport out of football, basketball, and soccer?
The time until a country adopts a treaty?

Basic Likelihood

Likelihood
The whole point of likelihood is to leverage information about the data generating process into our inferences. Here are the basic steps:
Think about your data generating process. (What do the data look like? Use your substantive knowledge.)
Find a distribution that you think explains the data. (Poisson, Binomial, Normal? Something else?)
Derive the likelihood.
Maximize the likelihood to get the MLE.
Note: this is the univariate case. We'll be introducing covariates later on in the term.

Likelihood: An Example. Waiting for the Redline
How long will it take for the next T to get here?

Likelihood: Waiting for the Redline
Y is an Exponential random variable with parameter λ = .25:
f(y) = λe^{−λy} = .25e^{−.25y}
[Figure: Exponential(.25) density]

Likelihood: Waiting for the Redline
Last week we assumed λ and computed the probability of waiting some number of minutes for the redline (λ → data):
p(y|λ = .25) = .25e^{−.25y}
p(2 < y < 10 | λ = .25) ≈ .524
This week we observe the data and ask what that tells us about λ (data → λ):
p(λ|y) = ?
[Figure: three exponential densities for different values of λ]
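The interval probability above comes straight from the exponential CDF, F(y) = 1 − e^{−λy}. A one-line Python check (illustration only; in R you would use pexp):

```python
import math

lam = 0.25
# P(2 < Y < 10) = F(10) - F(2) = exp(-lam*2) - exp(-lam*10)
prob = math.exp(-lam * 2) - math.exp(-lam * 10)
print(prob)  # about 0.52
```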

Likelihood: Waiting for the Redline
From Bayes' Rule:
p(λ|y) = p(y|λ) p(λ) / p(y)
Let k(y) = p(λ) / p(y). (Note that the λ in k(y) is the true λ, a constant that doesn't vary, so k(y) is just a function of y.)
Define L(λ|y) = p(y|λ) k(y), so that
L(λ|y) ∝ p(y|λ)

Monday Data
On Monday we wait y_1 = 12 minutes:
L(λ|y_1) ∝ p(y_1|λ) = λe^{−λy_1} = λe^{−12λ}

Plotting the likelihood
First, note that we can take advantage of R's pre-packaged distribution functions:
rbinom, rpois, rnorm, runif — random draws from that distribution
pbinom, ppois, pnorm, punif — the cumulative distribution function (the probability of that value or less)
dbinom, dpois, dnorm, dunif — the density (i.e., height of the PDF; useful for drawing)
qbinom, qpois, qnorm, qunif — the quantile function (given a quantile, returns the value)

Plotting the example
We want to plot L(λ|y) ∝ λe^{−12λ}. In R, dexp(x, rate, log = FALSE) gives the exponential density, e.g.
dexp(12, .25)
[1] 0.01244677
curve(dexp(12, rate = x), xlim = c(0, 1), xlab = "lambda", ylab = "likelihood")

Plotting the example
[Figure: likelihood curve L(λ|y = 12) over λ ∈ (0, 1)]
What do you think the maximum likelihood estimate will be?

Solving Using R
1 Write a function (the negative of the likelihood, since optimize() minimizes by default):
expon <- function(lambda, data) { -lambda * exp(-lambda * data) }
2 Optimize:
optimize(f = expon, data = 12, lower = 0, upper = 100)
3 Output:
$minimum
[1] 0.0833248
$objective
[1] -0.03065662
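The same answer can be recovered with nothing fancier than a grid search, which makes the logic transparent: minimizing the negative likelihood is maximizing the likelihood, and for a single exponential observation y the maximum sits at λ = 1/y. A Python sketch (illustration only, mirroring the R expon() above):

```python
import math

def neg_lik(lam, y=12.0):
    # negative of the (unnormalized) likelihood lam * exp(-lam * y)
    return -lam * math.exp(-lam * y)

# crude grid search over lambda in (0, 1]
grid = [i / 100_000 for i in range(1, 100_001)]
lam_hat = min(grid, key=neg_lik)
print(lam_hat)  # near 1/12, matching R's $minimum of 0.0833248
```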

Basic Likelihood Where are we going with this? What if we have two or more data points that we believe come from the same model? We can derive a likelihood for the combined data by multiplying the independent likelihoods together. Gov 2001 Section () Intro to Likelihood February 2, 2012 35 / 44

Tuesday Data
On Tuesday we wait y_2 = 7 minutes:
L(λ|y_2) ∝ p(y_2|λ) = λe^{−λy_2} = λe^{−7λ}

Likelihood for Monday and Tuesday
Remember that for independent events, P(A, B) = P(A)P(B). So
L(λ|y_1, y_2) ∝ λe^{−λy_1} · λe^{−λy_2} = λe^{−12λ} · λe^{−7λ}

A Whole Week of Data
L(λ|y_1, ..., y_5) ∝ ∏_{i=1}^{5} λe^{−λy_i}
= λe^{−λy_1} · λe^{−λy_2} · λe^{−λy_3} · λe^{−λy_4} · λe^{−λy_5}
= λe^{−12λ} · λe^{−7λ} · λe^{−4λ} · λe^{−19λ} · λe^{−2λ}
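Maximizing this joint likelihood is easiest on the log scale, where the product becomes a sum: log L(λ) = 5 log λ − λ Σ y_i, which peaks at the analytic MLE λ̂ = n / Σ y_i = 5/44. A Python sketch (illustration only) confirming that a grid search lands on the same answer:

```python
import math

waits = [12, 7, 4, 19, 2]  # the five observed waits, Monday through Friday

def log_lik(lam):
    # log of prod(lam * exp(-lam * y)) = n*log(lam) - lam*sum(y)
    return len(waits) * math.log(lam) - lam * sum(waits)

grid = [i / 10_000 for i in range(1, 10_001)]
lam_hat = max(grid, key=log_lik)
print(lam_hat, len(waits) / sum(waits))  # grid max vs. analytic MLE n / sum(y)
```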

Transforming Distributions

Transforming Distributions
Suppose X ~ p(x|θ) and y = g(x). How is y distributed? For example, if X ~ Exponential(λ = 1) and y = log(x), how is y distributed?
[Figure: Exponential(1) density]

Transforming Distributions
It is NOT true that p(y|θ) = g(p(x|θ)) — you cannot simply push the density through g. Why?
[Figure: histogram of draws of x vs. histogram of the transformed draws y = log(x)]

Transforming Distributions: The Rule
X ~ p_x(x|θ), y = g(x). Then
p_y(y) = p_x(g⁻¹(y)) · |dg⁻¹/dy|
What is g⁻¹(y)? The inverse transformation. What is dg⁻¹/dy? The Jacobian.

Transforming Distributions: the Log-Normal Example
For example, X ~ Normal(µ = 0, σ = 1) and y = g(x) = e^x.
What is g⁻¹(y)? g⁻¹(y) = x = log(y)
What is dg⁻¹/dy? d(log(y))/dy = 1/y

Transforming Distributions: the Log-Normal Example
Putting it all together:
p_y(y) = p_x(log(y)) · (1/y)
Notice we don't need the absolute value, because y > 0. So
p_y(y) = (1/√(2π)) e^{−(log y)²/2} · (1/y)
and Y ~ log-normal(0, 1).
Challenge: derive the chi-squared distribution.
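One way to convince yourself the Jacobian term is doing real work is to check that the transformed density still integrates to 1. A Python sketch (illustration only) coding up p_y(y) exactly as derived above and integrating it with a midpoint rule; the window (0, 200] covers essentially all of the log-normal(0, 1) mass:

```python
import math

def lognormal_pdf(y):
    # p_y(y) = standard normal density evaluated at log(y), times the Jacobian 1/y
    return math.exp(-0.5 * math.log(y) ** 2) / (math.sqrt(2.0 * math.pi) * y)

# midpoint-rule check that the transformed density integrates to 1
n, hi = 400_000, 200.0
h = hi / n
total = h * sum(lognormal_pdf((i + 0.5) * h) for i in range(n))
print(total)  # close to 1
```

Dropping the 1/y Jacobian and repeating the check is a quick way to see why the naive "push the density through g" recipe fails.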