START HERE: Instructions. 1 Exponential Family [Zhou, Manzil]

START HERE: Instructions

Many thanks to John A.W.B. Constanzo and Shi Zong for providing their LaTeX source files and allowing us to use them for quick preparation of the HW solution.

The homework was due at 9:00am on Feb 3, 05. Anything received after that time will not be considered.

Answers to every theory question are also to be submitted electronically on Autolab (PDF: LaTeX, or handwritten and scanned). Make sure you prepare the answers to each question separately.

Collaboration on solving the homework is allowed, after you have thought about the problems on your own. However, when you do collaborate, you should list your collaborators! You might also have gotten some inspiration from resources (books, online material, etc.). This might be OK only after you have tried to solve the problem and couldn't. In such a case, you should cite your resources.

If you do collaborate with someone or use a book or website, you are expected to write up your solution independently. That is, close the book and all of your notes before starting to write up your solution.

Latex source of this homework: hw5_latex.tar

1 Exponential Family [Zhou, Manzil]

In this problem we will review the exponential family and its significance in Bayesian statistics, and work out a detailed example for the commonly encountered Multivariate Normal distribution and its conjugate prior, the Normal Inverse Wishart distribution.

1.1 Review

An exponential family is a set of probability distributions whose probability density function for $x \in \mathbb{R}^d$ can be expressed in the form

\[ p(x \mid \theta) = \exp\big( \langle \phi(x), \theta \rangle - \mathbf{1}^T g(\theta) \big) \tag{1} \]

where $\phi(x)$ is a sufficient statistic of the distribution. For exponential families, the sufficient statistic is a function of the data that fully summarizes the data $x$ within the density function. The sufficient statistic of a set of independent, identically distributed observations is simply the sum of the individual sufficient statistics. It encapsulates all the information needed to describe the posterior distribution of the parameters given the data, and hence to derive any desired estimate of the parameters. We will explore this important property in detail below.

$\theta$ is called the natural parameter. The set of values of $\theta$ for which the function $p(x \mid \theta)$ is finite is called the natural parameter space. It can be shown that the natural parameter space is always convex. (First show that the log-partition function $g(\theta)$ is a convex function; then you can show this from first principles.)

$g(\theta)$ is called the log-partition function because it is the logarithm of a normalization factor, without which $p(x \mid \theta)$ would not be a probability distribution. ("Partition function" is often used as a synonym for "normalization factor" for historical reasons arising from statistical physics.)

1.2 Conjugate Priors [0+0+0]

Exponential families are very important in Bayesian statistics. In Bayesian statistics a prior distribution is multiplied by a likelihood function and then normalised to produce a posterior distribution. When the likelihood belongs to the exponential family, there always exists a conjugate prior, which is also in the exponential family. Consider the distribution:

\[ p(\theta; m_0, \phi_0) = \exp\big( \langle \phi_0, \theta \rangle - \langle m_0, g(\theta) \rangle - h(m_0, \phi_0) \big) \tag{2} \]

where $m_0 > 0$ and $\phi_0 \in \mathbb{R}^d$. These are called hyperparameters (parameters controlling parameters).

Question 1 Show that this distribution, i.e. (2), is a member of the Exponential Family.

There is not much to show. Note that

\[ p(\theta; m_0, \phi_0) = \exp\left( \left\langle \begin{bmatrix} \phi_0 \\ m_0 \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle - h(m_0, \phi_0) \right). \tag{3} \]

The sufficient statistic is $\begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix}$.

The natural parameter is $\begin{bmatrix} \phi_0 \\ m_0 \end{bmatrix}$.

The log-partition function is $h(m_0, \phi_0)$. There exist infinitely many ways of splitting $h$ into a vector $\hat{g}$ for which $h = \mathbf{1}^T \hat{g}$.

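Suppose we obtain the data $X = \{x_1, \ldots, x_n\}$, where $x_i \sim p(\cdot \mid \theta)$, i.e. each single observation follows some distribution from the exponential family.

Question 2 First of all, write out the likelihood $p(X \mid \theta)$. Then use (2) as the prior and derive the posterior $p(\theta \mid X)$ exactly, i.e. with the proper normalization constant.

The likelihood turns out to be simply

\[
\begin{aligned}
p(X \mid \theta) &= \prod_{i=1}^{n} p(x_i \mid \theta) = \prod_{i=1}^{n} \exp\big( \langle \phi(x_i), \theta \rangle - \mathbf{1}^T g(\theta) \big) \\
&= \exp\left( \sum_{i=1}^{n} \langle \phi(x_i), \theta \rangle - \sum_{i=1}^{n} \mathbf{1}^T g(\theta) \right) \\
&= \exp\left( \left\langle \sum_{i=1}^{n} \phi(x_i), \theta \right\rangle - n \mathbf{1}^T g(\theta) \right).
\end{aligned}
\tag{4}
\]

Now observe that $h$ is defined so that for all $x, y$ in the hyperparameter space,

\[ \int \exp\left( \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle - h(y, x) \right) d\theta = 1. \tag{5} \]

Keeping this in mind, we proceed to compute the posterior as

\[
\begin{aligned}
p(\theta \mid X) &\propto p(X \mid \theta)\, p(\theta; m_0, \phi_0) \\
&\propto \exp\left( \left\langle \sum_{i=1}^{n} \phi(x_i), \theta \right\rangle - n \mathbf{1}^T g(\theta) + \left\langle \begin{bmatrix} \phi_0 \\ m_0 \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle \right) \\
&= \exp \left\langle \begin{bmatrix} \phi_0 + \sum_{i=1}^{n} \phi(x_i) \\ m_0 + n\mathbf{1} \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle.
\end{aligned}
\tag{6}
\]

By (5), normalizing yields

\[ p(\theta \mid X) = \exp\left[ \left\langle \begin{bmatrix} \phi_0 + \sum_{i=1}^{n} \phi(x_i) \\ m_0 + n\mathbf{1} \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle - h\left( m_0 + n\mathbf{1},\; \phi_0 + \sum_{i=1}^{n} \phi(x_i) \right) \right]. \tag{7} \]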

If you got Question 2 correct (hopefully you did), observe that the posterior has the same form as the prior; thus (2) is a conjugate prior. The difference between the prior, i.e. (2), and your answer to Question 2 lies only in the parameters.

Question 3 Let $m_n$ and $\phi_n$ be the parameters of the posterior $p(\theta \mid X)$. Show that:

\[ m_n = m_0 + n\mathbf{1}, \qquad \phi_n = \phi_0 + \sum_{i=1}^{n} \phi(x_i). \tag{8} \]

We call these the update equations.

This is obvious from (7). Specifically, comparing equation (7) with the prior distribution (2), we can see that they possess the same form and differ only in the parameters:

\[ m_n = m_0 + n\mathbf{1}, \qquad \phi_n = \phi_0 + \sum_{i=1}^{n} \phi(x_i). \tag{9} \]

This shows that the update equations can be written simply in terms of the number of data points and the sufficient statistic of the data. It also gives meaning to the hyperparameters. In particular, $m_0$ corresponds to the effective number of "fake" observations that the prior distribution contributes, and $\phi_0$ corresponds to the total amount that these fake observations contribute to the sufficient statistic over all observations and fake observations.
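The update equations can be sketched in a few lines of Python. The example below is an illustration that is not part of the homework: it assumes a Bernoulli likelihood written in exponential-family form, with $\phi(x) = x$ and $g(\theta) = \log(1 + e^{\theta})$, for which a Beta(a, b) prior on the success probability corresponds to $\phi_0 = a$ and $m_0 = a + b$. The function name `conjugate_update` is hypothetical.

```python
def conjugate_update(m0, phi0, data, suff_stat):
    """Apply m_n = m_0 + n and phi_n = phi_0 + sum_i phi(x_i).

    Scalar case only: m0 and phi0 are numbers, suff_stat maps one
    observation to its (scalar) sufficient statistic.
    """
    n = len(data)
    phi_n = phi0 + sum(suff_stat(x) for x in data)
    return m0 + n, phi_n


# Illustrative assumption: Bernoulli likelihood, phi(x) = x.
# A Beta(a, b) prior on the success probability maps to phi0 = a, m0 = a + b.
a, b = 2.0, 3.0
flips = [1, 0, 1, 1, 0, 1]  # 4 successes, 2 failures

m_n, phi_n = conjugate_update(a + b, a, flips, lambda x: x)

# Map back to the familiar Beta parameters of the posterior.
a_n, b_n = phi_n, m_n - phi_n
print(a_n, b_n)
```

Under this mapping the result agrees with the familiar Beta-Bernoulli rule (add the successes to a and the failures to b), which matches the "fake observations" reading of the hyperparameters.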

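1.3 Multivariate Normal Distribution [0+0+0]

The Multivariate Normal $\mathcal{N}(\mu, \Sigma)$ is a distribution that is encountered very often. The distribution is given by:

\[ p(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^d |\Sigma|}} \exp\left( -\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) \tag{10} \]

where $\mu \in \mathbb{R}^d$ and $\Sigma \succ 0$ is a symmetric positive definite $d \times d$ matrix. We claim that it belongs to the Exponential Family.

Question 4 Identify the natural parameters $\theta$ in terms of $\mu$ and $\Sigma$. Also derive the sufficient statistics $\phi(x)$ and the log-partition function $g(\theta)$ in terms of $\mu$ and $\Sigma$. Hint: Design a two-dimensional $g(\theta)$, whose first dimension is $\tfrac{1}{2} \mu^T \Sigma^{-1} \mu$.

We will use the notation $\langle A, B \rangle = \langle \mathrm{vec}(A), \mathrm{vec}(B) \rangle = \mathrm{tr}(A^T B)$ when $A$ and $B$ are matrices. Note that

\[ x^T \Sigma^{-1} x = \mathrm{tr}(x^T \Sigma^{-1} x) = \mathrm{tr}(x x^T \Sigma^{-1}) = \langle x x^T, \Sigma^{-1} \rangle. \tag{11} \]

Now

\[
\begin{aligned}
p(x \mid \mu, \Sigma) &= \exp\left( -\tfrac{1}{2} \left[ x^T \Sigma^{-1} x - 2 x^T \Sigma^{-1} \mu + \mu^T \Sigma^{-1} \mu + \log\big( (2\pi)^d |\Sigma| \big) \right] \right) \\
&= \exp\left( -\tfrac{1}{2} \langle x x^T, \Sigma^{-1} \rangle + \langle x, \Sigma^{-1} \mu \rangle - \mathbf{1}^T \begin{bmatrix} \tfrac{1}{2} \mu^T \Sigma^{-1} \mu \\ \tfrac{1}{2} \log\big( (2\pi)^d |\Sigma| \big) \end{bmatrix} \right) \\
&= \exp\left( \left\langle \begin{bmatrix} x \\ x x^T \end{bmatrix}, \begin{bmatrix} \Sigma^{-1} \mu \\ -\tfrac{1}{2} \Sigma^{-1} \end{bmatrix} \right\rangle - \mathbf{1}^T \begin{bmatrix} \tfrac{1}{2} \mu^T \Sigma^{-1} \mu \\ \tfrac{1}{2} \log\big( (2\pi)^d |\Sigma| \big) \end{bmatrix} \right).
\end{aligned}
\tag{12}
\]

The natural parameters, sufficient statistics and log-partition function are obtained by inspection.

The natural parameter is

\[ \theta = \begin{bmatrix} \Sigma^{-1} \mu \\ -\tfrac{1}{2} \Sigma^{-1} \end{bmatrix}. \]

The sufficient statistic is

\[ \phi(x) = \begin{bmatrix} x \\ x x^T \end{bmatrix}. \]

The log-partition function is

\[ g(\theta) = \begin{bmatrix} \tfrac{1}{2} \mu^T \Sigma^{-1} \mu \\ \tfrac{1}{2} \big( d \log(2\pi) + \log |\Sigma| \big) \end{bmatrix}. \]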
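As a sanity check on this identification, the following minimal sketch (assuming NumPy is available; the function names and test values are illustrative, not from the homework) evaluates the density once from the usual formula and once as the exponential of the inner product of the sufficient statistic with the natural parameter minus the summed log-partition vector. The two numbers should agree up to floating-point error.

```python
import numpy as np


def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density from the standard formula."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))


def mvn_pdf_expfam(x, mu, Sigma):
    """Same density written as exp(<phi(x), theta> - 1^T g(theta))."""
    d = len(mu)
    P = np.linalg.inv(Sigma)                 # precision matrix Sigma^{-1}
    theta_vec, theta_mat = P @ mu, -0.5 * P  # natural parameters
    phi_vec, phi_mat = x, np.outer(x, x)     # sufficient statistics
    g = np.array([
        0.5 * mu @ P @ mu,
        0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(Sigma))),
    ])
    # <A, B> for matrices is the sum of elementwise products, i.e. tr(A^T B).
    inner = phi_vec @ theta_vec + np.sum(phi_mat * theta_mat)
    return np.exp(inner - g.sum())


mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
x = np.array([0.2, 0.7])

p_direct = mvn_pdf(x, mu, Sigma)
p_expfam = mvn_pdf_expfam(x, mu, Sigma)
print(p_direct, p_expfam)
```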

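The conjugate prior for the Multivariate Normal Distribution can be parametrized as the Normal Inverse Wishart Distribution $\mathcal{NIW}(\mu_0, \kappa_0, \Sigma_0, \nu_0)$. The distribution is given by:

\[
\begin{aligned}
p(\mu, \Sigma; \mu_0, \kappa_0, \Sigma_0, \nu_0) &= \mathcal{N}(\mu \mid \mu_0, \Sigma / \kappa_0)\; \mathcal{W}^{-1}(\Sigma \mid \Sigma_0, \nu_0) \\
&= \frac{ \kappa_0^{d/2}\, |\Sigma_0|^{\nu_0/2}\, |\Sigma|^{-\frac{\nu_0 + d + 2}{2}} }{ 2^{\frac{(\nu_0 + 1) d}{2}}\, \pi^{d/2}\, \Gamma_d(\nu_0 / 2) } \exp\left( -\tfrac{\kappa_0}{2} (\mu - \mu_0)^T \Sigma^{-1} (\mu - \mu_0) - \tfrac{1}{2} \mathrm{tr}(\Sigma_0 \Sigma^{-1}) \right)
\end{aligned}
\tag{13}
\]

where $\kappa_0, \nu_0 > 0$, $\mu_0 \in \mathbb{R}^d$, $\Sigma_0 \succ 0$ is a symmetric positive definite $d \times d$ matrix, and $\Gamma_d$ is the multivariate gamma function.

Question 5 Notice that the Normal Inverse Wishart Distribution fits into the form of (2). Find the mapping between $(\mu_0, \kappa_0, \Sigma_0, \nu_0)$ and $(m_0, \phi_0)$, and the function $h(m_0, \phi_0)$ in terms of $\mu_0, \kappa_0, \Sigma_0, \nu_0$. Hint: $m_0$ and $g(\theta)$ are two-dimensional.

A bit of algebra shows that:

\[
\begin{aligned}
p(\mu, \Sigma) &= \exp\Big( -\tfrac{\kappa_0}{2} \mu^T \Sigma^{-1} \mu + \kappa_0 \mu_0^T \Sigma^{-1} \mu - \tfrac{\kappa_0}{2} \mu_0^T \Sigma^{-1} \mu_0 - \tfrac{1}{2} \mathrm{tr}(\Sigma_0 \Sigma^{-1}) \\
&\qquad\quad + \tfrac{d}{2} \log \kappa_0 + \tfrac{\nu_0}{2} \log |\Sigma_0| - \tfrac{\nu_0 + d + 2}{2} \log |\Sigma| - \tfrac{(\nu_0 + 1) d}{2} \log 2 - \tfrac{d}{2} \log \pi - \log \Gamma_d(\nu_0 / 2) \Big) \\
&= \exp\Bigg( \left\langle \begin{bmatrix} \kappa_0 \mu_0 \\ \Sigma_0 + \kappa_0 \mu_0 \mu_0^T \end{bmatrix}, \begin{bmatrix} \Sigma^{-1} \mu \\ -\tfrac{1}{2} \Sigma^{-1} \end{bmatrix} \right\rangle - \left\langle \begin{bmatrix} \kappa_0 \\ \nu_0 + d + 2 \end{bmatrix}, \begin{bmatrix} \tfrac{1}{2} \mu^T \Sigma^{-1} \mu \\ \tfrac{1}{2} \big( d \log(2\pi) + \log |\Sigma| \big) \end{bmatrix} \right\rangle \\
&\qquad\quad + \tfrac{d (\nu_0 + d + 2)}{2} \log(2\pi) - \tfrac{(\nu_0 + 1) d}{2} \log 2 - \tfrac{d}{2} \log \pi + \tfrac{d}{2} \log \kappa_0 + \tfrac{\nu_0}{2} \log |\Sigma_0| - \log \Gamma_d(\nu_0 / 2) \Bigg).
\end{aligned}
\]

Comparing terms with (2), we obtain

\[
\begin{aligned}
m_0 &= \begin{bmatrix} \kappa_0 \\ \nu_0 + d + 2 \end{bmatrix}, \qquad
\phi_0 = \begin{bmatrix} \kappa_0 \mu_0 \\ \Sigma_0 + \kappa_0 \mu_0 \mu_0^T \end{bmatrix}, \\
h(m_0, \phi_0) &= -\tfrac{d(d+1)}{2} \log(2\pi) - \tfrac{\nu_0 d}{2} \log \pi - \tfrac{d}{2} \log \kappa_0 - \tfrac{\nu_0}{2} \log |\Sigma_0| + \log \Gamma_d(\nu_0 / 2).
\end{aligned}
\tag{14}
\]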

Equipped with these results, we move on to tackle the problem of finding the posterior for $(\mu, \Sigma)$. One can follow a brute-force approach using (10) and (13), but things can get really messy. We will adopt a more elegant and easier approach that exploits the fact that these distributions belong to the exponential family.

Question 6 Using the update equations described in Question 3 and your answers to Questions 4 and 5, directly write down the posterior $p(\mu, \Sigma \mid X)$. Just providing the appropriate update equations will suffice.

Applying the update equations (8) from Question 3, first to $m_n$:

\[ m_n = \begin{bmatrix} \kappa_n \\ \nu_n + d + 2 \end{bmatrix} = \begin{bmatrix} \kappa_0 \\ \nu_0 + d + 2 \end{bmatrix} + n\mathbf{1}. \tag{15} \]

Similarly, for $\phi_n$ we have

\[ \phi_n = \begin{bmatrix} \kappa_n \mu_n \\ \Sigma_n + \kappa_n \mu_n \mu_n^T \end{bmatrix} = \begin{bmatrix} \kappa_0 \mu_0 \\ \Sigma_0 + \kappa_0 \mu_0 \mu_0^T \end{bmatrix} + \begin{bmatrix} \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i x_i^T \end{bmatrix}. \tag{16} \]

Thus, we obtain the following update equations in terms of $\mu_0, \kappa_0, \Sigma_0, \nu_0$:

\[
\begin{aligned}
\kappa_n &= \kappa_0 + n, \\
\mu_n &= \frac{\kappa_0 \mu_0 + n \bar{x}}{\kappa_0 + n}, \\
\nu_n &= \nu_0 + n, \\
\Sigma_n &= \Sigma_0 + \sum_{i=1}^{n} x_i x_i^T + \kappa_0 \mu_0 \mu_0^T - \kappa_n \mu_n \mu_n^T,
\end{aligned}
\tag{17}
\]

where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$.

This is one of the remarkable cases where working out the general case takes less effort than working out the special case! The algebra can become very complicated; see, e.g., murphyk/papers/bayesgauss.pdf, where the math has been done explicitly. We hope that after solving this homework, you can take advantage of this neat shortcut :)
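The shortcut above can be sketched directly in code: map the Normal Inverse Wishart hyperparameters to $(m_0, \phi_0)$, add the count and the summed sufficient statistics, and map back. This is a minimal sketch assuming NumPy; the function name `niw_update` and the example data are illustrative and not part of the homework.

```python
import numpy as np


def niw_update(mu0, kappa0, Sigma0, nu0, X):
    """Posterior NIW hyperparameters via the exponential-family update.

    X is an (n, d) array whose rows are the observations x_i.
    """
    n, d = X.shape
    # Map (mu0, kappa0, Sigma0, nu0) to (m0, phi0).
    m0 = np.array([kappa0, nu0 + d + 2.0])
    phi0_vec = kappa0 * mu0
    phi0_mat = Sigma0 + kappa0 * np.outer(mu0, mu0)
    # m_n = m_0 + n and phi_n = phi_0 + sum_i phi(x_i), phi(x) = (x, x x^T).
    m_n = m0 + n
    phi_n_vec = phi0_vec + X.sum(axis=0)
    phi_n_mat = phi0_mat + X.T @ X  # X^T X equals sum_i x_i x_i^T
    # Map (m_n, phi_n) back to (mu_n, kappa_n, Sigma_n, nu_n).
    kappa_n = m_n[0]
    nu_n = m_n[1] - d - 2.0
    mu_n = phi_n_vec / kappa_n
    Sigma_n = phi_n_mat - kappa_n * np.outer(mu_n, mu_n)
    return mu_n, kappa_n, Sigma_n, nu_n


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))  # synthetic data, for illustration only
mu0, kappa0, Sigma0, nu0 = np.zeros(2), 1.0, np.eye(2), 4.0

mu_n, kappa_n, Sigma_n, nu_n = niw_update(mu0, kappa0, Sigma0, nu0, X)
print(kappa_n, nu_n)
print(mu_n)
print(Sigma_n)
```

Working in the $(m, \phi)$ coordinates keeps the update a pure addition; all the model-specific algebra lives in the two mapping steps.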

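If we want the same expression as Wikipedia for $\Sigma_n$, we need to make the following rearrangements:

\[
\begin{aligned}
\Sigma_n &= \Sigma_0 + \sum_{i=1}^{n} x_i x_i^T + \kappa_0 \mu_0 \mu_0^T - \kappa_n \mu_n \mu_n^T \\
&= \Sigma_0 + C + \sum_{i=1}^{n} x_i \bar{x}^T + \sum_{i=1}^{n} \bar{x} x_i^T - \sum_{i=1}^{n} \bar{x} \bar{x}^T + \kappa_0 \mu_0 \mu_0^T - \kappa_n \mu_n \mu_n^T \\
&= \Sigma_0 + C + n \bar{x} \bar{x}^T + \kappa_0 \mu_0 \mu_0^T - \kappa_n \mu_n \mu_n^T
\end{aligned}
\tag{18}
\]

where $C = \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^T$. Now, further expanding $\mu_n$, we obtain:

\[
\begin{aligned}
\Sigma_n &= \Sigma_0 + C + n \bar{x} \bar{x}^T + \kappa_0 \mu_0 \mu_0^T - \kappa_n \mu_n \mu_n^T \\
&= \Sigma_0 + C + \kappa_0 \mu_0 \mu_0^T + n \bar{x} \bar{x}^T - \frac{ (\kappa_0 \mu_0 + n \bar{x}) (\kappa_0 \mu_0 + n \bar{x})^T }{ \kappa_0 + n } \\
&= \Sigma_0 + C + \frac{ n \kappa_0 \mu_0 \mu_0^T - n \kappa_0 \mu_0 \bar{x}^T - n \kappa_0 \bar{x} \mu_0^T + n \kappa_0 \bar{x} \bar{x}^T }{ \kappa_0 + n } \\
&= \Sigma_0 + C + \frac{ n \kappa_0 }{ \kappa_0 + n } (\bar{x} - \mu_0)(\bar{x} - \mu_0)^T.
\end{aligned}
\tag{19}
\]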
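The rearrangement can be checked numerically. The sketch below (assuming NumPy; the random data and variable names are illustrative) computes $\Sigma_n$ once in the form obtained from the update equations and once in the final rearranged form with the centred scatter matrix $C$, and reports the largest entrywise gap, which should be at the level of floating-point round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 7, 3
X = rng.normal(size=(n, d))  # synthetic observations, rows are x_i
mu0 = rng.normal(size=d)
kappa0 = 1.5
Sigma0 = np.eye(d)

xbar = X.mean(axis=0)
kappa_n = kappa0 + n
mu_n = (kappa0 * mu0 + n * xbar) / kappa_n

# Form taken directly from the update equations.
Sigma_n_update = (Sigma0 + X.T @ X
                  + kappa0 * np.outer(mu0, mu0)
                  - kappa_n * np.outer(mu_n, mu_n))

# Rearranged form: centred scatter matrix plus a rank-one correction.
C = (X - xbar).T @ (X - xbar)
Sigma_n_rearranged = (Sigma0 + C
                      + n * kappa0 / (kappa0 + n) * np.outer(xbar - mu0, xbar - mu0))

max_gap = np.max(np.abs(Sigma_n_update - Sigma_n_rearranged))
print(max_gap)
```

The rearranged form is usually the better one to compute with, since it adds only positive semidefinite terms to $\Sigma_0$ rather than subtracting two large matrices.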

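1.4 Posterior Predictive - Bonus [0+0]

Another quantity that is often of interest in Bayesian statistics is the posterior predictive. The posterior predictive distribution is the distribution of unobserved observations (predictions) conditional on the observed data. Specifically, it is computed by marginalising over the parameters, using the posterior distribution:

\[ p(\tilde{x} \mid X) = \int p(\tilde{x} \mid \theta)\, p(\theta \mid X)\, d\theta. \tag{20} \]

The posterior predictive distribution for a distribution in the exponential family has a rather nice form.

Question 7 Show that the posterior predictive for the distribution (1) with prior (2) is given by:

\[ p(\tilde{x} \mid X) = \exp\big\{ h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) - h(m_n, \phi_n) \big\}. \tag{21} \]

We have

\[
\begin{aligned}
p(\tilde{x} \mid X) &= \int \exp\big[ \langle \phi(\tilde{x}), \theta \rangle - \mathbf{1}^T g(\theta) + \langle \phi_n, \theta \rangle - \langle m_n, g(\theta) \rangle - h(m_n, \phi_n) \big]\, d\theta \\
&= \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle - h(m_n, \phi_n) \big]\, d\theta \\
&= \exp\big[ -h(m_n, \phi_n) \big] \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle \big]\, d\theta.
\end{aligned}
\tag{22}
\]

Since

\[ \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle - h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) \big] = p(\theta; m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) \tag{23} \]

is a probability distribution, we have

\[
\begin{aligned}
1 &= \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle - h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) \big]\, d\theta \\
&= \exp\big[ -h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) \big] \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle \big]\, d\theta,
\end{aligned}
\tag{24}
\]

leading to

\[ \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + \mathbf{1}, g(\theta) \rangle \big]\, d\theta = \exp\big[ h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) \big]. \tag{25} \]

Combining this result with (22) yields the desired result:

\[ p(\tilde{x} \mid X) = \exp\big\{ h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) - h(m_n, \phi_n) \big\}. \tag{26} \]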

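The result of the previous problem can be specialized to the Multivariate Normal case.

Question 8 Find the posterior predictive for the case of the Multivariate Normal Distribution with a Normal Inverse Wishart prior, with parameters as described in 1.3, by using Question 7. Hint: The matrix determinant lemma might come in handy (determinant_lemma).

Adding a new point $\tilde{x}$ leads to the following update:

\[
\begin{aligned}
\kappa' &= \kappa_n + 1, \\
\nu' &= \nu_n + 1, \\
\mu' &= \frac{\kappa_n \mu_n + \tilde{x}}{\kappa_n + 1}, \\
\Sigma' &= \Sigma_n + \tilde{x} \tilde{x}^T + \kappa_n \mu_n \mu_n^T - \kappa' \mu' \mu'^T = \Sigma_n + \frac{\kappa_n}{\kappa_n + 1} (\tilde{x} - \mu_n)(\tilde{x} - \mu_n)^T.
\end{aligned}
\tag{27}
\]

Now, by substituting (14) into (21),

\[
\begin{aligned}
\frac{ \exp h(m_n + \mathbf{1}, \phi_n + \phi(\tilde{x})) }{ \exp h(m_n, \phi_n) }
&= \frac{ (2\pi)^{-\frac{d(d+1)}{2}}\, \pi^{-\frac{(\nu_n + 1) d}{2}}\, (\kappa_n + 1)^{-\frac{d}{2}}\, |\Sigma'|^{-\frac{\nu_n + 1}{2}}\, \Gamma_d\!\left( \frac{\nu_n + 1}{2} \right) }{ (2\pi)^{-\frac{d(d+1)}{2}}\, \pi^{-\frac{\nu_n d}{2}}\, \kappa_n^{-\frac{d}{2}}\, |\Sigma_n|^{-\frac{\nu_n}{2}}\, \Gamma_d\!\left( \frac{\nu_n}{2} \right) } \\
&= \frac{ \Gamma\!\left( \frac{\nu_n + 1}{2} \right) }{ \Gamma\!\left( \frac{\nu_n - d + 1}{2} \right) \pi^{\frac{d}{2}} } \left( \frac{\kappa_n}{\kappa_n + 1} \right)^{\frac{d}{2}} \frac{ |\Sigma_n|^{\frac{\nu_n}{2}} }{ |\Sigma'|^{\frac{\nu_n + 1}{2}} }.
\end{aligned}
\tag{28}
\]

Next, rewrite $|\Sigma'|$ using the matrix determinant lemma:

\[
|\Sigma'| = \left| \Sigma_n + \frac{\kappa_n}{\kappa_n + 1} (\tilde{x} - \mu_n)(\tilde{x} - \mu_n)^T \right| = \left[ 1 + \frac{\kappa_n}{\kappa_n + 1} (\tilde{x} - \mu_n)^T \Sigma_n^{-1} (\tilde{x} - \mu_n) \right] |\Sigma_n|.
\tag{29}
\]

Putting it all together, we get

\[
p(\tilde{x} \mid X) = \frac{ \Gamma\!\left( \frac{\nu_n + 1}{2} \right) }{ \Gamma\!\left( \frac{\nu_n - d + 1}{2} \right) \pi^{d/2} \left| \frac{\kappa_n + 1}{\kappa_n} \Sigma_n \right|^{1/2} \left[ 1 + \frac{\kappa_n}{\kappa_n + 1} (\tilde{x} - \mu_n)^T \Sigma_n^{-1} (\tilde{x} - \mu_n) \right]^{\frac{\nu_n + 1}{2}} }.
\tag{30}
\]

Further, using the formula for the multivariate Student t distribution, one can show that

\[
p(\tilde{x} \mid X) = t_{\nu_n - d + 1}\!\left( \tilde{x} \,\middle|\, \mu_n,\; \frac{ (\kappa_n + 1)\, \Sigma_n }{ \kappa_n (\nu_n - d + 1) } \right).
\tag{31}
\]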
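To make the last two steps concrete, here is a minimal numerical sketch (assuming NumPy; all function names and test values are illustrative, not from the homework). It evaluates the log predictive density once as the difference of two values of $h$, using the Normal Inverse Wishart expression for $h$ and the rank-one update of $\Sigma_n$, and once from the multivariate Student t density with $\nu_n - d + 1$ degrees of freedom. The multivariate gamma function is computed from its standard product formula.

```python
import math

import numpy as np


def log_multigamma(a, d):
    """log Gamma_d(a) via the product formula of the multivariate gamma."""
    return d * (d - 1) / 4.0 * math.log(math.pi) + sum(
        math.lgamma(a + (1 - j) / 2.0) for j in range(1, d + 1))


def h(kappa, nu, Sigma):
    """Log-normalizer h expressed through (kappa, nu, Sigma)."""
    d = Sigma.shape[0]
    _, logdet = np.linalg.slogdet(Sigma)
    return (-d * (d + 1) / 2.0 * math.log(2 * math.pi)
            - nu * d / 2.0 * math.log(math.pi)
            - d / 2.0 * math.log(kappa)
            - nu / 2.0 * logdet
            + log_multigamma(nu / 2.0, d))


def log_predictive_via_h(x, mu_n, kappa_n, Sigma_n, nu_n):
    """log p(x | X) as h(updated hyperparameters) - h(current ones)."""
    diff = x - mu_n
    Sigma_new = Sigma_n + kappa_n / (kappa_n + 1.0) * np.outer(diff, diff)
    return h(kappa_n + 1.0, nu_n + 1.0, Sigma_new) - h(kappa_n, nu_n, Sigma_n)


def log_student_t(x, mu, S, df):
    """Log density of the multivariate Student t with scale matrix S."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(S, diff)
    _, logdet = np.linalg.slogdet(S)
    return (math.lgamma((df + d) / 2.0) - math.lgamma(df / 2.0)
            - d / 2.0 * math.log(df * math.pi) - 0.5 * logdet
            - (df + d) / 2.0 * math.log1p(quad / df))


# Illustrative posterior hyperparameters (d = 2).
mu_n = np.array([0.5, -1.0])
kappa_n, nu_n = 4.0, 6.0
Sigma_n = np.array([[3.0, 0.5], [0.5, 2.0]])
x_new = np.array([1.0, 0.0])

d = len(mu_n)
df = nu_n - d + 1.0
S = (kappa_n + 1.0) / (kappa_n * df) * Sigma_n

lp_h = log_predictive_via_h(x_new, mu_n, kappa_n, Sigma_n, nu_n)
lp_t = log_student_t(x_new, mu_n, S, df)
print(lp_h, lp_t)
```

Working in log space with `slogdet` and `lgamma` avoids the overflow that the raw gamma functions and determinants would cause for larger $\nu_n$ or $d$.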

Posterior Inference. , where should we start? Consider the following computational procedure: 1. draw samples. 2. convert. 3. compute properties

Posterior Inference. , where should we start? Consider the following computational procedure: 1. draw samples. 2. convert. 3. compute properties Posterior Inference Example. Consider a binomial model where we have a posterior distribution for the probability term, θ. Suppose we want to make inferences about the log-odds γ = log ( θ 1 θ), where

More information

Bayesian Linear Model: Gory Details

Bayesian Linear Model: Gory Details Bayesian Linear Model: Gory Details Pubh7440 Notes By Sudipto Banerjee Let y y i ] n i be an n vector of independent observations on a dependent variable (or response) from n experimental units. Associated

More information

Non-informative Priors Multiparameter Models

Non-informative Priors Multiparameter Models Non-informative Priors Multiparameter Models Statistics 220 Spring 2005 Copyright c 2005 by Mark E. Irwin Prior Types Informative vs Non-informative There has been a desire for a prior distributions that

More information

CS340 Machine learning Bayesian statistics 3

CS340 Machine learning Bayesian statistics 3 CS340 Machine learning Bayesian statistics 3 1 Outline Conjugate analysis of µ and σ 2 Bayesian model selection Summarizing the posterior 2 Unknown mean and precision The likelihood function is p(d µ,λ)

More information

Chapter 7: Estimation Sections

Chapter 7: Estimation Sections 1 / 40 Chapter 7: Estimation Sections 7.1 Statistical Inference Bayesian Methods: Chapter 7 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions 7.4 Bayes Estimators Frequentist Methods:

More information

Chapter 7: Estimation Sections

Chapter 7: Estimation Sections 1 / 31 : Estimation Sections 7.1 Statistical Inference Bayesian Methods: 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions 7.4 Bayes Estimators Frequentist Methods: 7.5 Maximum Likelihood

More information

ST440/550: Applied Bayesian Analysis. (5) Multi-parameter models - Summarizing the posterior

ST440/550: Applied Bayesian Analysis. (5) Multi-parameter models - Summarizing the posterior (5) Multi-parameter models - Summarizing the posterior Models with more than one parameter Thus far we have studied single-parameter models, but most analyses have several parameters For example, consider

More information

Likelihood Methods of Inference. Toss coin 6 times and get Heads twice.

Likelihood Methods of Inference. Toss coin 6 times and get Heads twice. Methods of Inference Toss coin 6 times and get Heads twice. p is probability of getting H. Probability of getting exactly 2 heads is 15p 2 (1 p) 4 This function of p, is likelihood function. Definition:

More information

Chapter 8: Sampling distributions of estimators Sections

Chapter 8: Sampling distributions of estimators Sections Chapter 8 continued Chapter 8: Sampling distributions of estimators Sections 8.1 Sampling distribution of a statistic 8.2 The Chi-square distributions 8.3 Joint Distribution of the sample mean and sample

More information

Chapter 4: Asymptotic Properties of MLE (Part 3)

Chapter 4: Asymptotic Properties of MLE (Part 3) Chapter 4: Asymptotic Properties of MLE (Part 3) Daniel O. Scharfstein 09/30/13 1 / 1 Breakdown of Assumptions Non-Existence of the MLE Multiple Solutions to Maximization Problem Multiple Solutions to

More information

Conjugate priors: Beta and normal Class 15, Jeremy Orloff and Jonathan Bloom

Conjugate priors: Beta and normal Class 15, Jeremy Orloff and Jonathan Bloom 1 Learning Goals Conjugate s: Beta and normal Class 15, 18.05 Jeremy Orloff and Jonathan Bloom 1. Understand the benefits of conjugate s.. Be able to update a beta given a Bernoulli, binomial, or geometric

More information

12 The Bootstrap and why it works

12 The Bootstrap and why it works 12 he Bootstrap and why it works For a review of many applications of bootstrap see Efron and ibshirani (1994). For the theory behind the bootstrap see the books by Hall (1992), van der Waart (2000), Lahiri

More information

Lecture 17. The model is parametrized by the time period, δt, and three fixed constant parameters, v, σ and the riskless rate r.

Lecture 17. The model is parametrized by the time period, δt, and three fixed constant parameters, v, σ and the riskless rate r. Lecture 7 Overture to continuous models Before rigorously deriving the acclaimed Black-Scholes pricing formula for the value of a European option, we developed a substantial body of material, in continuous

More information

Bayesian Hierarchical/ Multilevel and Latent-Variable (Random-Effects) Modeling

Bayesian Hierarchical/ Multilevel and Latent-Variable (Random-Effects) Modeling Bayesian Hierarchical/ Multilevel and Latent-Variable (Random-Effects) Modeling 1: Formulation of Bayesian models and fitting them with MCMC in WinBUGS David Draper Department of Applied Mathematics and

More information

STAT 425: Introduction to Bayesian Analysis

STAT 425: Introduction to Bayesian Analysis STAT 45: Introduction to Bayesian Analysis Marina Vannucci Rice University, USA Fall 018 Marina Vannucci (Rice University, USA) Bayesian Analysis (Part 1) Fall 018 1 / 37 Lectures 9-11: Multi-parameter

More information

Probability. An intro for calculus students P= Figure 1: A normal integral

Probability. An intro for calculus students P= Figure 1: A normal integral Probability An intro for calculus students.8.6.4.2 P=.87 2 3 4 Figure : A normal integral Suppose we flip a coin 2 times; what is the probability that we get more than 2 heads? Suppose we roll a six-sided

More information

(5) Multi-parameter models - Summarizing the posterior

(5) Multi-parameter models - Summarizing the posterior (5) Multi-parameter models - Summarizing the posterior Spring, 2017 Models with more than one parameter Thus far we have studied single-parameter models, but most analyses have several parameters For example,

More information

Extend the ideas of Kan and Zhou paper on Optimal Portfolio Construction under parameter uncertainty

Extend the ideas of Kan and Zhou paper on Optimal Portfolio Construction under parameter uncertainty Extend the ideas of Kan and Zhou paper on Optimal Portfolio Construction under parameter uncertainty George Photiou Lincoln College University of Oxford A dissertation submitted in partial fulfilment for

More information

Lecture Notes 6. Assume F belongs to a family of distributions, (e.g. F is Normal), indexed by some parameter θ.

Lecture Notes 6. Assume F belongs to a family of distributions, (e.g. F is Normal), indexed by some parameter θ. Sufficient Statistics Lecture Notes 6 Sufficiency Data reduction in terms of a particular statistic can be thought of as a partition of the sample space X. Definition T is sufficient for θ if the conditional

More information

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS Answer any FOUR of the SIX questions.

More information

Problem Set 4 Answers

Problem Set 4 Answers Business 3594 John H. Cochrane Problem Set 4 Answers ) a) In the end, we re looking for ( ) ( ) + This suggests writing the portfolio as an investment in the riskless asset, then investing in the risky

More information

Bayesian Normal Stuff

Bayesian Normal Stuff Bayesian Normal Stuff - Set-up of the basic model of a normally distributed random variable with unknown mean and variance (a two-parameter model). - Discuss philosophies of prior selection - Implementation

More information

The rth moment of a real-valued random variable X with density f(x) is. x r f(x) dx

The rth moment of a real-valued random variable X with density f(x) is. x r f(x) dx 1 Cumulants 1.1 Definition The rth moment of a real-valued random variable X with density f(x) is µ r = E(X r ) = x r f(x) dx for integer r = 0, 1,.... The value is assumed to be finite. Provided that

More information

Module 2: Monte Carlo Methods

Module 2: Monte Carlo Methods Module 2: Monte Carlo Methods Prof. Mike Giles mike.giles@maths.ox.ac.uk Oxford University Mathematical Institute MC Lecture 2 p. 1 Greeks In Monte Carlo applications we don t just want to know the expected

More information

MTH6154 Financial Mathematics I Stochastic Interest Rates

MTH6154 Financial Mathematics I Stochastic Interest Rates MTH6154 Financial Mathematics I Stochastic Interest Rates Contents 4 Stochastic Interest Rates 45 4.1 Fixed Interest Rate Model............................ 45 4.2 Varying Interest Rate Model...........................

More information

MA 1125 Lecture 12 - Mean and Standard Deviation for the Binomial Distribution. Objectives: Mean and standard deviation for the binomial distribution.

MA 1125 Lecture 12 - Mean and Standard Deviation for the Binomial Distribution. Objectives: Mean and standard deviation for the binomial distribution. MA 5 Lecture - Mean and Standard Deviation for the Binomial Distribution Friday, September 9, 07 Objectives: Mean and standard deviation for the binomial distribution.. Mean and Standard Deviation of the

More information

Conjugate Models. Patrick Lam

Conjugate Models. Patrick Lam Conjugate Models Patrick Lam Outline Conjugate Models What is Conjugacy? The Beta-Binomial Model The Normal Model Normal Model with Unknown Mean, Known Variance Normal Model with Known Mean, Unknown Variance

More information

Applications of Exponential Functions Group Activity 7 Business Project Week #10

Applications of Exponential Functions Group Activity 7 Business Project Week #10 Applications of Exponential Functions Group Activity 7 Business Project Week #10 In the last activity we looked at exponential functions. This week we will look at exponential functions as related to interest

More information

Chapter 7: Estimation Sections

Chapter 7: Estimation Sections Chapter 7: Estimation Sections 7.1 Statistical Inference Bayesian Methods: 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions Frequentist Methods: 7.5 Maximum Likelihood Estimators

More information

CSC 411: Lecture 08: Generative Models for Classification

CSC 411: Lecture 08: Generative Models for Classification CSC 411: Lecture 08: Generative Models for Classification Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 08-Generative Models 1 / 23 Today Classification

More information

CS340 Machine learning Bayesian model selection

CS340 Machine learning Bayesian model selection CS340 Machine learning Bayesian model selection Bayesian model selection Suppose we have several models, each with potentially different numbers of parameters. Example: M0 = constant, M1 = straight line,

More information

Back to estimators...

Back to estimators... Back to estimators... So far, we have: Identified estimators for common parameters Discussed the sampling distributions of estimators Introduced ways to judge the goodness of an estimator (bias, MSE, etc.)

More information

IEOR 3106: Introduction to OR: Stochastic Models. Fall 2013, Professor Whitt. Class Lecture Notes: Tuesday, September 10.

IEOR 3106: Introduction to OR: Stochastic Models. Fall 2013, Professor Whitt. Class Lecture Notes: Tuesday, September 10. IEOR 3106: Introduction to OR: Stochastic Models Fall 2013, Professor Whitt Class Lecture Notes: Tuesday, September 10. The Central Limit Theorem and Stock Prices 1. The Central Limit Theorem (CLT See

More information

Martingales. by D. Cox December 2, 2009

Martingales. by D. Cox December 2, 2009 Martingales by D. Cox December 2, 2009 1 Stochastic Processes. Definition 1.1 Let T be an arbitrary index set. A stochastic process indexed by T is a family of random variables (X t : t T) defined on a

More information

Definition 9.1 A point estimate is any function T (X 1,..., X n ) of a random sample. We often write an estimator of the parameter θ as ˆθ.

Definition 9.1 A point estimate is any function T (X 1,..., X n ) of a random sample. We often write an estimator of the parameter θ as ˆθ. 9 Point estimation 9.1 Rationale behind point estimation When sampling from a population described by a pdf f(x θ) or probability function P [X = x θ] knowledge of θ gives knowledge of the entire population.

More information

Extracting Information from the Markets: A Bayesian Approach

Extracting Information from the Markets: A Bayesian Approach Extracting Information from the Markets: A Bayesian Approach Daniel Waggoner The Federal Reserve Bank of Atlanta Florida State University, February 29, 2008 Disclaimer: The views expressed are the author

More information

The Normal Distribution

The Normal Distribution Will Monroe CS 09 The Normal Distribution Lecture Notes # July 9, 207 Based on a chapter by Chris Piech The single most important random variable type is the normal a.k.a. Gaussian) random variable, parametrized

More information

ROM SIMULATION Exact Moment Simulation using Random Orthogonal Matrices

ROM SIMULATION Exact Moment Simulation using Random Orthogonal Matrices ROM SIMULATION Exact Moment Simulation using Random Orthogonal Matrices Bachelier Finance Society Meeting Toronto 2010 Henley Business School at Reading Contact Author : d.ledermann@icmacentre.ac.uk Alexander

More information

Practice Exercises for Midterm Exam ST Statistical Theory - II The ACTUAL exam will consists of less number of problems.

Practice Exercises for Midterm Exam ST Statistical Theory - II The ACTUAL exam will consists of less number of problems. Practice Exercises for Midterm Exam ST 522 - Statistical Theory - II The ACTUAL exam will consists of less number of problems. 1. Suppose X i F ( ) for i = 1,..., n, where F ( ) is a strictly increasing

More information

arxiv: v1 [math.st] 18 Sep 2018

arxiv: v1 [math.st] 18 Sep 2018 Gram Charlier and Edgeworth expansion for sample variance arxiv:809.06668v [math.st] 8 Sep 08 Eric Benhamou,* A.I. SQUARE CONNECT, 35 Boulevard d Inkermann 900 Neuilly sur Seine, France and LAMSADE, Universit

More information

Machine Learning for Quantitative Finance

Machine Learning for Quantitative Finance Machine Learning for Quantitative Finance Fast derivative pricing Sofie Reyners Joint work with Jan De Spiegeleer, Dilip Madan and Wim Schoutens Derivative pricing is time-consuming... Vanilla option pricing

More information

Characterization of the Optimum

Characterization of the Optimum ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 5. Portfolio Allocation with One Riskless, One Risky Asset Characterization of the Optimum Consider a risk-averse, expected-utility-maximizing

More information

STOR Lecture 15. Jointly distributed Random Variables - III

STOR Lecture 15. Jointly distributed Random Variables - III STOR 435.001 Lecture 15 Jointly distributed Random Variables - III Jan Hannig UNC Chapel Hill 1 / 17 Before we dive in Contents of this lecture 1. Conditional pmf/pdf: definition and simple properties.

More information

Lecture 10: Point Estimation

Lecture 10: Point Estimation Lecture 10: Point Estimation MSU-STT-351-Sum-17B (P. Vellaisamy: MSU-STT-351-Sum-17B) Probability & Statistics for Engineers 1 / 31 Basic Concepts of Point Estimation A point estimate of a parameter θ,

More information

EE641 Digital Image Processing II: Purdue University VISE - October 29,

EE641 Digital Image Processing II: Purdue University VISE - October 29, EE64 Digital Image Processing II: Purdue University VISE - October 9, 004 The EM Algorithm. Suffient Statistics and Exponential Distributions Let p(y θ) be a family of density functions parameterized by

More information

5.3 Interval Estimation

5.3 Interval Estimation 5.3 Interval Estimation Ulrich Hoensch Wednesday, March 13, 2013 Confidence Intervals Definition Let θ be an (unknown) population parameter. A confidence interval with confidence level C is an interval

More information

Probability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016

Probability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016 Probability Theory Probability and Statistics for Data Science CSE594 - Spring 2016 What is Probability? 2 What is Probability? Examples outcome of flipping a coin (seminal example) amount of snowfall

More information

STAT 830 Convergence in Distribution

STAT 830 Convergence in Distribution STAT 830 Convergence in Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2013 Richard Lockhart (Simon Fraser University) STAT 830 Convergence in Distribution STAT 830 Fall 2013 1 / 31

More information

2.1 Mean-variance Analysis: Single-period Model

2.1 Mean-variance Analysis: Single-period Model Chapter Portfolio Selection The theory of option pricing is a theory of deterministic returns: we hedge our option with the underlying to eliminate risk, and our resulting risk-free portfolio then earns

More information

UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions.

UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions. UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions. Random Variables 2 A random variable X is a numerical (integer, real, complex, vector etc.) summary of the outcome of the random experiment.

More information

Generating Random Numbers

Generating Random Numbers Generating Random Numbers Aim: produce random variables for given distribution Inverse Method Let F be the distribution function of an univariate distribution and let F 1 (y) = inf{x F (x) y} (generalized

More information

Version A. Problem 1. Let X be the continuous random variable defined by the following pdf: 1 x/2 when 0 x 2, f(x) = 0 otherwise.

Version A. Problem 1. Let X be the continuous random variable defined by the following pdf: 1 x/2 when 0 x 2, f(x) = 0 otherwise. Math 224 Q Exam 3A Fall 217 Tues Dec 12 Version A Problem 1. Let X be the continuous random variable defined by the following pdf: { 1 x/2 when x 2, f(x) otherwise. (a) Compute the mean µ E[X]. E[X] x

More information

continuous rv Note for a legitimate pdf, we have f (x) 0 and f (x)dx = 1. For a continuous rv, P(X = c) = c f (x)dx = 0, hence

continuous rv Note for a legitimate pdf, we have f (x) 0 and f (x)dx = 1. For a continuous rv, P(X = c) = c f (x)dx = 0, hence continuous rv Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a b, P(a X b) = b a f (x)dx.

More information

Bivariate Birnbaum-Saunders Distribution

Bivariate Birnbaum-Saunders Distribution Department of Mathematics & Statistics Indian Institute of Technology Kanpur January 2nd. 2013 Outline 1 Collaborators 2 3 Birnbaum-Saunders Distribution: Introduction & Properties 4 5 Outline 1 Collaborators

More information

درس هفتم یادگیري ماشین. (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی

درس هفتم یادگیري ماشین. (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی یادگیري ماشین توزیع هاي نمونه و تخمین نقطه اي پارامترها Sampling Distributions and Point Estimation of Parameter (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی درس هفتم 1 Outline Introduction

More information

CSE 312 Winter Learning From Data: Maximum Likelihood Estimators (MLE)

CSE 312 Winter Learning From Data: Maximum Likelihood Estimators (MLE) CSE 312 Winter 2017 Learning From Data: Maximum Likelihood Estimators (MLE) 1 Parameter Estimation Given: independent samples x1, x2,..., xn from a parametric distribution f(x θ) Goal: estimate θ. Not

More information

Course information FN3142 Quantitative finance

Course information FN3142 Quantitative finance Course information 015 16 FN314 Quantitative finance This course is aimed at students interested in obtaining a thorough grounding in market finance and related empirical methods. Prerequisite If taken

Chapter 3 Common Families of Distributions. Definition 3.4.1: A family of pmfs or pdfs is called exponential family if it can be expressed as Lecture 0 on BST 63: Statistical Theory I Kui Zhang, 09/9/008 Review for the previous lecture Definition: Several continuous distributions, including uniform, gamma, normal, Beta, Cauchy, double exponential

Chapter 6. The Normal Probability Distributions. Chapter 6 Overview: Introduction; 6-1 Normal Probability Distributions; 6-2 The Standard Normal Distribution; 6-3 Applications of the Normal Distribution.

SEQUENTIAL DECISION PROBLEM WITH PARTIAL MAINTENANCE ON A PARTIALLY OBSERVABLE MARKOV PROCESS. Toru Nakai. Scientiae Mathematicae Japonicae Online, e-21, 283-292. Received February 22, 2010.

GOV 2001/ 1002/ E-200 Section 3: Inference and Likelihood. Anton Strezhnev, Harvard University, February 10, 2016. Logistics. Reading Assignment: Unifying Political Methodology ch 4 and Eschewing Obfuscation

Bayesian Models II. Outline: Review; Continuation of exercises from last time. Review of terms from last time: Probability density function (aka pdf or density); Likelihood function (aka likelihood).

Math 224 Fall 207 Homework 5, Drew Armstrong. Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman: Section 3., Exercises 3, 0. Section 3.3, Exercises 2, 3, 0,.

EVA Tutorial #1: BLOCK MAXIMA APPROACH IN HYDROLOGIC/CLIMATE APPLICATIONS. Rick Katz, Institute for Mathematics Applied to Geosciences, National Center for Atmospheric Research, Boulder, CO, USA. Email: rwk@ucar.edu

The Analytics of Information and Uncertainty: Answers to Exercises and Excursions. Chapter 6: Information and Markets. 6.1 The inter-related equilibria of prior and posterior markets. Solution 6.1.1. The condition

UNIFORM BOUNDS FOR BLACK-SCHOLES IMPLIED VOLATILITY. Michael R. Tehranchi, University of Cambridge. Abstract. The Black-Scholes implied total variance function is defined by V BS (k, c) = v Φ ( k/ v + v/2

ME3620 Theory of Engineering Experimentation. Spring. Chapter III: Random Variables and Probability Distributions. 3.2 Random Variables. In an experiment, a measurement is usually denoted by a variable

SYSM 6304: Risk and Decision Analysis. Lecture 6: Pricing and Hedging Financial Derivatives. M. Vidyasagar, Cecil & Ida Green Chair, The University of Texas at Dallas. Email: M.Vidyasagar@utdallas.edu

Estimating the Parameters of Closed Skew-Normal Distribution Under LINEX Loss Function. N. Abbasi, N. Saffari, M. Salehi. Australian Journal of Basic Applied Sciences, 5(7): 92-98, 2011. ISSN 1991-8178.

Logarithmic derivatives of densities for jump processes Logarithmic derivatives of densities for jump processes Atsushi AKEUCHI Osaka City University (JAPAN) June 3, 29 City University of Hong Kong Workshop on Stochastic Analysis and Finance (June 29 - July

1 Bayesian Bias Correction Model. Assume that $n$ iid samples $\{X_1, \ldots, X_n\}$ were collected from a normal population with mean $\mu$ and variance $\sigma^2$. The model likelihood has the form, P( X µ, σ 2, T n >

Week 2, Quantitative Analysis of Financial Markets: Hypothesis Testing and Confidence Intervals. Christopher Ting, http://www.mysmu.edu/faculty/christophert/, christopherting@smu.edu.sg

Lecture IV. Portfolio management: Efficient portfolios. Introduction to Finance Mathematics, Fall 2014. Financial mathematics. Reduce the risk, one asset. Let us warm up by doing an exercise. We consider an investment with σ 1 =

Random Variables Handout. Xavier Vilà. Course 2004-2005. 1 Discrete Random Variables. 1.1 Introduction. 1.1.1 Definition of Random Variable. A random variable X is a function that maps each possible outcome

6. Continuous Distributions. Chris Piech and Mehran Sahami, May 17. So far, all random variables we have seen have been discrete. In all the cases we have seen in CS19 this meant that our RVs could only take

1 Residual life for gamma and Weibull distributions. Supplement to Tail Estimation for Window Censored Processes. Gamma distribution. Let $\Gamma(k, x) = \int_x^\infty y^{k-1} e^{-y}\,dy$ be the upper incomplete gamma function.

Stat 6863-Handout 1: Economics of Insurance and Risk. June 2008, Maurice A. Geraghty. A. The Psychology of Risk Aversion. Suppose a decision maker has an asset worth $100,000 that has a 1% chance of being

12. Conditional heteroscedastic models (ARCH). MA6622, Ernesto Mordecki, CityU, HK, 2006. References for this Lecture: Robert F. Engle. Autoregressive Conditional Heteroscedasticity with Estimates of Variance

What was in the last lecture? Normal distribution: a continuous rv with bell-shaped density curve. The pdf is given by $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$. If $X \sim N(\mu, \sigma^2)$, then $E(X) = \mu$ and $V(X) = \sigma^2$.

IEOR E4703: Monte-Carlo Simulation. Generating Random Variables and Stochastic Processes. Martin Haugh, Department of Industrial Engineering and Operations Research, Columbia University. Email: martin.b.haugh@gmail.com

Pre-Algebra, Unit 7: Percents Notes. Percents are special fractions whose denominators are 100. The number in front of the percent symbol (%) is the numerator. The denominator is not written, but understood

Practical Hedging: From Theory to Practice. OSU Financial Mathematics Seminar, May 5, 2008. Background: Dynamic replication is a risk management technique used to mitigate market risk. We hope to spend a certain

Chapter 6: Central limit theorems. 6.1 Overview. Recall that a random variable Z is said to have a standard normal distribution, denoted by N(0, 1), if it has a continuous distribution with density φ(z) =

Math-Stat-491-Fall2014-Notes-V. Hariharan Narayanan, December 7, 2014. Martingales. 1 Introduction. Martingales were originally introduced into probability theory as a model for fair betting games.

Objective Bayesian Analysis for Heteroscedastic Regression Analysis for Heteroscedastic Regression & Esther Salazar Universidade Federal do Rio de Janeiro Colóquio Inter-institucional: Modelos Estocásticos e Aplicações 2009 Collaborators: Marco Ferreira and Thais

Stochastic Volatility (SV) Models. Jun Yu. 1 Motivations. Some stylised facts about financial asset return distributions: 1. Distribution is leptokurtic. 2. Volatility clustering. 3. Volatility responds to

Homework Assignments. Week 1 (p. 57) #4.1, 4., 4.3 Week (pp 58 6) #4.5, 4.6, 4.8(a), 4.13, 4.0, 4.6(b), 4.8, 4.31, 4.34 Week 3 (pp 15 19) #1.9, 1.1, 1.13, 1.15, 1.18 (pp 9 31) #.,.6,.9 Week 4 (pp 36 37)

Chapter 5. Statistical inference for Parametric Models. Outline: Overview; Parameter estimation; Method of moments; How good are method of moments estimates?; Interval estimation.

Probabilistic Meshless Methods for Bayesian Inverse Problems. Jon Cockayne, July 8, 2016. Co-Authors: Chris Oates, Tim Sullivan, Mark Girolami. What is PN? Many problems in mathematics have no analytical

Greek Maxima, by Michael B. Miller. When managing the risk of options it is often useful to know how sensitivities will change over time and with the price of the underlying. For example, many people know

6 Point Estimation. Stat 4570/5570. Material from Devore's book (Ed 8), and Cengage. Statistical inference: directed toward conclusions about one or more parameters. We will use the generic

The Normal Distribution. The normal distribution plays a central role in probability theory and in statistics. It is often used as a model for the distribution of continuous random variables. Like all models,

Non replication of options. Christos Kountzakis, Ioannis A Polyrakis and Foivos Xanthos, June 30, 2008. Abstract: In this paper we study the scarcity of replication of options in the two period model of financial

Chapter 7: Option Pricing. 7.1 Discrete Time. In the next section we will discuss the Black-Scholes formula. To prepare for that, we will consider the much simpler problem of pricing options when there are

Decision theoretic estimation of the ratio of variances in a bivariate normal distribution. George Iliopoulos, Department of Mathematics, University of Patras, 26500 Rio, Patras, Greece. Abstract: In this

An Improved Skewness Measure. Richard A. Groeneveld, Professor Emeritus, Department of Statistics, Iowa State University, ragroeneveld@valley.net. Glen Meeden, School of Statistics, University of Minnesota, Minneapolis,

Metropolis-Hastings algorithm. Dr. Jarad Niemi, STAT 544, Iowa State University, March 27, 2018. Outline: Metropolis-Hastings algorithm; Independence

Financial Time Series and Their Characteristics. Mei-Yuan Chen, Department of Finance, National Chung Hsing University, Feb. 22, 2013. Contents: 1 Introduction; 1.1 Asset Returns

4: Single Cash Flows and Equivalence. Basic Concepts. This chapter explains basic concepts of project economics by examining single cash flows. This means that each

Modelling Environmental Extremes. 19th TIES Conference, Kelowna, British Columbia, 8th June 2008. Topics for the day: 1. Classical models and threshold models. 2. Dependence and non stationarity. 3. R session: weather extremes. 4. Multivariate
