START HERE: Instructions

Thanks a lot to John A.W.B. Constanzo and Shi Zong for providing and allowing us to use the LaTeX source files for quick preparation of the HW solution.

The homework was due at 9:00am on Feb 3, 05. Anything that is received after that time will not be considered. Answers to all theory questions are to be submitted electronically on Autolab (PDF: LaTeX, or handwritten and scanned). Make sure you prepare the answers to each question separately.

Collaboration on solving the homework is allowed, after you have thought about the problems on your own. However, when you do collaborate, you should list your collaborators! You might also have gotten some inspiration from resources (books, online, etc.). This might be OK only after you have tried to solve the problem yourself and couldn't. In such a case, you should cite your resources. If you do collaborate with someone or use a book or website, you are expected to write up your solution independently. That is, close the book and all of your notes before starting to write up your solution.

LaTeX source of this homework: hw5_latex.tar

1 Exponential Family [Zhou, Manzil]

In this problem we will review the exponential family and its significance in Bayesian statistics, and work out a detailed example for the commonly encountered Multivariate Normal distribution and its conjugate prior, the Normal Inverse Wishart distribution.

1.1 Review

The exponential family is a set of probability distributions whose probability density function for $x \in \mathbb{R}^d$ can be expressed in the form

    $p(x \mid \theta) = \exp\big( \langle \phi(x), \theta \rangle - \mathbf{1}^\top g(\theta) \big)$    (1)

where $\phi(x)$ is a sufficient statistic of the distribution ($g(\theta)$, defined below, may in general be vector-valued, in which case $\mathbf{1}^\top g(\theta)$ denotes the sum of its components). For exponential families, the sufficient statistic is a function of the data that fully summarizes the data $x$ within the density function.
The sufficient statistic of a set of independent, identically distributed data observations is simply the sum of the individual sufficient statistics, and it encapsulates all the information needed to describe the posterior distribution of the parameters given the data (and hence to derive any desired estimate of the parameters). We will explore this important property in detail below.

$\theta$ is called the natural parameter. The set of values of $\theta$ for which $p(x \mid \theta)$ is finite is called the natural parameter space. It can be shown that the natural parameter space is always convex. (One route is to first show that the log-partition function $g(\theta)$ is a convex function; from that, you can show the claim from first principles.)

$g(\theta)$ is called the log-partition function because it is the logarithm of a normalization factor, without which $p(x \mid \theta)$ would not be a probability distribution ("partition function" is often used as a synonym of "normalization factor" for historical reasons arising from statistical physics).
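To make the data-summarizing role of the sufficient statistic concrete, here is a small numerical sketch (my own illustration, not part of the homework): for a univariate Gaussian with known unit variance, $\phi(x) = x$, so the whole sample enters inference only through $n$ and $\sum_i x_i$.

```python
import numpy as np

# Sketch (an assumption of this illustration): Gaussian likelihood with
# known variance 1, where phi(x) = x is a sufficient statistic for the mean.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=1000)

n = len(x)
suff_stat = x.sum()          # sum of the individual sufficient statistics

# Any estimate of the mean can be computed from (n, suff_stat) alone,
# e.g. the maximum-likelihood estimate:
mle = suff_stat / n
assert np.isclose(mle, x.mean())
```

The point is that once `(n, suff_stat)` is recorded, the raw sample can be discarded without losing information about the parameter.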
1.2 Conjugate Priors [0+0+0]

Exponential families are very important in Bayesian statistics. In Bayesian statistics, a prior distribution is multiplied by a likelihood function and then normalised to produce a posterior distribution. When the likelihood belongs to the exponential family, there always exists a conjugate prior which is also in the exponential family. Consider the distribution

    $p(\theta; m_0, \phi_0) = \exp\big( \langle \phi_0, \theta \rangle - \langle m_0, g(\theta) \rangle - h(m_0, \phi_0) \big)$    (2)

where $m_0 > 0$ and $\phi_0 \in \mathbb{R}^d$. These are called hyperparameters (parameters controlling parameters).

Question 1: Show that this distribution, i.e. (2), is a member of the exponential family.

There is not much to show. Note that

    $p(\theta; m_0, \phi_0) = \exp\left( \left\langle \begin{bmatrix} \phi_0 \\ m_0 \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle - h(m_0, \phi_0) \right)$    (3)

The sufficient statistic is $\begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix}$. The natural parameter is $\begin{bmatrix} \phi_0 \\ m_0 \end{bmatrix}$. The log-partition function is $h(m_0, \phi_0)$. (There exist infinitely many ways of splitting $h$ into a vector-valued $\hat g$ with $h = \mathbf{1}^\top \hat g$, so the form matches (1) exactly.)
Suppose we obtain the data $X = \{x_1, \ldots, x_n\}$, where $x_i \sim p(\cdot \mid \theta)$, i.e. each single observation follows some distribution from the exponential family.

Question 2: First of all, write out the likelihood $p(X \mid \theta)$. Then use (2) as the prior and derive the posterior $p(\theta \mid X)$ exactly, i.e. with the proper normalization constant.

The likelihood turns out to be simply

    $p(X \mid \theta) = \prod_{i=1}^n p(x_i \mid \theta) = \prod_{i=1}^n \exp\big( \langle \phi(x_i), \theta \rangle - \mathbf{1}^\top g(\theta) \big) = \exp\left( \left\langle \sum_{i=1}^n \phi(x_i), \theta \right\rangle - n\,\mathbf{1}^\top g(\theta) \right)$    (4)

Now observe that $h$ is defined so that for all $x, y$ in the hyperparameter space,

    $\int \exp\left( \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle - h(y, x) \right) d\theta = 1.$    (5)

Keeping this in mind, we proceed to compute the posterior as

    $p(\theta \mid X) \propto p(X \mid \theta)\, p(\theta; m_0, \phi_0) \propto \exp\left( \left\langle \sum_{i=1}^n \phi(x_i), \theta \right\rangle - n\,\mathbf{1}^\top g(\theta) + \langle \phi_0, \theta \rangle - \langle m_0, g(\theta) \rangle \right) = \exp\left( \left\langle \begin{bmatrix} \phi_0 + \sum_{i=1}^n \phi(x_i) \\ m_0 + n \end{bmatrix}, \begin{bmatrix} \theta \\ -g(\theta) \end{bmatrix} \right\rangle \right).$    (6)

By (5), normalizing yields

    $p(\theta \mid X) = \exp\left( \left\langle \phi_0 + \sum_{i=1}^n \phi(x_i), \theta \right\rangle - \langle m_0 + n, g(\theta) \rangle - h\left(m_0 + n,\ \phi_0 + \sum_{i=1}^n \phi(x_i)\right) \right).$    (7)
If you got Question 2 correct (hopefully you did), observe that the posterior has the same form as the prior; thus (2) is a conjugate prior. The difference between the prior (2) and your answer to Question 2 lies only in the parameters.

Question 3: Let $m_n$ and $\phi_n$ be the parameters of the posterior $p(\theta \mid X)$. Show that:

    $m_n = m_0 + n, \qquad \phi_n = \phi_0 + \sum_{i=1}^n \phi(x_i)$    (8)

We call these the update equations.

This is obvious from (7). Specifically, comparing equation (7) with the prior distribution (2), we can see that they possess the same form and differ only in the parameters:

    $m_n = m_0 + n, \qquad \phi_n = \phi_0 + \sum_{i=1}^n \phi(x_i)$    (9)

This shows that the update equations can be written simply in terms of the number of data points and the sufficient statistic of the data. It also provides meaning to the hyperparameters. In particular, $m_0$ corresponds to the effective number of "fake" observations that the prior distribution contributes, and $\phi_0$ corresponds to the total amount that these fake observations contribute to the sufficient statistic (over all observations and fake observations).
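The update equations above can be sketched directly in code. This is my own minimal illustration (function name and the Bernoulli choice are assumptions, not from the homework); with a Bernoulli likelihood, $\phi(x) = x$:

```python
import numpy as np

# Generic conjugate update in natural-parameter space (Question 3):
#   m_n   = m_0 + n
#   phi_n = phi_0 + sum_i phi(x_i)
def posterior_hyperparams(m0, phi0, data, phi=lambda x: x):
    n = len(data)
    return m0 + n, phi0 + sum(phi(x) for x in data)

data = [1, 0, 1, 1, 0, 1]
m_n, phi_n = posterior_hyperparams(m0=2.0, phi0=1.0, data=data)
# m_0 acts like a count of "fake" prior observations; phi_0 is the
# sufficient statistic those fake observations contribute.
print(m_n, phi_n)   # 8.0 5.0
```

Note that the data set itself is never stored: a running `(m, phi)` pair is all the bookkeeping a sequential Bayesian update needs.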
1.3 Multivariate Normal Distribution [0+0+0]

The Multivariate Normal $\mathcal{N}(\mu, \Sigma)$ is a distribution that is encountered very often. The distribution is given by

    $p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu) \right)$    (10)

where $\mu \in \mathbb{R}^d$ and $\Sigma \succ 0$ is a symmetric positive definite $d \times d$ matrix. We claim that it belongs to the exponential family.

Question 4: Identify the natural parameters $\theta$ in terms of $\mu$ and $\Sigma$. Also derive the sufficient statistics $\phi(x)$ and the log-partition function $g(\theta)$ in terms of $\mu$ and $\Sigma$. Hint: design a two-dimensional $g(\theta)$, whose first dimension is $\frac{1}{2}\mu^\top \Sigma^{-1} \mu$.

We will use the notation $\langle A, B \rangle = \langle \mathrm{vec}(A), \mathrm{vec}(B) \rangle = \mathrm{tr}(AB)$ when $A$ and $B$ are (symmetric) matrices. Note that

    $x^\top \Sigma^{-1} x = \mathrm{tr}(x^\top \Sigma^{-1} x) = \mathrm{tr}(x x^\top \Sigma^{-1}) = \langle x x^\top, \Sigma^{-1} \rangle.$    (11)

Now

    $p(x \mid \mu, \Sigma) = \exp\left( -\frac{1}{2} x^\top \Sigma^{-1} x + x^\top \Sigma^{-1} \mu - \frac{1}{2}\mu^\top \Sigma^{-1}\mu - \frac{1}{2}\log\big[(2\pi)^d |\Sigma|\big] \right)$
    $= \exp\left( \left\langle x x^\top, -\frac{1}{2}\Sigma^{-1} \right\rangle + \left\langle x, \Sigma^{-1}\mu \right\rangle - \frac{1}{2}\mu^\top\Sigma^{-1}\mu - \frac{1}{2}\log\big[(2\pi)^d |\Sigma|\big] \right)$
    $= \exp\left( \left\langle \begin{bmatrix} x \\ x x^\top \end{bmatrix}, \begin{bmatrix} \Sigma^{-1}\mu \\ -\frac{1}{2}\Sigma^{-1} \end{bmatrix} \right\rangle - \mathbf{1}^\top \begin{bmatrix} \frac{1}{2}\mu^\top\Sigma^{-1}\mu \\ \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|\Sigma| \end{bmatrix} \right).$    (12)

The natural parameters, sufficient statistics and log-partition function are obtained by inspection:

    $\theta = \begin{bmatrix} \Sigma^{-1}\mu \\ -\frac{1}{2}\Sigma^{-1} \end{bmatrix}, \qquad \phi(x) = \begin{bmatrix} x \\ x x^\top \end{bmatrix}, \qquad g(\theta) = \begin{bmatrix} \frac{1}{2}\mu^\top\Sigma^{-1}\mu \\ \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|\Sigma| \end{bmatrix}.$
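The exponential-family decomposition of the MVN density can be checked numerically. The following sketch (my own check, not part of the solution) evaluates $\log p(x)$ term by term in the form above and compares it with SciPy's direct evaluation:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Numerical check of the exponential-family form of the MVN:
#   log p(x) = <x, Sigma^{-1} mu> + <x x^T, -Sigma^{-1}/2>
#              - mu^T Sigma^{-1} mu / 2 - (d/2) log(2 pi) - log|Sigma| / 2
rng = np.random.default_rng(1)
d = 3
mu = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)     # symmetric positive definite
x = rng.normal(size=d)

P = np.linalg.inv(Sigma)            # precision matrix
log_p = (x @ P @ mu                                # <phi_1(x), theta_1>
         - 0.5 * np.trace(np.outer(x, x) @ P)      # <phi_2(x), theta_2>
         - 0.5 * mu @ P @ mu                       # g_1(theta)
         - 0.5 * d * np.log(2 * np.pi)             # g_2(theta), constant part
         - 0.5 * np.linalg.slogdet(Sigma)[1])      # g_2(theta), log|Sigma|

assert np.isclose(log_p, multivariate_normal(mu, Sigma).logpdf(x))
```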
The conjugate prior for the Multivariate Normal distribution can be parametrized as the Normal Inverse Wishart distribution $\mathcal{NIW}(\mu_0, \kappa_0, \Sigma_0, \nu_0)$. The distribution is given by

    $p(\mu, \Sigma; \mu_0, \kappa_0, \Sigma_0, \nu_0) = \mathcal{N}(\mu \mid \mu_0, \Sigma/\kappa_0)\, \mathcal{W}^{-1}(\Sigma \mid \Sigma_0, \nu_0) = \frac{\kappa_0^{d/2}\,|\Sigma_0|^{\nu_0/2}}{2^{(\nu_0+1)d/2}\,\pi^{d/2}\,\Gamma_d(\nu_0/2)}\ |\Sigma|^{-\frac{\nu_0+d+2}{2}} \exp\left( -\frac{\kappa_0}{2}(\mu-\mu_0)^\top \Sigma^{-1} (\mu-\mu_0) - \frac{1}{2}\mathrm{tr}(\Sigma_0 \Sigma^{-1}) \right)$    (13)

where $\kappa_0, \nu_0 > 0$, $\mu_0 \in \mathbb{R}^d$, $\Sigma_0 \succ 0$ is a symmetric positive definite $d \times d$ matrix, and $\Gamma_d$ is the multivariate gamma function.

Question 5: Notice that the Normal Inverse Wishart distribution fits into the form of (2). Find the mapping between $(\mu_0, \kappa_0, \Sigma_0, \nu_0)$ and $(m_0, \phi_0)$, and the function $h(m_0, \phi_0)$, in terms of $\mu_0, \kappa_0, \Sigma_0, \nu_0$. Hint: $m_0$ and $g(\theta)$ are two-dimensional.

A bit of algebra (expanding the exponent of (13) and regrouping into the inner-product form) shows that:

    $p(\mu, \Sigma) = \exp\left( -\frac{\kappa_0}{2}\mu^\top\Sigma^{-1}\mu + \kappa_0\mu_0^\top\Sigma^{-1}\mu - \frac{\kappa_0}{2}\mu_0^\top\Sigma^{-1}\mu_0 - \frac{1}{2}\mathrm{tr}(\Sigma_0\Sigma^{-1}) - \frac{\nu_0+d+2}{2}\log|\Sigma| + \frac{d}{2}\log\kappa_0 + \frac{\nu_0}{2}\log|\Sigma_0| - \frac{(\nu_0+1)d}{2}\log 2 - \frac{d}{2}\log\pi - \log\Gamma_d(\nu_0/2) \right)$
    $= \exp\left( \left\langle \begin{bmatrix} \kappa_0\mu_0 \\ \Sigma_0 + \kappa_0\mu_0\mu_0^\top \end{bmatrix}, \begin{bmatrix} \Sigma^{-1}\mu \\ -\frac{1}{2}\Sigma^{-1} \end{bmatrix} \right\rangle - \left\langle \begin{bmatrix} \kappa_0 \\ \nu_0+d+2 \end{bmatrix}, \begin{bmatrix} \frac{1}{2}\mu^\top\Sigma^{-1}\mu \\ \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|\Sigma| \end{bmatrix} \right\rangle - h(m_0, \phi_0) \right)$

Comparing terms with (2), we obtain

    $m_0 = \begin{bmatrix} \kappa_0 \\ \nu_0+d+2 \end{bmatrix}, \qquad \phi_0 = \begin{bmatrix} \kappa_0\mu_0 \\ \Sigma_0 + \kappa_0\mu_0\mu_0^\top \end{bmatrix},$
    $h(m_0, \phi_0) = -\frac{d(d+1)}{2}\log(2\pi) - \frac{\nu_0 d}{2}\log\pi - \frac{d}{2}\log\kappa_0 - \frac{\nu_0}{2}\log|\Sigma_0| + \log\Gamma_d(\nu_0/2).$    (14)
Equipped with these results, we move on to tackle the problem of finding the posterior for $(\mu, \Sigma)$. One could follow a brute-force approach, finding it by using (10) and (13) directly, but things can get really messy. We will adopt a more elegant and easy approach, exploiting the fact that these distributions belong to the exponential family.

Question 6: Using the update equations described in Question 3 and your answers to Questions 4 and 5, directly write down the posterior $p(\mu, \Sigma \mid X)$. Just providing the appropriate update equations suffices.

Applying the update equations (8) from Question 3, first consider $m_n$:

    $m_n = \begin{bmatrix} \kappa_n \\ \nu_n+d+2 \end{bmatrix} = \begin{bmatrix} \kappa_0 + n \\ \nu_0+d+2+n \end{bmatrix}$    (15)

Similarly, for $\phi_n$ we have

    $\phi_n = \begin{bmatrix} \kappa_n\mu_n \\ \Sigma_n + \kappa_n\mu_n\mu_n^\top \end{bmatrix} = \begin{bmatrix} \kappa_0\mu_0 + \sum_{i=1}^n x_i \\ \Sigma_0 + \kappa_0\mu_0\mu_0^\top + \sum_{i=1}^n x_i x_i^\top \end{bmatrix}$    (16)

Thus, we obtain the following update equations in terms of $(\mu_0, \kappa_0, \Sigma_0, \nu_0)$:

    $\kappa_n = \kappa_0 + n$
    $\mu_n = \frac{\kappa_0\mu_0 + n\bar{x}}{\kappa_0 + n}$
    $\nu_n = \nu_0 + n$
    $\Sigma_n = \Sigma_0 + \sum_{i=1}^n x_i x_i^\top + \kappa_0\mu_0\mu_0^\top - \kappa_n\mu_n\mu_n^\top$    (17)

where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$. This is one of the remarkable cases where working out the general case saves you effort compared to working out the special case! The algebra can become very complicated otherwise, e.g. see murphyk/papers/bayesgauss.pdf, where the complicated math is done explicitly. We hope that after solving this homework, you can take advantage of this neat short-cut :)
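The update equations (17) are short enough to implement directly. The following is a sketch (function name and API are mine, not from the homework):

```python
import numpy as np

def niw_update(mu0, kappa0, Sigma0, nu0, X):
    """Posterior NIW hyperparameters given an (n, d) data array X,
    following the update equations of Question 6."""
    n = X.shape[0]
    xbar = X.mean(axis=0)
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    Sigma_n = (Sigma0 + X.T @ X
               + kappa0 * np.outer(mu0, mu0)
               - kappa_n * np.outer(mu_n, mu_n))
    return mu_n, kappa_n, Sigma_n, nu_n

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
mu_n, kappa_n, Sigma_n, nu_n = niw_update(
    mu0=np.zeros(2), kappa0=1.0, Sigma0=np.eye(2), nu0=4.0, X=X)
assert kappa_n == 51.0 and nu_n == 54.0
assert np.allclose(Sigma_n, Sigma_n.T)   # posterior scale matrix stays symmetric
```

Note that only `n`, `xbar`, and `X.T @ X` (the sufficient statistics) are ever needed, so the update could just as well be run in a streaming fashion.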
If we want the same expression as Wikipedia for $\Sigma_n$, we need the following rearrangements:

    $\Sigma_n = \Sigma_0 + \sum_{i=1}^n x_i x_i^\top + \kappa_0\mu_0\mu_0^\top - \kappa_n\mu_n\mu_n^\top$
    $= \Sigma_0 + C + n\bar{x}\bar{x}^\top + \kappa_0\mu_0\mu_0^\top - \kappa_n\mu_n\mu_n^\top$    (18)

where $C = \sum_{i=1}^n (x_i-\bar{x})(x_i-\bar{x})^\top$, so that $\sum_i x_i x_i^\top = C + n\bar{x}\bar{x}^\top$. Now, further expanding $\mu_n$, we obtain:

    $\Sigma_n = \Sigma_0 + C + n\bar{x}\bar{x}^\top + \kappa_0\mu_0\mu_0^\top - \frac{(\kappa_0\mu_0 + n\bar{x})(\kappa_0\mu_0 + n\bar{x})^\top}{\kappa_0+n}$
    $= \Sigma_0 + C + \frac{1}{\kappa_0+n}\left( n\kappa_0\,\mu_0\mu_0^\top - n\kappa_0\,\mu_0\bar{x}^\top - n\kappa_0\,\bar{x}\mu_0^\top + n\kappa_0\,\bar{x}\bar{x}^\top \right)$
    $= \Sigma_0 + C + \frac{n\kappa_0}{\kappa_0+n}(\bar{x}-\mu_0)(\bar{x}-\mu_0)^\top$    (19)
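A quick numerical sanity check (my own, not from the homework) that the two expressions for $\Sigma_n$ agree:

```python
import numpy as np

# form1: Sigma_0 + sum x_i x_i^T + kappa_0 mu_0 mu_0^T - kappa_n mu_n mu_n^T
# form2: Sigma_0 + C + (n kappa_0 / (kappa_0 + n)) (xbar - mu_0)(xbar - mu_0)^T
# where C = sum (x_i - xbar)(x_i - xbar)^T is the centered scatter matrix.
rng = np.random.default_rng(2)
n, d = 20, 3
X = rng.normal(size=(n, d))
mu0 = rng.normal(size=d)
kappa0 = 2.5
Sigma0 = np.eye(d)

xbar = X.mean(axis=0)
kappa_n = kappa0 + n
mu_n = (kappa0 * mu0 + n * xbar) / kappa_n

form1 = (Sigma0 + X.T @ X + kappa0 * np.outer(mu0, mu0)
         - kappa_n * np.outer(mu_n, mu_n))

C = (X - xbar).T @ (X - xbar)
form2 = (Sigma0 + C
         + (n * kappa0 / (kappa0 + n)) * np.outer(xbar - mu0, xbar - mu0))

assert np.allclose(form1, form2)
```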
1.4 Posterior Predictive - Bonus [0+0]

Another quantity which is often of interest in Bayesian statistics is the posterior predictive. The posterior predictive distribution is the distribution of unobserved observations (predictions) conditional on the observed data. Specifically, it is computed by marginalising over the parameters, using the posterior distribution:

    $p(\tilde{x} \mid X) = \int p(\tilde{x} \mid \theta)\, p(\theta \mid X)\, d\theta$    (20)

The posterior predictive distribution for a distribution in the exponential family has a rather nice form.

Question 7: Show that the posterior predictive for the distribution (1) with prior (2) is given by:

    $p(\tilde{x} \mid X) = \exp\big\{ h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) - h(m_n, \phi_n) \big\}$    (21)

(Here $m_n + 1$ is shorthand for adding one to each component of $m_n$, i.e. $m_n + \mathbf{1}$.) We have

    $p(\tilde{x} \mid X) = \int \exp\big[ \langle \phi(\tilde{x}), \theta \rangle - \mathbf{1}^\top g(\theta) \big] \exp\big[ \langle \phi_n, \theta \rangle - \langle m_n, g(\theta) \rangle - h(m_n, \phi_n) \big]\, d\theta$
    $= \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle - h(m_n, \phi_n) \big]\, d\theta$
    $= \exp\big[ -h(m_n, \phi_n) \big] \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle \big]\, d\theta.$    (22)

Since

    $\exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle - h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) \big] = p\big(\theta;\ m_n + 1,\ \phi_n + \phi(\tilde{x})\big)$    (23)

is a probability distribution, we have

    $1 = \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle - h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) \big]\, d\theta = \exp\big[ -h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) \big] \int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle \big]\, d\theta,$    (24)

leading to

    $\int \exp\big[ \langle \phi(\tilde{x}) + \phi_n, \theta \rangle - \langle m_n + 1, g(\theta) \rangle \big]\, d\theta = \exp\big[ h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) \big].$    (25)

Combining this result with (22) yields the desired result:

    $p(\tilde{x} \mid X) = \exp\big\{ h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big) - h(m_n, \phi_n) \big\}$    (26)
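The identity (21) can be checked on a one-dimensional example. The setup below is my own (not in the homework): for a Bernoulli likelihood $p(x \mid \theta) = \exp(x\theta - g(\theta))$ with $g(\theta) = \log(1 + e^\theta)$, a change of variables $p = \sigma(\theta)$ gives the closed form $h(m, \phi) = \log B(\phi, m - \phi)$ for $0 < \phi < m$, where $B$ is the Beta function.

```python
import numpy as np
from scipy.special import betaln

# h(m, phi) for the Bernoulli member of the family (my own derivation;
# valid for 0 < phi < m):
def h(m, phi):
    return betaln(phi, m - phi)

# Predictive probabilities of x~ = 1 and x~ = 0 via equation (21):
m_n, phi_n = 10.0, 4.0
p1 = np.exp(h(m_n + 1, phi_n + 1) - h(m_n, phi_n))
p0 = np.exp(h(m_n + 1, phi_n + 0) - h(m_n, phi_n))

assert np.isclose(p1 + p0, 1.0)       # a valid distribution over {0, 1}
assert np.isclose(p1, phi_n / m_n)    # matches the Beta-Bernoulli predictive
```

The second assertion is the familiar rule that the predictive probability of success equals the (pseudo-)success count divided by the total (pseudo-)observation count.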
The result of the previous problem can be specialized to the Multivariate Normal case.

Question 8: Find the posterior predictive for the case of the Multivariate Normal distribution with the Normal Inverse Wishart prior, with parameters as described in 1.3, by using Question 7. Hint: the matrix determinant lemma might come in handy.

Adding a new point $\tilde{x}$ leads to the following update:

    $\tilde\kappa = \kappa_n + 1$
    $\tilde\nu = \nu_n + 1$
    $\tilde\mu = \frac{\kappa_n\mu_n + \tilde{x}}{\kappa_n + 1}$
    $\tilde\Sigma = \Sigma_n + \tilde{x}\tilde{x}^\top + \kappa_n\mu_n\mu_n^\top - \tilde\kappa\tilde\mu\tilde\mu^\top = \Sigma_n + \frac{\kappa_n}{\kappa_n+1}(\tilde{x}-\mu_n)(\tilde{x}-\mu_n)^\top$    (27)

Now, by substituting (14) into (21),

    $p(\tilde{x} \mid X) = \frac{\exp h\big(m_n + 1,\ \phi_n + \phi(\tilde{x})\big)}{\exp h(m_n, \phi_n)} = \frac{\Gamma_d\left(\frac{\nu_n+1}{2}\right)}{\Gamma_d\left(\frac{\nu_n}{2}\right)} \cdot \frac{1}{\pi^{d/2}} \left(\frac{\kappa_n}{\kappa_n+1}\right)^{d/2} \frac{|\Sigma_n|^{\nu_n/2}}{|\tilde\Sigma|^{(\nu_n+1)/2}} = \frac{\Gamma\left(\frac{\nu_n+1}{2}\right)}{\Gamma\left(\frac{\nu_n-d+1}{2}\right)} \cdot \frac{1}{\pi^{d/2}} \left(\frac{\kappa_n}{\kappa_n+1}\right)^{d/2} \frac{|\Sigma_n|^{\nu_n/2}}{|\tilde\Sigma|^{(\nu_n+1)/2}}$    (28)

where the last step uses the telescoping ratio of multivariate gamma functions. Next, rewrite $|\tilde\Sigma|$ using the matrix determinant lemma:

    $|\tilde\Sigma| = \left| \Sigma_n + \frac{\kappa_n}{\kappa_n+1}(\tilde{x}-\mu_n)(\tilde{x}-\mu_n)^\top \right| = \left[ 1 + \frac{\kappa_n}{\kappa_n+1}(\tilde{x}-\mu_n)^\top \Sigma_n^{-1} (\tilde{x}-\mu_n) \right] |\Sigma_n|$    (29)

Putting it all together, we get

    $p(\tilde{x} \mid X) = \frac{\Gamma\left(\frac{\nu_n+1}{2}\right)}{\Gamma\left(\frac{\nu_n-d+1}{2}\right)\, \pi^{d/2}\, \left(\frac{\kappa_n+1}{\kappa_n}\right)^{d/2} |\Sigma_n|^{1/2}} \left[ 1 + \frac{\kappa_n}{\kappa_n+1}(\tilde{x}-\mu_n)^\top \Sigma_n^{-1} (\tilde{x}-\mu_n) \right]^{-\frac{\nu_n+1}{2}}$    (30)

Further, using the Student-t distribution formula, one can show that

    $p(\tilde{x} \mid X) = t_{\nu_n-d+1}\left( \tilde{x} \,\middle|\, \mu_n,\ \frac{(\kappa_n+1)\,\Sigma_n}{\kappa_n(\nu_n-d+1)} \right)$    (31)
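The matrix determinant lemma used in (29) is easy to verify numerically; this sketch is my own check, not part of the solution:

```python
import numpy as np

# Matrix determinant lemma for a rank-one update:
#   |A + u v^T| = |A| (1 + v^T A^{-1} u)
rng = np.random.default_rng(3)
d = 4
B = rng.normal(size=(d, d))
A = B @ B.T + d * np.eye(d)          # invertible (positive definite)
u = rng.normal(size=d)
v = rng.normal(size=d)

lhs = np.linalg.det(A + np.outer(u, v))
rhs = np.linalg.det(A) * (1.0 + v @ np.linalg.inv(A) @ u)
assert np.isclose(lhs, rhs)
```

In the derivation above, taking $A = \Sigma_n$ and $u = v = \sqrt{\kappa_n/(\kappa_n+1)}\,(\tilde{x}-\mu_n)$ turns $|\tilde\Sigma|$ into $|\Sigma_n|$ times the scalar factor appearing in the Student-t density, which is exactly what lets the predictive normalize in closed form.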
12. Conditional heteroscedastic models (ARCH) MA6622, Ernesto Mordecki, CityU, HK, 2006. References for this Lecture: Robert F. Engle. Autoregressive Conditional Heteroscedasticity with Estimates of Variance
More informationWhat was in the last lecture?
What was in the last lecture? Normal distribution A continuous rv with bell-shaped density curve The pdf is given by f(x) = 1 2πσ e (x µ)2 2σ 2, < x < If X N(µ, σ 2 ), E(X) = µ and V (X) = σ 2 Standard
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Generating Random Variables and Stochastic Processes Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationPre-Algebra, Unit 7: Percents Notes
Pre-Algebra, Unit 7: Percents Notes Percents are special fractions whose denominators are 100. The number in front of the percent symbol (%) is the numerator. The denominator is not written, but understood
More informationPractical Hedging: From Theory to Practice. OSU Financial Mathematics Seminar May 5, 2008
Practical Hedging: From Theory to Practice OSU Financial Mathematics Seminar May 5, 008 Background Dynamic replication is a risk management technique used to mitigate market risk We hope to spend a certain
More informationCentral limit theorems
Chapter 6 Central limit theorems 6.1 Overview Recall that a random variable Z is said to have a standard normal distribution, denoted by N(0, 1), if it has a continuous distribution with density φ(z) =
More informationMath-Stat-491-Fall2014-Notes-V
Math-Stat-491-Fall2014-Notes-V Hariharan Narayanan December 7, 2014 Martingales 1 Introduction Martingales were originally introduced into probability theory as a model for fair betting games. Essentially
More informationObjective Bayesian Analysis for Heteroscedastic Regression
Analysis for Heteroscedastic Regression & Esther Salazar Universidade Federal do Rio de Janeiro Colóquio Inter-institucional: Modelos Estocásticos e Aplicações 2009 Collaborators: Marco Ferreira and Thais
More informationStochastic Volatility (SV) Models
1 Motivations Stochastic Volatility (SV) Models Jun Yu Some stylised facts about financial asset return distributions: 1. Distribution is leptokurtic 2. Volatility clustering 3. Volatility responds to
More informationHomework Assignments
Homework Assignments Week 1 (p. 57) #4.1, 4., 4.3 Week (pp 58 6) #4.5, 4.6, 4.8(a), 4.13, 4.0, 4.6(b), 4.8, 4.31, 4.34 Week 3 (pp 15 19) #1.9, 1.1, 1.13, 1.15, 1.18 (pp 9 31) #.,.6,.9 Week 4 (pp 36 37)
More informationChapter 5. Statistical inference for Parametric Models
Chapter 5. Statistical inference for Parametric Models Outline Overview Parameter estimation Method of moments How good are method of moments estimates? Interval estimation Statistical Inference for Parametric
More informationProbabilistic Meshless Methods for Bayesian Inverse Problems. Jon Cockayne July 8, 2016
Probabilistic Meshless Methods for Bayesian Inverse Problems Jon Cockayne July 8, 2016 1 Co-Authors Chris Oates Tim Sullivan Mark Girolami 2 What is PN? Many problems in mathematics have no analytical
More informationGreek Maxima 1 by Michael B. Miller
Greek Maxima by Michael B. Miller When managing the risk of options it is often useful to know how sensitivities will change over time and with the price of the underlying. For example, many people know
More informationPoint Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage
6 Point Estimation Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Point Estimation Statistical inference: directed toward conclusions about one or more parameters. We will use the generic
More informationThe Normal Distribution
The Normal Distribution The normal distribution plays a central role in probability theory and in statistics. It is often used as a model for the distribution of continuous random variables. Like all models,
More informationNon replication of options
Non replication of options Christos Kountzakis, Ioannis A Polyrakis and Foivos Xanthos June 30, 2008 Abstract In this paper we study the scarcity of replication of options in the two period model of financial
More informationOption Pricing. Chapter Discrete Time
Chapter 7 Option Pricing 7.1 Discrete Time In the next section we will discuss the Black Scholes formula. To prepare for that, we will consider the much simpler problem of pricing options when there are
More informationDecision theoretic estimation of the ratio of variances in a bivariate normal distribution 1
Decision theoretic estimation of the ratio of variances in a bivariate normal distribution 1 George Iliopoulos Department of Mathematics University of Patras 26500 Rio, Patras, Greece Abstract In this
More informationAn Improved Skewness Measure
An Improved Skewness Measure Richard A. Groeneveld Professor Emeritus, Department of Statistics Iowa State University ragroeneveld@valley.net Glen Meeden School of Statistics University of Minnesota Minneapolis,
More informationMetropolis-Hastings algorithm
Metropolis-Hastings algorithm Dr. Jarad Niemi STAT 544 - Iowa State University March 27, 2018 Jarad Niemi (STAT544@ISU) Metropolis-Hastings March 27, 2018 1 / 32 Outline Metropolis-Hastings algorithm Independence
More informationFinancial Time Series and Their Characterictics
Financial Time Series and Their Characterictics Mei-Yuan Chen Department of Finance National Chung Hsing University Feb. 22, 2013 Contents 1 Introduction 1 1.1 Asset Returns..............................
More information4: Single Cash Flows and Equivalence
4.1 Single Cash Flows and Equivalence Basic Concepts 28 4: Single Cash Flows and Equivalence This chapter explains basic concepts of project economics by examining single cash flows. This means that each
More informationModelling Environmental Extremes
19th TIES Conference, Kelowna, British Columbia 8th June 2008 Topics for the day 1. Classical models and threshold models 2. Dependence and non stationarity 3. R session: weather extremes 4. Multivariate
More information