Bayesian Normal Stuff


Bayesian Normal Stuff
- Set-up of the basic model of a normally distributed random variable with unknown mean and variance (a two-parameter model).
- Discussion of philosophies of prior selection.
- Implementation of different priors, with a discussion of MCMC methods.

Introduction to Applied Bayesian Modeling ICPSR 2008 Day 8

The normal model with unknown mean and variance

Let's extend the normal model to the case where the variance parameter is also unknown. Thus, yᵢ ~ N(μ, σ²), where μ and σ² are both unknown random variables. The Bayesian set-up should still look familiar:

p(μ, σ² | y) ∝ p(μ, σ²) p(y | μ, σ²)

Note: we would like to make inferences about the marginal distributions p(μ | y) and p(σ² | y) rather than the joint posterior distribution p(μ, σ² | y). Ultimately, we'd like to find:

p(μ | y) = ∫ p(μ | σ², y) p(σ² | y) dσ²

What should we choose for the prior distribution p(μ, σ²)?
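To see what that integral buys us computationally: if we can draw σ² from p(σ² | y) and then μ from p(μ | σ², y), the μ draws alone are a sample from the marginal p(μ | y). A minimal sketch in R, assuming (for concreteness only; prior choice is exactly what the next slides discuss) the standard noninformative prior p(μ, σ²) ∝ 1/σ², under which σ² | y ~ Inv-χ²(n−1, s²) and μ | σ², y ~ N(ȳ, σ²/n):

## Composition sampling for p(mu | y). The prior here is an assumption
## for this sketch: the standard noninformative p(mu, sigma^2) ~ 1/sigma^2.
set.seed(42)
y    <- rnorm(50, mean = 5, sd = 2)   # stand-in data
n    <- length(y)
ybar <- mean(y)
s.sq <- var(y)

n.sims   <- 5000
sigma.sq <- (n - 1) * s.sq / rchisq(n.sims, df = n - 1)  # sigma^2 ~ p(sigma^2 | y)
mu       <- rnorm(n.sims, ybar, sqrt(sigma.sq / n))      # mu ~ p(mu | sigma^2, y)
quantile(mu, c(.025, .5, .975))   # summaries of the marginal p(mu | y)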

Different types of Bayesians choose different priors

- Classical Bayesians: the prior is a necessary evil. Choose priors that interject the least information possible.
- Modern Parametric Bayesians: the prior is a useful convenience. Choose prior distributions with desirable properties (e.g. conjugacy); given a distributional choice, prior parameters are chosen to interject the least information possible.
- Subjective Bayesians: the prior is a summary of old beliefs. Choose prior distributions based on previous knowledge, either the results of earlier studies or nonscientific opinion.

The Classical Bayesian and the normal model with unknown mean and variance

y ~ N(μ, σ²), where μ and σ² are both unknown random variables. What prior distribution would we choose to represent the absence of any knowledge in this instance? What if we assumed that the two parameters were independent, so that p(μ, σ²) = p(μ) p(σ²)?

Modern Parametric Bayesians and the normal model with unknown mean and variance

y ~ N(μ, σ²), where μ and σ² are both unknown random variables. What prior distribution would a modern parametric Bayesian choose to satisfy the demands of convenience? What if we used the definition of conditional probability, so that p(μ, σ²) = p(μ | σ²) p(σ²)?

Modern Parametric Bayesians and the normal model with unknown mean and variance

y ~ N(μ, σ²), where μ and σ² are both unknown random variables. A modern parametric Bayesian would typically choose a conjugate prior. For the normal model with unknown mean and variance, the conjugate prior for the joint distribution of μ and σ² is the normal-inverse-gamma distribution (equivalently, the normal-inverse-χ²):

p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²)

Note the four parameters in the prior: μ₀, κ₀, ν₀, and σ₀².

Suppose p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²). It can be shown that the above expression factors as:

p(μ, σ²) = p(μ | σ²) p(σ²)

where μ | σ² ~ N(μ₀, σ²/κ₀) and σ² ~ Inv-χ²(ν₀, σ₀²). Because this is a conjugate prior for the normal distribution with unknown mean and variance, the posterior distribution will also be normal-inverse-χ².
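The factorization also gives a direct recipe for simulating from this prior: draw σ² from its scaled inverse-χ² marginal, then μ given σ². A minimal R sketch; the prior parameter values below are hypothetical, chosen only for illustration:

## Simulate from N-Inv-chi^2(mu0, sigma0^2/k0; nu0, sigma0^2) via
## p(mu, sigma^2) = p(mu | sigma^2) p(sigma^2).
## These parameter values are hypothetical.
mu0 <- 0; k0 <- 1; nu0 <- 5; sigma0.sq <- 4

n.sims   <- 5000
sigma.sq <- nu0 * sigma0.sq / rchisq(n.sims, df = nu0)  # sigma^2 ~ Inv-chi^2(nu0, sigma0^2)
mu       <- rnorm(n.sims, mu0, sqrt(sigma.sq / k0))     # mu | sigma^2 ~ N(mu0, sigma^2/k0)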

Lazy Modern Parametric Bayesians and the normal model with unknown mean and variance

Suppose that y ~ N(μ, τ), where τ = 1/σ² is the precision. From here on, when we talk about the normal distribution, you should expect that we will speak in terms of the precision τ rather than the variance σ². This is because WinBUGS is programmed to use τ rather than σ². Suppose also that you don't want to think too hard about the joint prior distribution of μ and τ, and assume that:

p(μ, τ) = p(μ) p(τ)

What distributions would you choose for p(μ) and p(τ)?

Suppose that y ~ N(μ, τ). What priors would you choose for μ and τ? I would choose:

μ ~ N(0, t), where t is a large variance (i.e. a small precision). This is because, if we expect something like the central limit theorem to hold, then the distribution of the sample mean should be approximately normal for large n.

τ ~ Gamma(a, b), where a and b are small numbers. This is because the gamma distribution is bounded below at zero and, unlike the χ² distribution (which shares this property but whose single parameter ties its variance to its mean), its two parameters let the mean and variance be set separately. Note how we now have to talk about the mean of the distribution of the variance (here parameterized through its inverse, the precision).
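To see how vague such a gamma prior is in practice: Gamma(a, b) with shape a and rate b has mean a/b and variance a/b². A quick R check with the values used in the WinBUGS code below:

## Mean and variance of a Gamma(shape = a, rate = b) prior on the precision.
a <- .01; b <- .001
c(mean = a / b, variance = a / b^2)   # mean 10, variance 10,000: very diffuse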

model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu, tau)   # likelihood: normal with mean mu and precision tau
  }
  mu  ~ dnorm(0, .001)      # vague normal prior: mean 0, precision .001 (variance 1000)
  tau ~ dgamma(.01, .001)   # vague gamma prior on the precision
}
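As a sketch of how this model might be fit from R, assuming the R2WinBUGS package and a local WinBUGS installation (the file name below is hypothetical; the model block above would be saved to it first):

library(R2WinBUGS)                          # assumes WinBUGS is installed locally
y     <- rnorm(100, mean = 5, sd = 2)       # stand-in data
data  <- list(y = y, N = length(y))
inits <- function() list(mu = 0, tau = 1)   # starting values for each chain
fit <- bugs(data, inits, parameters.to.save = c("mu", "tau"),
            model.file = "normal-model.txt",  # hypothetical file holding the model block
            n.chains = 3, n.iter = 2000)
print(fit)                                  # posterior summaries and convergence diagnostics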

The Subjective Bayesian and the normal model with unknown mean and variance

The subjective Bayesian framework provides little guidance about which prior distribution one should choose. In a sense, that is the point of the subjective approach: it is subjective. You are free to pick whatever prior distribution you want: multi-modal, uniform, high or low variance, skewed, constrained to lie between a certain set of values, etc. One of the key difficulties is that the prior distributions probably are not independent (i.e. p(θ₁, θ₂) ≠ p(θ₁) p(θ₂)). For example, regression coefficients are generally not independent, even if that isn't transparent in your Stata output. If you want to incorporate subjective beliefs, this non-independence should be taken into account, as in the sketch below.
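For instance, a joint prior on two regression coefficients with built-in dependence can be specified as a bivariate normal with a nonzero off-diagonal covariance. A minimal R sketch using MASS::mvrnorm; the mean and covariance values are hypothetical:

## A joint prior p(beta1, beta2) with prior correlation -0.6,
## so p(beta1, beta2) != p(beta1) p(beta2). Values are hypothetical.
library(MASS)                       # for mvrnorm()
prior.mean <- c(0, 0)
prior.cov  <- matrix(c( 1.0, -0.6,
                       -0.6,  1.0), nrow = 2)
beta.draws <- mvrnorm(5000, mu = prior.mean, Sigma = prior.cov)
cor(beta.draws)                     # recovers the built-in prior dependence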

Some General Guidelines

I recommend the following general guidelines:
1) If possible, use standard distributions (i.e. conjugate or semi-conjugate) and choose parameters that fix the mean, variance, kurtosis, etc. to be some desirable level.
2) Sample from the prior predictive distribution and check to see if your results make sense. Mechanically, perform the following steps (a sketch follows this list):
   i) take a random draw θ′ from the joint prior distribution of θ;
   ii) take a random draw Y from the pdf of Y | θ with θ = θ′;
   iii) repeat steps i and ii several thousand times to provide a sample that you can use to summarize the prior predictive distribution;
   iv) generate various summaries of the prior predictive distribution and check to see if the model's predictions are consistent with your beliefs about the data-generating process.
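Under the independent priors from the WinBUGS example above, steps i–iv take only a few lines of R. A sketch; the sample size n = 30 per simulated dataset is an arbitrary assumption:

## Prior predictive simulation for y ~ N(mu, 1/tau) with
## mu ~ N(0, variance 1000) and tau ~ Gamma(.01, .001), matching the
## WinBUGS example above. n = 30 per dataset is an assumption.
set.seed(1)
n.sims <- 5000
n      <- 30
mu     <- rnorm(n.sims, 0, sqrt(1000))                 # step i: draw theta' from the prior
tau    <- rgamma(n.sims, shape = .01, rate = .001)
y.rep  <- matrix(rnorm(n.sims * n, mean = mu, sd = 1 / sqrt(tau)),
                 nrow = n.sims)                        # steps ii-iii: one dataset per row
summary(apply(y.rep, 1, mean))                         # step iv: summarize sample means

With priors this vague, the simulated datasets are wildly dispersed, which is exactly the kind of mismatch with your beliefs that this check is designed to reveal.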