University of Iceland
School of Engineering and Sciences
Department of Industrial Engineering, Mechanical Engineering and Computer Science

IÐN106F Industrial Statistics II - Bayesian Data Analysis
Fall 2009

Take-home Exam

This is an open-book exam. Assigned: Friday November 27th 2009 at 16:00. Due: Monday November 30th 2009 before 10:00.

There are five problems in total, each worth 20%. Within a problem, each part carries equal weight; note that the number of parts differs between problems. It is assumed that students have access to Matlab, R or S+, but other software may of course be used. Code written for Matlab, R, S+ or other software should be included in an appendix.

1. (20%) The number of live births in Iceland from 1991 to 2008 can be found in the file birth_vs_total.txt (column 2), along with the year (column 1) and the total number of Icelanders in thousands (column 3). Denote the number of births by y_t, t = 1, ..., T, T = 18, where t = 1 denotes the year 1991 and t = 18 denotes the year 2008. The total number of Icelanders is denoted by x_t, t = 1, ..., T. Assume a Poisson model for live births in Iceland of the form

    y_t | θ ~ Poisson(x_t θ),   t = 1, ..., T,

where θ is an unknown parameter representing the rate of births per thousand Icelanders. Let p(θ) denote a noninformative prior for θ. Here a gamma distribution is assumed for θ, that is, p(θ) = Gamma(θ | α = 0.001, β = 0.001).

(a) Evaluate the posterior distribution of θ.

(b) Evaluate the adequacy of the model by computing a Bayesian p-value for the following discrepancy measure

    T(y, θ) = T(y) = (1/T) ∑_{t=1}^{T} (y_t − x_t θ̂)²,   where   θ̂ = ∑_{t=1}^{T} y_t / ∑_{t=1}^{T} x_t.

Is the proposed Poisson model adequate? If not, discuss what extensions could be made.
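A minimal R sketch of how parts (a) and (b) could be set up is shown below. It assumes birth_vs_total.txt is whitespace-delimited with no header row, and it uses the conjugate gamma update together with a posterior predictive check for the Bayesian p-value.

# Sketch (R): conjugate posterior for theta and a posterior predictive check.
# Assumes birth_vs_total.txt is whitespace-delimited with no header:
# column 1 = year, column 2 = births y, column 3 = total population x (in thousands).
d <- read.table("birth_vs_total.txt")
y <- d[, 2]; x <- d[, 3]
alpha <- 0.001; beta <- 0.001

# (a) Conjugate update: theta | y ~ Gamma(alpha + sum(y), beta + sum(x))
a.post <- alpha + sum(y)
b.post <- beta + sum(x)

# (b) Bayesian p-value via posterior predictive replicates of the discrepancy T(y)
theta.hat <- sum(y) / sum(x)
T.obs <- mean((y - x * theta.hat)^2)
L <- 10000
T.rep <- replicate(L, {
  theta <- rgamma(1, shape = a.post, rate = b.post)    # draw from the posterior
  y.rep <- rpois(length(y), lambda = x * theta)        # replicate data
  mean((y.rep - x * sum(y.rep) / sum(x))^2)            # T(y.rep) with its own theta-hat
})
p.value <- mean(T.rep >= T.obs)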

2. (20%) The data set in the file precip_rvk_1_2.txt contains measurements of the annual maximum daily precipitation (mm per 24 hours, from 9:00 AM to 9:00 AM the next day) recorded at two nearby stations in Reykjavík over the years 1926 to 1985.

(a) Plot a normal probability plot of the logarithm of the data from each station and evaluate the lognormal assumption for the data.

(b) Assume the data from each station follow a lognormal distribution, that is, the logarithm of the data follows a normal distribution with mean µ_i and variance σ_i², i = 1, 2. Use the noninformative prior p(µ_i, σ_i²) ∝ σ_i⁻², i = 1, 2, and draw a sample of size L = 10⁴ after burn-in within each of four chains from the joint posterior distribution of µ_i and σ_i², i = 1, 2, using the Gibbs sampler. Based on this sample compute the posterior mean, standard deviation, and the 2.5%, 25%, 50%, 75% and 97.5% percentiles of µ_i and σ_i, i = 1, 2.

(c) Assume the data from both stations follow the same lognormal distribution with mean µ_0 and variance σ_0². Use the noninformative prior p(µ_0, σ_0²) ∝ σ_0⁻², and draw a sample of size L = 10⁴ after burn-in within each of four chains from the joint posterior distribution of µ_0 and σ_0² using the Gibbs sampler. Based on this sample compute the posterior mean, standard deviation, and the 2.5%, 25%, 50%, 75% and 97.5% percentiles of µ_0 and σ_0.

(d) Assume the data from each station follow a lognormal distribution with mean µ_i, i = 1, 2, and common variance σ_0². Use the noninformative prior p(µ_1, µ_2, σ_0²) ∝ σ_0⁻² and draw a sample of size L = 10⁴ after burn-in within each of four chains from the joint posterior distribution of µ_i, i = 1, 2, and σ_0² using the Gibbs sampler. Based on this sample compute the posterior mean, standard deviation, and the 2.5%, 25%, 50%, 75% and 97.5% percentiles of µ_i, i = 1, 2, and σ_0.

(e) Compute DIC for the three models introduced in (b), (c) and (d). Which of these three models should be preferred according to DIC?
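For part (b), a minimal R sketch of a Gibbs sampler for one station under the prior p(µ, σ²) ∝ σ⁻² might look as follows; the vector logy (the log-transformed data for one station) and the burn-in length are assumptions, not part of the problem statement.

# Sketch (R): Gibbs sampler for (mu, sigma^2) under p(mu, sigma^2) proportional to 1/sigma^2.
# logy holds the log-transformed data from one station (assumed already read in).
gibbs.normal <- function(logy, L = 1e4, burn = 1000) {
  n <- length(logy)
  sigma2 <- var(logy)                                   # starting value
  out <- matrix(NA, L, 2, dimnames = list(NULL, c("mu", "sigma2")))
  for (l in 1:(burn + L)) {
    mu     <- rnorm(1, mean(logy), sqrt(sigma2 / n))                 # mu | sigma^2, y
    sigma2 <- 1 / rgamma(1, n / 2, rate = sum((logy - mu)^2) / 2)    # sigma^2 | mu, y
    if (l > burn) out[l - burn, ] <- c(mu, sigma2)
  }
  out
}
# Four chains, then posterior summaries of mu and sigma, e.g.:
# draws <- do.call(rbind, replicate(4, gibbs.normal(logy), simplify = FALSE))
# apply(cbind(mu = draws[, "mu"], sigma = sqrt(draws[, "sigma2"])), 2,
#       quantile, probs = c(0.025, 0.25, 0.5, 0.75, 0.975))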

3. (20%) The median values of owner-occupied homes were collected for each of 506 areas of Boston, along with thirteen other variables, for the purpose of predicting housing values in other areas of Boston. The dependent variable is

    y_i = log(the median value of owner-occupied homes in area i (in $1000)).

The explanatory variables are

    x_{i,2}  = per capita crime rate by town in area i
    x_{i,3}  = percentage of residential land zoned for lots over 25,000 sq.ft. in area i
    x_{i,4}  = percentage of non-retail business acres per town in area i
    x_{i,5}  = Charles River dummy variable in area i (= 1 if tract bounds river; 0 otherwise)
    x_{i,6}  = nitric oxides concentration in area i (parts per 10 million)
    x_{i,7}  = average number of rooms per dwelling in area i
    x_{i,8}  = percentage of owner-occupied units built prior to 1940 in area i
    x_{i,9}  = weighted distances to five Boston employment centres in area i
    x_{i,10} = index of accessibility to radial highways in area i
    x_{i,11} = full-value property-tax rate per $10,000 in area i
    x_{i,12} = pupil-teacher ratio by town in area i
    x_{i,13} = 1000(B − 0.63)², where B is the proportion of blacks by town in area i
    x_{i,14} = percentage of the population with lower status in area i
    x_{i,15} = (x_{i,3} − x̄_3)²
    x_{i,16} = (x_{i,14} − x̄_14)²

where x̄_3 and x̄_14 are the sample means of x_3 and x_14, respectively.

The file boston_housing.data contains this data set with columns (x_2, x_3, ..., x_14, z). Note that for this analysis the logarithm of z is needed (y = log(z)), and the variables x_15 and x_16 need to be created from x_3 and x_14. The following linear model is proposed:

    E(y_i | β, σ², X) = ∑_{j=1}^{16} β_j x_{ij},   i = 1, ..., 506,

with x_{i1} = 1 for all i. We further assume that the y's are independent, normally distributed and have equal variance, that is, var(y_i | β, σ², X) = σ² for all i.

(a) Plot y versus x_2, x_3, ..., x_16, a total of 15 figures. Which explanatory variables show a clear relationship with y, and which do not?

(b) Create a table with the 15 best models according to DIC, where the table has columns showing DIC, the total number of parameters, and which explanatory variables are in the model. Based on the table and your knowledge of the problem, select one model for these data; this model will be used below. Use the Matlab routine dic_normal_models.m (see the course web page) to evaluate all possible models. To support your decision, compute point estimates (posterior means) and 95% marginal posterior intervals for the parameters β and σ in the full model, that is, the model which uses all the variables. Use L = 10000.

(c) Draw a normal probability plot of the standardized residuals. Do the standardized residuals appear to follow a normal distribution? Plot the standardized residuals versus the predicted y, that is, Xβ̂, and also versus all fifteen explanatory variables. Does the variance appear to be constant when plotted against these variables?

(d) Compute point estimates (posterior means) and 95% marginal posterior intervals for the parameters in the final model selected in (b), that is, β and σ. Use L = 10000.

(e) Interpret the parameters in the model, that is, explain the effect of each explanatory variable on the median value of owner-occupied homes. Take the log-transformation into account (what is the expected value of z = exp(y), and how does it change when an explanatory variable x_j is increased by one unit?).

(f) Compute a prediction and a 95% posterior predictive interval for an area in Boston that has the following explanatory variables: x_2 = 4.3, x_3 = 41, x_4 = 8.4, x_5 = 0, x_6 = 0.71, x_7 = 6.7, x_8 = 39, x_9 = 2.1, x_10 = 7, x_11 = 383, x_12 = 17.8, x_13 = 350, x_14 = 15. Take the log-transformation into account.
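For parts (b), (d) and (f), one way to obtain posterior draws is direct simulation in the normal linear model; the R sketch below assumes the standard noninformative prior p(β, σ²) ∝ σ⁻² (not stated in the problem, so an assumption), and that the design matrix X (intercept plus x_2 through x_16), the response y = log(z), and the new covariate row x.new have already been built from boston_housing.data.

# Sketch (R): direct draws from the posterior of (beta, sigma^2) in the normal linear model
# under the (assumed) noninformative prior p(beta, sigma^2) proportional to 1/sigma^2.
# X, y = log(z) and the new row x.new are assumed to have been constructed already.
posterior.lm <- function(X, y, L = 10000) {
  n <- nrow(X); k <- ncol(X)
  XtX.inv  <- solve(crossprod(X))
  beta.hat <- XtX.inv %*% crossprod(X, y)
  s2 <- sum((y - X %*% beta.hat)^2) / (n - k)
  sigma2 <- (n - k) * s2 / rchisq(L, df = n - k)           # sigma^2 | y (scaled inverse chi-square)
  beta <- t(sapply(sigma2, function(s2i)                   # beta | sigma^2, y ~ N(beta.hat, s2i (X'X)^-1)
    beta.hat + t(chol(s2i * XtX.inv)) %*% rnorm(k)))
  list(beta = beta, sigma = sqrt(sigma2))
}
# Posterior summaries for (b) and (d):
# post <- posterior.lm(X, y)
# t(apply(cbind(post$beta, sigma = post$sigma), 2, quantile, probs = c(0.025, 0.5, 0.975)))
# Posterior predictive for the new area in (f), on the log scale and back-transformed:
# y.new <- post$beta %*% x.new + rnorm(length(post$sigma), 0, post$sigma)
# c(prediction = mean(exp(y.new)), quantile(exp(y.new), c(0.025, 0.975)))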

4. (20%) The sum of exponentially distributed random variables follows the Erlang distribution. Given that v_j follows an exponential distribution with mean 1/θ, θ > 0, j = 1, ..., r, then

    y = ∑_{j=1}^{r} v_j

follows an Erlang distribution with parameters r and θ. The parameter r is an integer which is usually known, while θ is usually unknown. The Erlang distribution is a special case of the gamma distribution with α = r and β = θ.

The density of a random variable y that follows the Erlang distribution is given by

    p(y | r, θ) = θ^r y^(r−1) e^(−θy) / (r − 1)!,   y > 0,

and the mean and variance are E(y) = r/θ and var(y) = r/θ², respectively.

(a) Assume n independent observations, y_i, i = 1, ..., n, follow an Erlang distribution with r = 20 and some unknown θ. Assume that the prior distribution is a gamma distribution with parameters α = 0.001 and β = 0.001, that is,

    p(θ) = Gamma(θ | α = 0.001, β = 0.001).

Find the general form of the posterior distribution of θ.

(b) Evaluate the numerical values of the parameters of the posterior distribution using the data set in erlang.dat, which contains n = 100 observed values.

(c) Assume that it is not known whether r is equal to 18, 19, 20, 21 or 22. Further, assume that

    Pr(r = s) = 1/5,   s = 18, 19, 20, 21, 22.

Compute the probability Pr(r = s | y) for s = 18, 19, 20, 21, 22, where y is a vector containing all the data. Which of these five values is most likely? (Hint: Pr(r = s | y) can be evaluated analytically via integration of a gamma density. When computing Pr(r = s | y), it is better to work with the logarithm first, compute its value, and then exponentiate.)
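For part (c), a minimal R sketch of the log-scale computation is given below; it assumes erlang.dat can be read with scan() and uses the closed-form gamma integral for the marginal likelihood p(y | r).

# Sketch (R): posterior model probabilities Pr(r = s | y) computed on the log scale.
# Integrating the gamma-form kernel over theta gives
#   p(y | r) = prod(y_i^(r-1)) / ((r-1)!)^n * beta^alpha / Gamma(alpha)
#              * Gamma(n*r + alpha) / (beta + sum(y))^(n*r + alpha).
y <- scan("erlang.dat")
n <- length(y); alpha <- 0.001; beta <- 0.001
log.marg <- function(r)
  (r - 1) * sum(log(y)) - n * lgamma(r) +
  alpha * log(beta) - lgamma(alpha) +
  lgamma(n * r + alpha) - (n * r + alpha) * log(beta + sum(y))
s <- 18:22
logp <- sapply(s, log.marg) + log(1 / 5)             # equal prior probabilities
prob <- exp(logp - max(logp)); prob <- prob / sum(prob)
rbind(r = s, posterior = round(prob, 4))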

5. (20%) The binomial distribution is usually parameterized with n and θ, and the sampling distribution of y is given by

    p(y | n, θ) = Bin(y | n, θ) = (n choose y) θ^y (1 − θ)^(n−y),   y ∈ {0, 1, ..., n}.

The binomial distribution can also be parameterized with κ = log(θ) − log(1 − θ); as functions of κ,

    θ = e^κ / (1 + e^κ)   and   1 − θ = 1 / (1 + e^κ).

In that case the sampling distribution of y is given by

    p(y | n, κ) = Bin(y | n, κ) = (n choose y) (e^κ / (1 + e^κ))^y (1 / (1 + e^κ))^(n−y) = (n choose y) e^(κy) (1 + e^κ)^(−n).

If the uniform distribution is assumed as a prior distribution for θ, then the posterior distribution of θ is a beta distribution with α = y + 1 and β = n − y + 1. If θ is transformed to κ, then the posterior density of κ is given by

    p(κ | y) = [(n + 1)! / (y! (n − y)!)] e^(κ(y+1)) (1 + e^κ)^(−(n+2)).

(a) Find the normal approximation to the posterior density of κ.

(b) Draw the exact posterior density of κ and the approximating normal posterior density of κ on the same graph when n = 30 and y = 21.
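A minimal R sketch of part (b) is given below; it takes the normal approximation from part (a) to be a normal density centered at the posterior mode of κ with variance equal to the inverse of the negative second derivative of the log posterior at the mode, which is one standard way to construct it.

# Sketch (R): exact posterior of kappa versus a mode-and-curvature normal approximation,
# for n = 30 and y = 21.
n <- 30; y <- 21
post.exact <- function(kappa)
  exp(lgamma(n + 2) - lgamma(y + 1) - lgamma(n - y + 1) +
      kappa * (y + 1) - (n + 2) * log(1 + exp(kappa)))
kappa.hat <- log((y + 1) / (n - y + 1))              # posterior mode
p.hat   <- (y + 1) / (n + 2)
var.hat <- 1 / ((n + 2) * p.hat * (1 - p.hat))       # inverse curvature at the mode
kappa <- seq(kappa.hat - 4 * sqrt(var.hat), kappa.hat + 4 * sqrt(var.hat), length.out = 400)
plot(kappa, post.exact(kappa), type = "l", xlab = expression(kappa), ylab = "posterior density")
lines(kappa, dnorm(kappa, kappa.hat, sqrt(var.hat)), lty = 2)
legend("topright", c("exact", "normal approximation"), lty = 1:2)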