Bayesian Multinomial Model for Ordinal Data


Overview

This example illustrates how to fit a Bayesian multinomial model by using the built-in multinomial density function (MULTINOM) in the MCMC procedure for categorical response data that are measured on an ordinal scale. By using built-in multivariate distributions, PROC MCMC can efficiently sample constrained multivariate parameters with the random walk Metropolis algorithm. The example also demonstrates how to use the MCMC procedure to compute posterior means, credible intervals, and posterior distributions of the parameters and odds ratios for the multinomial model.

The SAS source code for this example is available as a text file attachment.

Analysis

Researchers study the results of a taste test on three different brands of ice cream. They want to assess the testers' preferences among the three brands. The taste of each brand is rated on a five-point scale from very good to very bad. The five points correspond to response variables Y1 through Y5, where Y1 represents very good and Y5 represents very bad. Each response variable contains the number of taste testers who gave a brand that rating. The very bad taste level (Y5) is used as the reference response level.

Two dummy variables, BRAND1 and BRAND2, are created to indicate Brands 1 and 2, respectively. Brand 3 is set as the reference level in this example and is represented in the data set when both of the dummy variables equal zero.

The following statements create the ICECREAM data set:

   data icecream;
      input y1-y5 brand;
      if brand = 1 then brand1 = 1; else brand1 = 0;
      if brand = 2 then brand2 = 1; else brand2 = 0;
      keep y1-y5 brand1 brand2;
      datalines;
   70 71 151 30 46 1
   20 36 130 74 70 2
   50 55 140 52 50 3
   ;
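As a cross-check on the data layout, the following minimal Python sketch mirrors the DATA step: it derives the two dummy variables from the brand code and adds up the number of testers per brand. The names `ROWS`, `encode`, `totals`, and `grand_total` are illustrative and do not come from the SAS example.

```python
# Rows of the ICECREAM data set: counts y1..y5 followed by the brand code.
ROWS = [
    (70, 71, 151, 30, 46, 1),
    (20, 36, 130, 74, 70, 2),
    (50, 55, 140, 52, 50, 3),
]


def encode(row):
    """Mirror the DATA step: keep y1-y5 and derive the two dummy variables."""
    *counts, brand = row
    return {
        "counts": list(counts),
        "brand1": 1 if brand == 1 else 0,
        "brand2": 1 if brand == 2 else 0,
    }


icecream = [encode(r) for r in ROWS]
totals = [sum(rec["counts"]) for rec in icecream]  # testers per brand
grand_total = sum(totals)                          # all testers

for rec, n_i in zip(icecream, totals):
    print(rec["brand1"], rec["brand2"], rec["counts"], n_i)
print(grand_total)
```

Brand 3 appears as the row in which both dummy variables are zero, which is what makes it the reference level.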

Bayesian Multinomial Model

Multinomial ordinal models occur frequently in applications such as food testing, survey response, or anywhere order matters in the categorical response. Categorical data with an ordinal response correspond to multinomial models based on cumulative response probabilities (McCullagh and Nelder 1989). In this data set, the ordered response variable is the taste tester's rating for a brand of ice cream.

Let the random vector $Y_i = (Y_{i1}, \ldots, Y_{iJ})$, for brand $i = 1, 2, 3$ and response levels $j = 1, \ldots, J$ with $J = 5$, be from a multinomial ordinal model with mutually exclusive, discrete response levels and probability mass function

\[ f(Y_{i1} = y_{i1}, \ldots, Y_{iJ} = y_{iJ}) = \frac{n_i!}{\prod_{j=1}^{J} y_{ij}!} \prod_{j=1}^{J} \pi_{ij}^{y_{ij}} \]

where $y_{ij}$ represents the number of people from the $i$th brand in the $j$th response level. For the grouped data, let $n_i = \sum_{j=1}^{5} y_{ij}$ denote the number of testers who taste the $i$th brand of ice cream and let $n = \sum_{i=1}^{3} n_i$.

Let $\pi_{ij} = \Pr(Y_i = j)$ denote the probability that the response of brand $i$ falls into the $j$th response level, and let $\sum_{j=1}^{J} \pi_{ij} = 1$. Let $\gamma_{ij} = \Pr(Y_i \le j)$ denote the corresponding cumulative probability that the response falls in the $j$th level or below, so $\gamma_{ij} = \pi_{i1} + \cdots + \pi_{ij}$.

The transformed cumulative probabilities are linear functions of the covariates written as $g(\gamma_{ij}) = \theta_j + X_i \beta$, where $g(\cdot)$ refers to the logit link function, $\beta$ represents the effects for the covariates, and $X_i = \{\mathrm{BRAND1}_i, \mathrm{BRAND2}_i\}$. Let $\theta_j$ represent the baseline value of the transformed cumulative probability for category $j$ such that the constraint $\theta_{j-1} < \theta_j$ holds for all $j$ (Albert and Chib 1993). Then

\[ \gamma_{ij} = g^{-1}(X_i, \beta, \theta_j) = \mathrm{logistic}(\theta_j + X_i \beta) \tag{11} \]

and the group probabilities for the $j$th levels are as follows:

\[ \pi_{i1} = \gamma_{i1} = \mathrm{logistic}(\theta_1 + X_i \beta) \tag{12} \]
\[ \pi_{ij} = \gamma_{ij} - \gamma_{i(j-1)} = \mathrm{logistic}(\theta_j + X_i \beta) - \mathrm{logistic}(\theta_{j-1} + X_i \beta) \quad \text{for } 1 < j < J \tag{13} \]
\[ \pi_{iJ} = 1 - \sum_{j=1}^{J-1} \pi_{ij} \tag{14} \]

The likelihood function for the counts and corresponding covariates is

\[ p(Y_{i1}, \ldots, Y_{i5} \mid \theta_1, \ldots, \theta_4, \beta_1, \beta_2, \mathrm{BRAND1}_i, \mathrm{BRAND2}_i) = \mathrm{Multinomial}(\pi_{i1}, \ldots, \pi_{i5}) \tag{15} \]

where $p(\cdot \mid \cdot)$ denotes a conditional probability density. The multinomial density is evaluated at the specified value of $Y_i$ and the corresponding probabilities $\pi_{ij}$, which are defined in Equations 12 through 14. There are six parameters in the likelihood: the intercepts $\theta_1$ through $\theta_4$ and the regression parameters $\beta_1$ and $\beta_2$, which correspond to the effects of Brand 1 and Brand 2 relative to Brand 3, respectively.
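The mapping from intercepts and brand effect to cell probabilities (Equations 11 through 14) and the multinomial log mass function can be sketched in a few lines of Python. The intercepts and the linear predictor below are made-up illustrative numbers, not estimates from this example, and the function names are hypothetical.

```python
import math


def logistic(x):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def cell_probs(theta, mu):
    """Cell probabilities pi_1..pi_J from ordered intercepts theta_1..theta_{J-1}.

    gamma_j = logistic(theta_j + mu) is the cumulative probability (Equation 11);
    successive differences give the cell probabilities (Equations 12 to 14).
    """
    gamma = [logistic(t + mu) for t in theta]
    pi = [gamma[0]]
    pi += [gamma[j] - gamma[j - 1] for j in range(1, len(gamma))]
    pi.append(1.0 - sum(pi))
    return pi


def multinomial_logpmf(counts, probs):
    """Log of n!/prod(y_j!) * prod(pi_j ** y_j), computed with log-gamma."""
    out = math.lgamma(sum(counts) + 1)
    for y, p in zip(counts, probs):
        out += y * math.log(p) - math.lgamma(y + 1)
    return out


# Illustrative (assumed) values: increasing intercepts and one linear predictor.
theta_demo = [-1.5, -0.5, 1.0, 2.0]
pi_demo = cell_probs(theta_demo, mu=0.4)
print(pi_demo)
print(multinomial_logpmf([70, 71, 151, 30, 46], pi_demo))
```

Because the intercepts are increasing, every difference of cumulative probabilities is positive; this is the practical reason for the order constraint on the intercepts.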

Suppose the following prior distributions are placed on the six parameters, where $\pi(\cdot)$ indicates a prior distribution and $\pi(\cdot \mid \cdot)$ indicates a conditional prior distribution:

\[ \pi(\theta_1) = \mathrm{normal}(0, \sigma^2 = 100) \tag{16} \]
\[ \pi(\theta_2 \mid \theta_1) = \mathrm{normal}(0, \sigma^2 = 100, \mathrm{lower} = \theta_1) \tag{17} \]
\[ \pi(\theta_3 \mid \theta_2) = \mathrm{normal}(0, \sigma^2 = 100, \mathrm{lower} = \theta_2) \tag{18} \]
\[ \pi(\theta_4 \mid \theta_3) = \mathrm{normal}(0, \sigma^2 = 100, \mathrm{lower} = \theta_3) \tag{19} \]
\[ \pi(\beta_1), \pi(\beta_2) = \mathrm{normal}(0, \sigma^2 = 1000) \tag{20} \]

The joint prior distribution of $\theta_1$ through $\theta_4$ is the product of Equations 16 through 19. The prior distributions in Equations 17 through 19 represent truncated normal distributions with mean 0, variance 100, and the designated lower bound. The lower bound ensures that the order restriction on $\theta$ is sustained.

Using Bayes' theorem, the likelihood function and prior distributions determine the posterior distribution of the parameters as follows:

\[ \pi(\theta_1, \ldots, \theta_4, \beta_1, \beta_2 \mid Y_{i1} = y_{i1}, \ldots, Y_{i5} = y_{i5}, \mathrm{BRAND1}_i, \mathrm{BRAND2}_i) \propto p(Y_{i1} = y_{i1}, \ldots, Y_{i5} = y_{i5} \mid \theta_1, \ldots, \theta_4, \beta_1, \beta_2, \mathrm{BRAND1}_i, \mathrm{BRAND2}_i) \, \pi(\beta_1) \, \pi(\beta_2) \, \pi(\theta_1) \prod_{j=2}^{4} \pi(\theta_j \mid \theta_{j-1}) \]

PROC MCMC obtains samples from the desired posterior distribution. You do not need to specify the exact form of the posterior distribution.

The odds ratio for comparing one brand to another can be written as

\[ OR_{rs} = \exp(\beta_r - \beta_s) \tag{21} \]

for $r, s \in \{1, 2, 3\}$. The odds ratio is useful for interpreting how the taste preferences for the different brands of ice cream compare. For this example, Brand 3 is set as the reference level, which implies that $\beta_3 = 0$.

The following SAS statements fit the Bayesian multinomial ordinal model. The PROC MCMC statement invokes the procedure and specifies the input data set. The NBI= option specifies the number of burn-in iterations. The NMC= option specifies the number of posterior simulation iterations. The THIN=10 option specifies that one of every 10 samples is kept, so NMC=25000 leaves 2,500 retained samples. The SEED= option specifies a seed for the random number generator (the seed guarantees the reproducibility of the random stream). The PROPCOV=QUANEW option uses the estimated inverse Hessian matrix as the initial proposal covariance matrix. The MONITOR= option outputs analysis on selected symbols of interest in the program.

   ods graphics on;
   proc mcmc data=icecream nbi=10000 nmc=25000 thin=10 seed=1181
             propcov=quanew monitor=(beta1 beta2 or12 or13 or23);
      array data[5] y1 y2 y3 y4 y5;
      array theta[4];
      array gamma[4];
      array pi[5];
      parms theta1-theta4 beta1 beta2;
      /* ordered intercepts: each prior is truncated below by the previous one */
      prior theta1 ~ normal(0,var=100);
      prior theta2 ~ normal(0,var=100,lower=theta1);
      prior theta3 ~ normal(0,var=100,lower=theta2);
      prior theta4 ~ normal(0,var=100,lower=theta3);
      prior beta: ~ normal(0,var=1000);
      mu = beta1*brand1 + beta2*brand2;
      /* cumulative probabilities and their successive differences */
      do j = 1 to 4;
         gamma[j] = logistic(theta[j] + mu);
         if j>=2 then pi[j]=gamma[j]-gamma[j-1];
      end;
      pi1 = gamma1;
      pi5 = 1 - sum(of pi1-pi4);
      model data~multinom(pi);
      /* odds ratios depend only on the parameters, not on the observations */
      beginnodata;
         or12 = exp(beta1-beta2);
         or13 = exp(beta1);
         or23 = exp(beta2);
      endnodata;
   run;
   ods graphics off;

Each of the ARRAY statements associates a name with a list of variables and constants. The first ARRAY statement declares the data array for response variables Y1 through Y5. The second ARRAY statement specifies names for the intercept parameters $\theta_j$. The third ARRAY statement contains the $\gamma_{ij}$ parameters and the last ARRAY statement contains the $\pi_{ij}$ parameters. The PARMS statement puts all $\theta$ and $\beta$ parameters in a single block. The PRIOR statements specify priors for the parameters as given in Equations 16 through 20.

The MU assignment statement calculates $X_i \beta$. The DO loop and the corresponding GAMMA assignment statements calculate $\gamma_{ij}$ for $j = 1, \ldots, 4$ as in Equation 11. The five PI assignment statements calculate the individual probabilities that an observation falls into the $j$th response level as in Equations 12 through 14.

For SAS/STAT 9.3 and later, the MODEL statement supports the multinomial density function (MULTINOM). Hence, it is used to construct the likelihood function for the response variables Y1 through Y5 and the model parameters $\pi_1, \ldots, \pi_5$ as in Equation 15. Note that MULTINOM is not supported by the PRIOR and HYPERPRIOR statements. However, you can still declare a multinomial prior by using the GENERAL function and SAS programming statements that use LOGMPDFMULTINOM.

The statements within the BEGINNODATA and ENDNODATA statements calculate the three odds ratios for pairwise comparisons of ice cream brands according to Equation 21. The statements are enclosed within the BEGINNODATA and ENDNODATA block to reduce unnecessary observation-level computations.

Figure 8 displays diagnostic plots to assess whether the Markov chains have converged.
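To make the target of the sampler concrete, here is a minimal Python sketch of the unnormalized log posterior implied by Equations 11 through 20. It is not how PROC MCMC evaluates the model internally. The function names are hypothetical, the truncated normal priors are assumed to be renormalized above their lower bound, and the intercept values used in the comparison are rough assumed values rather than estimates from this example.

```python
import math

# (counts y1..y5, brand1, brand2) for the three brands in the ICECREAM data.
DATA = [
    ([70, 71, 151, 30, 46], 1, 0),
    ([20, 36, 130, 74, 70], 0, 1),
    ([50, 55, 140, 52, 50], 0, 0),
]


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def normal_logpdf(x, var):
    """Log density of a normal(0, var) distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - x * x / (2.0 * var)


def log_upper_tail(lower, var):
    """log P(Z > lower) for Z ~ normal(0, var); normalizes a lower truncation."""
    return math.log(0.5 * math.erfc(lower / math.sqrt(2.0 * var)))


def log_posterior(theta, beta):
    """Unnormalized log posterior; -inf when the order restriction fails."""
    if any(theta[j] <= theta[j - 1] for j in range(1, 4)):
        return -math.inf
    lp = normal_logpdf(theta[0], 100.0)                       # Equation 16
    for j in range(1, 4):                                     # Equations 17-19
        lp += normal_logpdf(theta[j], 100.0) - log_upper_tail(theta[j - 1], 100.0)
    lp += sum(normal_logpdf(b, 1000.0) for b in beta)         # Equation 20
    for counts, brand1, brand2 in DATA:
        mu = beta[0] * brand1 + beta[1] * brand2
        gamma = [logistic(t + mu) for t in theta]
        pi = [gamma[0]] + [gamma[j] - gamma[j - 1] for j in range(1, 4)]
        pi.append(1.0 - gamma[3])
        # The multinomial coefficient does not depend on the parameters,
        # so it is dropped from the unnormalized log posterior.
        lp += sum(y * math.log(p) for y, p in zip(counts, pi))
    return lp


theta_guess = [-1.78, -0.83, 0.88, 1.78]   # assumed, roughly Brand 3's empirical logits
lp_null = log_posterior(theta_guess, [0.0, 0.0])
lp_fit = log_posterior(theta_guess, [0.38, -0.65])
print(lp_null, lp_fit)
```

A random walk Metropolis sampler only needs differences of this quantity between the current and proposed parameter values, which is why the normalizing constant of the posterior never has to be specified.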

Figure 8 Diagnostic Plot for $\beta_1$

The trace plot in Figure 8 indicates that the chain appears to have reached a stationary distribution. It also has good mixing and is dense. The autocorrelation plot indicates low autocorrelation and efficient sampling. Finally, the kernel density plot shows the smooth, unimodal shape of the posterior marginal distribution for $\beta_1$. The remaining diagnostic plots (not shown here) similarly indicate good convergence in the other parameters.

Figure 9 displays a number of convergence diagnostics, including Monte Carlo standard errors, autocorrelations at selected lags, Geweke diagnostics, and the effective sample sizes.

Figure 9 Multinomial Model MCMC Convergence Diagnostics

The MCMC Procedure

   Monte Carlo Standard Errors

                            Standard
   Parameter       MCSE    Deviation    MCSE/SD
   beta1        0.00379       0.1337     0.0283
   beta2        0.00449       0.1403     0.0320
   or12         0.0119        0.4054     0.0294
   or13         0.00562       0.1982     0.0284
   or23         0.00236       0.0743     0.0317
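The numbers in Figure 9, including the effective sample sizes reported in its continuation, appear to be linked by the usual relations: autocorrelation time = n / ESS, efficiency = ESS / n, and MCSE approximately SD / sqrt(ESS), where n is the number of retained draws. The following Python sketch checks those relations against the printed values. It is a consistency check on the reported output under that assumption, not a description of how PROC MCMC computes the diagnostics.

```python
import math

N_KEPT = 25000 // 10  # NMC=25000 with THIN=10 leaves 2500 retained draws

# (posterior SD, reported MCSE, reported ESS) copied from Figure 9.
FIG9 = {
    "beta1": (0.1337, 0.00379, 1246.3),
    "beta2": (0.1403, 0.00449, 978.1),
    "or12": (0.4054, 0.0119, 1157.7),
    "or13": (0.1982, 0.00562, 1243.9),
    "or23": (0.0743, 0.00236, 992.7),
}


def diagnostics(sd, ess, n=N_KEPT):
    """Quantities implied by an effective sample size of `ess` out of `n` draws."""
    return {
        "autocorr_time": n / ess,
        "efficiency": ess / n,
        "mcse": sd / math.sqrt(ess),
    }


implied = {name: diagnostics(sd, ess) for name, (sd, _, ess) in FIG9.items()}
for name, d in implied.items():
    print(name, round(d["autocorr_time"], 4), round(d["efficiency"], 4),
          round(d["mcse"], 5))
```

Read this way, an efficiency near 0.5 for beta1 means that the 2,500 correlated draws carry about as much information about the posterior mean as roughly 1,250 independent draws would.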

Figure 9 continued

   Posterior Autocorrelations

   Parameter     Lag 1     Lag 5    Lag 10    Lag 50
   beta1        0.3284   -0.0011    0.0618   -0.0120
   beta2        0.4089    0.0287    0.0232    0.0142
   or12         0.3464    0.0059    0.0297   -0.0029
   or13         0.3260    0.0014    0.0607   -0.0130
   or23         0.3967    0.0338    0.0266    0.0144

   Geweke Diagnostics

   Parameter          z    Pr > |z|
   beta1        -0.1114      0.9113
   beta2        -1.5302      0.1260
   or12          1.4710      0.1413
   or13         -0.1486      0.8819
   or23         -1.5676      0.1170

   Effective Sample Sizes

                          Autocorrelation
   Parameter       ESS               Time    Efficiency
   beta1        1246.3             2.0060        0.4985
   beta2         978.1             2.5560        0.3912
   or12         1157.7             2.1595        0.4631
   or13         1243.9             2.0098        0.4976
   or23          992.7             2.5184        0.3971

Figure 10 reports summary and interval statistics for the regression parameters and odds ratios. The odds ratios provide the relative difference in one brand with respect to another and indicate whether there is a significant brand effect. The odds ratio for Brand 1 and Brand 2 is the multiplicative change in the odds of a taste tester preferring Brand 1 compared to the odds of the tester preferring Brand 2.

The estimated odds ratio ($OR_{12}$) is 2.8366 with a corresponding 95% equal-tail credible interval of (2.1196, 3.6740). Similarly, the odds ratio for Brand 1 and Brand 3 is 1.4787 with a 95% equal-tail credible interval of (1.1202, 1.9047). Finally, the odds ratio for Brand 2 compared to Brand 3 is 0.5271 with a 95% equal-tail credible interval of (0.3947, 0.6883). The lower categories indicate the favorable taste results, so Brand 1 scored significantly better when compared to Brand 2 or Brand 3. Brand 2 scored less favorably when compared to Brand 3.
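Because $OR_{13} = \exp(\beta_1)$ and $OR_{23} = \exp(\beta_2)$ are increasing transformations, their equal-tail limits and medians should equal the exponentiated limits and medians of the corresponding regression parameter, whereas the posterior mean of an odds ratio exceeds the exponentiated posterior mean (Jensen's inequality). The Python sketch below checks this against the values that Figure 10 reports; the dictionary layout is an illustrative choice and the numbers are copied from the figure.

```python
import math

# Posterior mean, median, and 95% equal-tail limits as reported in Figure 10.
BETA = {
    "beta1": {"mean": 0.3822, "median": 0.3811, "equal_tail": (0.1135, 0.6443)},
    "beta2": {"mean": -0.6502, "median": -0.6488, "equal_tail": (-0.9297, -0.3736)},
}
ODDS = {
    "or13": {"mean": 1.4787, "median": 1.4638, "equal_tail": (1.1202, 1.9047)},
    "or23": {"mean": 0.5271, "median": 0.5227, "equal_tail": (0.3947, 0.6883)},
}
PAIRS = {"beta1": "or13", "beta2": "or23"}  # OR against the reference Brand 3

checks = {}
for b_name, or_name in PAIRS.items():
    b, o = BETA[b_name], ODDS[or_name]
    lower, upper = (math.exp(v) for v in b["equal_tail"])
    checks[or_name] = {
        "lower_gap": abs(lower - o["equal_tail"][0]),
        "upper_gap": abs(upper - o["equal_tail"][1]),
        "median_gap": abs(math.exp(b["median"]) - o["median"]),
        "mean_excess": o["mean"] - math.exp(b["mean"]),  # positive by Jensen
    }
    print(or_name, checks[or_name])
```

The same argument does not carry over to the HPD intervals, which are not preserved under nonlinear transformations.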

Figure 10 Multinomial Model Summary and Interval Statistics

The MCMC Procedure

   Posterior Summaries

                                    Standard            Percentiles
   Parameter       N       Mean    Deviation       25%       50%       75%
   beta1        2500     0.3822       0.1337    0.2957    0.3811    0.4718
   beta2        2500    -0.6502       0.1403   -0.7456   -0.6488   -0.5590
   or12         2500     2.8366       0.4054    2.5518    2.8079    3.1088
   or13         2500     1.4787       0.1982    1.3440    1.4638    1.6029
   or23         2500     0.5271       0.0743    0.4745    0.5227    0.5718

   Posterior Intervals

   Parameter    Alpha    Equal-Tail Interval        HPD Interval
   beta1        0.050     0.1135      0.6443     0.1081      0.6344
   beta2        0.050    -0.9297     -0.3736    -0.9357     -0.3828
   or12         0.050     2.1196      3.6740     2.0851      3.6101
   or13         0.050     1.1202      1.9047     1.1059      1.8741
   or23         0.050     0.3947      0.6883     0.3849      0.6714

References

Albert, J. H. and Chib, S. (1993), "Bayesian Analysis of Binary and Polychotomous Response Data," Journal of the American Statistical Association, 88(422), 669-679.

McCullagh, P. and Nelder, J. A. (1989), Generalized Linear Models, Second Edition, London: Chapman & Hall.