Weight Smoothing with Laplace Prior and Its Application in GLM Model


Xi Xia (1), Michael Elliott (1,2)
(1) Department of Biostatistics, (2) Survey Methodology Program, University of Michigan
National Cancer Institute Grant R01-CA129101
November 4, 2013

Outline

- Background
  - Weighting in Complex Survey Design
  - Weight Trimming
  - Bayesian Finite Population Inference
- Weight Smoothing with Laplace Prior
  - Weight Smoothing
  - Laplace Prior
- Simulation and Application
  - Simulation: Linear Regression
  - Application: Dioxin study from NHANES
- Conclusion and Discussion

Weighting in Complex Survey Design

When the target quantity of interest is correlated with the probabilities of inclusion, weighting by the inverse of the inclusion probabilities is a common way to eliminate or reduce bias. Examples are the Horvitz-Thompson estimators of the population total and mean:

$$\hat{Y}_{HT} = \sum_{i=1}^{n} \pi_i^{-1} Y_i, \qquad \hat{\mu}_{HT} = N^{-1} \sum_{i=1}^{n} \pi_i^{-1} Y_i$$

When the data are not closely associated with the probability of inclusion, incorporating weights increases the variance of the estimate because of the extra variability in the weights.
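As a minimal sketch of the two Horvitz-Thompson formulas, the snippet below computes the estimated total and mean from a small sample. The function name and the three-unit sample are hypothetical and chosen only for illustration.

```python
def horvitz_thompson(y, pi, N=None):
    """Horvitz-Thompson total, and mean when the population size N is given.

    y  : sampled outcomes Y_i
    pi : inclusion probabilities pi_i of the sampled units
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    # Each sampled unit is weighted by the inverse of its inclusion probability.
    total = sum(y_i / p_i for y_i, p_i in zip(y, pi))
    mean = total / N if N is not None else None
    return total, mean


# Hypothetical sample of three units drawn from a population of N = 14.
y = [10.0, 12.0, 30.0]
pi = [0.5, 0.5, 0.1]
ht_total, ht_mean = horvitz_thompson(y, pi, N=14)
print(ht_total, ht_mean)
```

The unit with inclusion probability 0.1 carries a weight of 10 and dominates the estimate, which is the source of the variance inflation described above.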

Weight Trimming

A common approach to coping with inflated estimation variance is weight trimming, or winsorization (Potter 1990, Kish 1992, Alexander et al. 1997).

- Concept: limit the variability in the weights by trimming extreme weights down to a threshold and redistributing the trimmed amount among the remaining units.
- Target: reduce variance at the cost of increased bias, leading to an overall reduction in RMSE.
- Examples: NAEP (Potter 1988), empirical MSE (Cox and McGrath 1981), exponential distribution method (Chowdhury et al. 2007).
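The trimming idea can be sketched as follows. This is one simple variant, not any of the cited procedures: weights above a cutoff are set to the cutoff and the excess is spread over the untrimmed units in proportion to their current weights, repeating until no weight exceeds the cutoff. The function name, the cutoff, and the example weights are assumptions made for illustration.

```python
def trim_weights(weights, cutoff, max_iter=100):
    """Trim weights at `cutoff` and redistribute the excess proportionally.

    The total of the weights is preserved. Weights are assumed positive.
    """
    w = list(weights)
    if cutoff * len(w) < sum(w):
        raise ValueError("cutoff too small to preserve the weight total")
    for _ in range(max_iter):
        excess = sum(w_i - cutoff for w_i in w if w_i > cutoff)
        if excess <= 1e-12:
            break
        w = [min(w_i, cutoff) for w_i in w]
        below = [i for i, w_i in enumerate(w) if w_i < cutoff]
        if not below:
            break
        base = sum(w[i] for i in below)
        # Spread the trimmed amount over the units still below the cutoff.
        for i in below:
            w[i] += excess * w[i] / base
    return w


# Hypothetical weights with one extreme value.
raw = [1.0, 1.0, 2.0, 2.0, 10.0]
trimmed = trim_weights(raw, cutoff=5.0)
print(trimmed)
```

The trimmed weights have the same total as the raw weights but a smaller spread, which is the variance-for-bias trade described in the slide.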

Bayesian Inference Approach

- Treat the unobserved (non-sampled) part of the population, $Y_{nob}$, as missing, and build a model $p(y \mid \theta)$ that captures the underlying data pattern.
- Estimate the quantity of interest $Q(Y)$, e.g. a population mean or slope, from the marginal posterior predictive distribution (Ericson 1969, Holt and Smith 1979, Little 1993):

$$p(Q(Y) \mid y) = \int f(Q(Y) \mid \theta)\, p(\theta \mid y)\, d\theta = \frac{\int f(Q(Y) \mid \theta)\, f(y \mid \theta)\, p(\theta)\, d\theta}{\int f(y \mid \theta)\, p(\theta)\, d\theta}$$

- Under an ignorable sampling design, $p(I \mid Y, \phi) = p(I \mid Y_{obs}, \phi)$, we have $p(Y_{nob} \mid Y_{obs}, I) = p(Y_{nob} \mid Y_{obs})$, which allows inference about $Q(Y)$ without explicitly modeling the sampling inclusion indicator $I$ (Ericson 1969, Holt and Smith 1979, Little 1993, Rubin 1987, Skinner et al. 1989).
- Sensible models still need to account for the sample design in both the likelihood and the prior model structure.

Incorporating Unequal Probabilities of Inclusion

- Pool sampled units with the same or similar probabilities of inclusion into strata indexed by $h = 1, \ldots, H$, and re-assign the weight as $w_h = N_h / n_h$, where $n_h$ is the sample size and $N_h$ the population size in weight stratum $h$.
- Model the data by

$$y_{hi} \mid \theta_h \sim f(y_{hi}; \theta_h), \qquad i = 1, \ldots, N_h,$$

  for all elements in the $h$th inclusion stratum, where $\theta_h$ allows for interaction between the model parameter(s) and the inclusion stratum $h$.
- A noninformative prior on $\theta_h$ reproduces a fully weighted analysis in the expectation of the posterior predictive distribution of $Q(Y)$.
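A small numerical check of the last point, for the special case of a population mean: if each stratum mean is estimated by its own sample mean (what a flat prior on a stratum-specific mean gives in expectation), the resulting population mean coincides with the estimate weighted by $w_h = N_h / n_h$. The two strata and their values below are hypothetical.

```python
# Hypothetical weight strata: (population size N_h, sampled outcomes y_hi).
strata = [
    (100, [2.0, 4.0]),
    (50, [10.0, 12.0, 14.0, 16.0]),
]

N = sum(N_h for N_h, _ in strata)

# Route 1: apply the stratum weight w_h = N_h / n_h to every sampled unit.
weighted_sum = 0.0
weight_total = 0.0
for N_h, y in strata:
    w_h = N_h / len(y)
    weighted_sum += w_h * sum(y)
    weight_total += w_h * len(y)
weighted_mean = weighted_sum / weight_total

# Route 2: predict every unit of stratum h by that stratum's sample mean.
stratified_mean = sum(N_h * (sum(y) / len(y)) for N_h, y in strata) / N

print(weighted_mean, stratified_mean)
```

Both routes give the same number, which is why letting the parameter vary freely by stratum is described as a fully weighted analysis.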

Weight Smoothing

Weight smoothing follows the idea of modeling the interaction between parameter and stratum, but treats the stratum means as random effects in a hierarchical model. This yields a shrinkage estimator that lies between the fully weighted estimate and the unweighted estimate.

Corresponding hierarchical model:

$$Y_{hi} \overset{iid}{\sim} N(\mu_h, \sigma^2), \qquad \mu \sim N_H(\phi, G),$$

where $\mu = (\mu_1, \ldots, \mu_H)$, $\phi = (\phi_1, \ldots, \phi_H)$, and $h = 1, \ldots, H$ indexes the weight strata.

The posterior mean of the population mean is

$$E(\bar{Y} \mid y) = \sum_{h=1}^{H} \left[ n_h \bar{y}_h + (N_h - n_h)\, \hat{\mu}_h \right] / N,$$

with $\hat{\mu}_h = E(\mu_h \mid y)$.
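The shrinkage behaviour can be illustrated with a deliberately simplified version of the model. The sketch assumes a known $\sigma^2$, a common known prior mean `phi0` for every stratum, and $G = \tau^2 I$; none of these simplifications come from the slides, and the data are hypothetical. Under them $\hat{\mu}_h$ is a precision-weighted average of $\bar{y}_h$ and `phi0`.

```python
def posterior_mean_population_mean(strata, sigma2, tau2, phi0):
    """E(Ybar | y) = sum_h [n_h * ybar_h + (N_h - n_h) * mu_hat_h] / N.

    Simplifying assumptions (not from the slides): sigma2 known,
    G = tau2 * I, and a common known prior mean phi0.
    """
    N = sum(N_h for N_h, _ in strata)
    total = 0.0
    for N_h, y in strata:
        n_h = len(y)
        ybar_h = sum(y) / n_h
        prec_data = n_h / sigma2
        prec_prior = 1.0 / tau2
        # Conjugate normal update: precision-weighted average.
        mu_hat_h = (prec_data * ybar_h + prec_prior * phi0) / (prec_data + prec_prior)
        total += n_h * ybar_h + (N_h - n_h) * mu_hat_h
    return total / N


# Hypothetical weight strata: (population size N_h, sampled outcomes).
strata = [(100, [2.0, 4.0]), (50, [10.0, 12.0, 14.0, 16.0])]
all_y = [v for _, y in strata for v in y]
unweighted = sum(all_y) / len(all_y)

no_shrinkage = posterior_mean_population_mean(strata, 4.0, 1e12, unweighted)
full_shrinkage = posterior_mean_population_mean(strata, 4.0, 1e-12, unweighted)
partial = posterior_mean_population_mean(strata, 4.0, 1.0, unweighted)
print(no_shrinkage, partial, full_shrinkage)
```

A very large prior variance recovers the fully weighted estimate, a very small one recovers the unweighted sample mean (when `phi0` is set to it), and intermediate values fall in between.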

Weight Smoothing for Generalized Linear Models

To extend the weight smoothing model to GLMs:

- Basic form of a GLM:

$$f(y_i \mid \theta_i, \phi) = \exp\left[ \frac{y_i \theta_i - b(\theta_i)}{a_i(\phi)} + c(y_i, \phi) \right]$$

- Link function:

$$g(E(y_i \mid \theta_i)) = g(\mu_i) = g(b'(\theta_i)) = \eta_i = x_i^T \beta$$

- Random effect $\beta$:

$$(\beta_1^T, \ldots, \beta_H^T)^T \mid \beta^{*}, G \sim N_{HP}(\beta^{*}, G)$$

- Population quantity $B$ approximated by the solution $\hat{B}$ of

$$\sum_{h=1}^{H} W_h \sum_{i=1}^{n_h} \frac{\left( \hat{y}_{hi} - g^{-1}(\mu_{hi}(\hat{B})) \right) x_{hi}}{V(\mu_{hi}(\hat{B}))\, g'(\mu_{hi}(\hat{B}))} = 0$$
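As a worked instance of the GLM form above, consider a Bernoulli outcome with the canonical logit link, which is an assumption made here for illustration. Writing $\theta_i = \log\{\mu_i / (1 - \mu_i)\}$,

$$f(y_i \mid \theta_i) = \exp\left[ y_i \theta_i - \log(1 + e^{\theta_i}) \right], \qquad b(\theta_i) = \log(1 + e^{\theta_i}), \quad a_i(\phi) = 1, \quad c(y_i, \phi) = 0,$$

$$\mu_i = b'(\theta_i) = \frac{e^{\theta_i}}{1 + e^{\theta_i}}, \qquad V(\mu_i) = \mu_i (1 - \mu_i), \qquad g'(\mu_i) = \frac{1}{\mu_i (1 - \mu_i)}.$$

The product $V(\mu_i)\, g'(\mu_i)$ equals 1, so the denominator of the estimating equation drops out. With the fitted population-level mean written as $\operatorname{expit}(x_{hi}^T \hat{B})$, the equation reduces to

$$\sum_{h=1}^{H} W_h \sum_{i=1}^{n_h} \left( \hat{y}_{hi} - \operatorname{expit}(x_{hi}^T \hat{B}) \right) x_{hi} = 0.$$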

Laplace Prior

Inspired by the choice of a Laplace prior in the Bayesian LASSO (Park & Casella 2008), we apply a Laplace prior in the weight smoothing model.

Comparison between the normal prior and the Laplace prior:

- Normal prior:

$$p(\beta \mid \sigma^2) = \prod_{j=1}^{p} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\beta_j^2 / (2\sigma^2)}$$

- Conditional Laplace prior:

$$p(\beta \mid \sigma^2) = \prod_{j=1}^{p} \frac{\lambda}{2\sqrt{\sigma^2}}\, e^{-\lambda |\beta_j| / \sqrt{\sigma^2}}$$

We expect to gain robustness by switching from an L2 constraint to an L1 constraint.
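The L2 versus L1 contrast is easiest to see on the negative log scale, where the normal prior penalizes $\beta_j^2$ and the Laplace prior penalizes $|\beta_j|$. The sketch below evaluates both negative log priors; the function names and the values of $\sigma^2$ and $\lambda$ are illustrative assumptions.

```python
import math


def neg_log_normal_prior(beta, sigma2):
    """Negative log of the normal prior: quadratic (L2) penalty in beta."""
    p = len(beta)
    return 0.5 * p * math.log(2 * math.pi * sigma2) + sum(b * b for b in beta) / (2 * sigma2)


def neg_log_laplace_prior(beta, sigma2, lam):
    """Negative log of the conditional Laplace prior: absolute (L1) penalty."""
    sigma = math.sqrt(sigma2)
    p = len(beta)
    return -p * math.log(lam / (2 * sigma)) + lam * sum(abs(b) for b in beta) / sigma


# Cost of moving a single coefficient from 1 to 2 under each prior
# (sigma2 = 1 and lam = 1 are illustrative values).
l2_step = neg_log_normal_prior([2.0], 1.0) - neg_log_normal_prior([1.0], 1.0)
l1_step = neg_log_laplace_prior([2.0], 1.0, 1.0) - neg_log_laplace_prior([1.0], 1.0, 1.0)
print(l2_step, l1_step)
```

The L2 penalty grows with the size of the coefficient while the L1 penalty grows at a constant rate, so large stratum-specific deviations are penalized less heavily under the Laplace prior.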

Laplace Prior (continued)

The absolute value in the Laplace density causes problems in optimization. The problem is solved by rewriting the Laplace distribution as a scale mixture of normals with an exponential mixing density (Andrews and Mallows 1974):

$$\frac{\alpha}{2}\, e^{-\alpha |z|} = \int_0^\infty \frac{1}{\sqrt{2\pi s}}\, e^{-z^2/(2s)}\; \frac{\alpha^2}{2}\, e^{-\alpha^2 s / 2}\, ds$$

The Laplace prior then becomes a two-level hierarchical model:

$$(\beta_1^T, \ldots, \beta_H^T)^T \mid \beta^{*}_h, D_\tau, \sigma^2 \sim \mathrm{MVN}(\beta^{*}_h, \sigma^2 D_{\tau h})$$

$$\sigma^2, \tau_1^2, \ldots, \tau_{Hp}^2 \sim \frac{1}{\sigma^2} \prod_{j=1}^{Hp} \frac{\lambda^2}{2}\, e^{-\lambda^2 \tau_j^2 / 2}$$
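The scale-mixture identity can be checked by simulation: draw $s$ from an exponential distribution with rate $\alpha^2/2$, then $z \mid s \sim N(0, s)$, and compare the moments of $z$ with those of a Laplace distribution with rate $\alpha$, for which $E|z| = 1/\alpha$ and $\mathrm{Var}(z) = 2/\alpha^2$. The value of $\alpha$, the number of draws, and the seed below are arbitrary choices.

```python
import math
import random


def sample_laplace_via_mixture(alpha, n, seed=0):
    """Draw n values of z by mixing normals over an exponential variance."""
    rng = random.Random(seed)
    rate = alpha ** 2 / 2.0
    draws = []
    for _ in range(n):
        s = rng.expovariate(rate)                    # s ~ Exp(rate = alpha^2 / 2)
        draws.append(rng.gauss(0.0, math.sqrt(s)))   # z | s ~ N(0, s)
    return draws


alpha = 1.5
z = sample_laplace_via_mixture(alpha, 200_000)
mean_abs = sum(abs(v) for v in z) / len(z)
second_moment = sum(v * v for v in z) / len(z)

print("E|z|   simulated:", mean_abs, " Laplace:", 1 / alpha)
print("E[z^2] simulated:", second_moment, " Laplace:", 2 / alpha ** 2)
```

Because every step of the mixture is a standard distribution, the same construction is what makes the conditional updates of the hierarchical model tractable.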

Weight Smoothing with Laplace Prior

The overall hierarchical model for weight smoothing with a Laplace prior is:

$$y_{hi} \mid x_{hi}, \beta_h, \sigma^2 \sim N(x_{hi}^T \beta_h, \sigma^2)$$

$$(\beta_1^T, \ldots, \beta_H^T)^T \mid \beta^{*}_h, D_\tau, \sigma^2 \sim \mathrm{MVN}(\beta^{*}_h, \sigma^2 D_{\tau h})$$

$$\beta^{*}_h \mid \sigma_0^2 \sim \mathrm{MVN}(0, \sigma_0^2 I_p)$$

$$D_{\tau h} = \mathrm{diag}(\tau_{h1}^2, \ldots, \tau_{hp}^2)$$

$$\sigma^2, \tau_1^2, \ldots, \tau_{Hp}^2 \sim \frac{1}{\sigma^2} \prod_{j=1}^{Hp} \frac{\lambda^2}{2}\, e^{-\lambda^2 \tau_j^2 / 2}$$

$$\lambda^2 \sim \mathrm{Gamma}(\gamma = 1, \delta = 1.78)$$

Closed forms exist for all full conditional distributions, so the model can be sampled with Gibbs steps.
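To make the layers of the hierarchy concrete, the sketch below draws once from the prior, top to bottom. It is a forward simulation, not the Gibbs sampler, and it rests on several assumptions that are not stated in the slides: $\delta$ is read as a rate parameter, $\sigma^2$ and $\sigma_0^2$ are fixed at arbitrary values (the $1/\sigma^2$ prior is improper and cannot be sampled), and a single prior mean vector is shared by all strata, which is one reading of the notation.

```python
import math
import random


def draw_from_laplace_hierarchy(H, p, sigma2=1.0, sigma0_2=100.0,
                                gamma_shape=1.0, delta_rate=1.78, seed=0):
    """One forward draw of (lambda^2, beta*, tau^2, beta) from the prior.

    Assumptions: delta is a rate, sigma2 and sigma0_2 are fixed, and the
    prior mean beta* is common to all H strata.
    """
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    lam2 = rng.gammavariate(gamma_shape, 1.0 / delta_rate)
    beta_star = [rng.gauss(0.0, math.sqrt(sigma0_2)) for _ in range(p)]
    # tau^2_{hj} ~ Exp(rate = lambda^2 / 2), the exponential mixing density.
    tau2 = [[rng.expovariate(lam2 / 2.0) for _ in range(p)] for _ in range(H)]
    # beta_{hj} | beta*_j, tau^2_{hj}, sigma^2 ~ N(beta*_j, sigma^2 * tau^2_{hj}).
    beta = [[rng.gauss(beta_star[j], math.sqrt(sigma2 * tau2[h][j]))
             for j in range(p)] for h in range(H)]
    return lam2, beta_star, tau2, beta


lam2, beta_star, tau2, beta = draw_from_laplace_hierarchy(H=3, p=2)
for h, b in enumerate(beta, start=1):
    print("stratum", h, "coefficients", b)
```

Marginally over $\tau^2$, each stratum coefficient is Laplace-distributed around the prior mean, which is what ties this hierarchy back to the L1 penalty.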

Simulation: Population Setting

$$y_i \mid x_i, \beta, \sigma^2 \sim N\Big( \beta_0 + \sum_{h=1}^{20} \beta_h (x_i - h)_+,\ \sigma^2 \Big)$$

$$x_i \sim \mathrm{UNI}(0, 10), \qquad i = 1, \ldots, N = 20{,}000$$

Coefficient settings:

1. $\beta_a = (0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)$
2. $\beta_b = (0, 0, 0, 0, 0, 0, .5, .5, .5, .5, 1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4)$
3. $\beta_c = (0, 22, 4, 4, 2, 2, 2, 2, 1, 1, 1, 1, .5, .5, .5, .5, 0, 0, 0, 0, 0)$

Variance: $\sigma^2 = 10^l$, $l = 1, 3, 5$.

Inclusion probabilities: $P(I_i \mid H_i) = \pi_i \propto (1 + H_i/15)\, H_i$, with $H_i = [2X_i]/2$.
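A rough sketch of how such a population and an unequal-probability sample might be generated. Several details are assumptions rather than facts from the slides: the bracket in $H_i$ is read as a ceiling, the inclusion probabilities are scaled so that the expected sample size is $n$, and units are selected by Poisson sampling (independent Bernoulli draws), which yields a random rather than fixed sample size. The slides do not say which sampling scheme was actually used.

```python
import math
import random


def simulate_population(beta, sigma2, N=20_000, seed=0):
    """Generate (x, y, H) for N units from the spline mean model."""
    rng = random.Random(seed)
    population = []
    for _ in range(N):
        x = rng.uniform(0.0, 10.0)
        # Mean function: beta_0 + sum_h beta_h * (x - h)_+ for h = 1..20.
        mean = beta[0] + sum(beta[h] * max(x - h, 0.0) for h in range(1, 21))
        y = rng.gauss(mean, math.sqrt(sigma2))
        H = math.ceil(2 * x) / 2   # assumption: [2x]/2 read as a ceiling
        population.append((x, y, H))
    return population


def poisson_sample(population, n=1000, seed=1):
    """Poisson sampling with pi_i proportional to (1 + H_i / 15) * H_i."""
    rng = random.Random(seed)
    size = [(1 + H / 15) * H for _, _, H in population]
    total = sum(size)
    sample = []
    for unit, s in zip(population, size):
        pi = n * s / total         # scaled so the expected sample size is n
        if rng.random() < pi:
            sample.append((unit, pi))
    return sample


beta_a = [0.0, 2.0] + [0.0] * 19   # first coefficient setting from the slide
pop = simulate_population(beta_a, sigma2=10.0)
sample = poisson_sample(pop, n=1000)
print(len(pop), len(sample))
```

Units with larger $x$ fall in strata with larger $H$ and are sampled more often, so the sample over-represents the upper end of the covariate range.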

Goal, Sampling and Simulation Details

- Goal: estimate the population slope $B$
- Sample size: $n = 1000$
- Simulation count: 200
- Data-based prior for $\beta$
- 50,000 iterations with 10,000 burn-in
- Compare weight smoothing with Laplace prior (HWS) with the unweighted estimate (UWT), the fully weighted estimate (FWT), and weight smoothing with exchangeable random effects (XRS):

$$y_{hi} \mid x_{hi}, \beta_h, \sigma^2 \sim N(x_{hi}^T \beta_h, \sigma^2)$$

$$(\beta_1^T, \ldots, \beta_H^T)^T \mid \beta^{*}, \Sigma \sim \mathrm{MVN}(\beta^{*}, \Sigma)$$

$$p(\sigma, \beta^{*}, \Sigma) \propto \sigma^{-2}\, |\Sigma|^{-(p + 1/2)} \exp\left( -\tfrac{1}{2}\, \mathrm{tr}\{ 2\Sigma^{-1} \} \right)$$

Simulation Result

Table 1: RMSE relative to the fully weighted estimator (coverage of the nominal 95% CI, in percent, in parentheses). Columns give the coefficient setting and $l = \log_{10}$ of the variance.

| Estimator | β_a, l=1 | β_a, l=3 | β_a, l=5 | β_b, l=1 | β_b, l=3 | β_b, l=5 | β_c, l=1 | β_c, l=3 | β_c, l=5 |
|-----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
| UWT | 0.73 (95) | 0.69 (95) | 0.72 (96) | 10.20 (0) | 2.44 (2) | 0.73 (95) | 6.23 (0) | 2.18 (1) | 0.76 (90) |
| FWT | 1 (94) | 1 (93) | 1 (96) | 1 (100) | 1 (92) | 1 (96) | 1 (100) | 1 (97) | 1 (95) |
| XRS | 1.49 (99) | 0.72 (96) | 0.72 (96) | 1.01 (95) | 2.21 (94) | 1.20 (94) | 1.87 (6) | 2.05 (1) | 0.76 (91) |
| HWS | 1.05 (96) | 0.98 (95) | 0.77 (99) | 0.45 (97) | 0.95 (96) | 0.77 (99) | 0.34 (85) | 0.94 (96) | 0.77 (99) |
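The two summaries reported in the table, relative RMSE and interval coverage, can be computed from simulation output along the following lines. The function names and the toy numbers are hypothetical and are not results from the study.

```python
import math


def rmse(estimates, truth):
    """Root mean squared error of a list of estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def relative_rmse(estimates, reference_estimates, truth):
    """RMSE of an estimator divided by the RMSE of a reference estimator."""
    return rmse(estimates, truth) / rmse(reference_estimates, truth)


def coverage(intervals, truth):
    """Percentage of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return 100.0 * hits / len(intervals)


# Hypothetical output of four simulation replicates with true slope 2.0.
truth = 2.0
fwt_estimates = [1.0, 3.0, 2.0, 2.0]
hws_estimates = [1.5, 2.5, 2.0, 2.0]
hws_intervals = [(1.0, 3.0), (2.5, 3.5), (0.0, 2.0), (1.9, 2.1)]

rel = relative_rmse(hws_estimates, fwt_estimates, truth)
cov = coverage(hws_intervals, truth)
print(rel, cov)
```

A relative RMSE below 1 means the estimator beats the fully weighted estimator, which by construction has relative RMSE 1 in every column.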

Application: Dioxin study from NHANES

We present the performance of the weight smoothing model with Laplace prior on dioxin data from the 2003-2004 NHANES study.

- The target is to estimate the linear effect of age and gender on log TCDD in blood.
- Altogether 1250 individuals were sampled from 25 strata, with 2 masked variance units (MVUs) each.
- Readings below the measurement threshold are handled with multiple imputation, resulting in 5 replicates.

Table 2: Relative RMSE for Dioxin study

| Model | Coefficient | UWT | FWT | HWS |
|-------|-------------|-----|-----|-----|
| Age only | Age | 0.840 | 1 | 0.312 |
| Gender only | Gender | 1.960 | 1 | 0.953 |
| Age and Gender | Age | 0.846 | 1 | 0.315 |
| Age and Gender | Gender | 1.464 | 1 | 0.919 |
| Age and Gender Interaction | Age | 1.412 | 1 | 0.770 |
| Age and Gender Interaction | Gender | 0.488 | 1 | 0.393 |
| Age and Gender Interaction | Interaction | 0.448 | 1 | 0.364 |

Conclusion and Discussion

- By applying a Laplace prior, the weight smoothing model obtains a robust estimator with a less complicated structure, leading to a faster algorithm.
- Bayesian finite population inference provides more than just a shrinkage estimator between the fully weighted and unweighted estimates. In some situations it can provide an estimate with a smaller overall RMSE than both.
- Extensions to GLMs (logistic regression) have been done.
  - Smaller savings in RMSE (10-15%).
  - Coverage similar to the fully weighted estimator (both substantially undercover when the weight/slope correlation is weak).
- The gain in RMSE sometimes comes at the cost of a moderate drop in 95% coverage. It is worth exploring the mechanism by which the model reduces RMSE, and the limits of the scenarios under which it still maintains reasonable coverage.