Negative Binomial Regression Analysis And other count models


Asst. Prof. Nikom Thanomsieng
Department of Biostatistics & Demography, Faculty of Public Health, Khon Kaen University
Email: nikom@kku.ac.th
Web: http://home.kku.ac.th/nikom

Outline:
- Negative binomial regression
- Problem of zero counts
- Zero-inflated Poisson (zip)
- Zero-inflated negative binomial (zinb)
- Comparison of models
- Test of comparative fit
- Other count data models

2014 Department of Biostatistics & Demography, Faculty of Public Health, Khon Kaen University

Negative Binomial Regression (NB)

The earliest definitions of the negative binomial are based on the binomial PDF.

- NB2 (Cameron and Trivedi, 1986): NB2 is derived from a Poisson-gamma mixture distribution.
- NB1: the NB1 model can also be derived as a form of Poisson-gamma mixture, but with different properties, resulting in a linear variance.

The negative binomial model, as a Poisson-gamma mixture model, is appropriate when the overdispersion in an otherwise Poisson model is thought to take the form of a gamma shape or distribution.

A more general class of negative binomial models has mean $\mu$ and variance function $\mu + \alpha\mu^{p}$: NB2 sets $p = 2$ and NB1 sets $p = 1$.

Negative Binomial Regression (NB2)

The NB2 model, with $p = 2$, is the standard formulation of the negative binomial model. Its variance function is $\mu + \alpha\mu^{2}$, and it has density

$$f(y \mid \mu, \alpha) = \frac{\Gamma(y + \alpha^{-1})}{\Gamma(y + 1)\,\Gamma(\alpha^{-1})} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}} \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y}, \qquad \alpha \ge 0,\; y = 0, 1, 2, \ldots$$

This reduces to the Poisson if $\alpha = 0$.

The log-likelihood function for NB2, with $\mu_i = \exp(x_i'\beta)$, is

$$\ln L(\alpha, \beta) = \sum_{i=1}^{n} \left\{ \left( \sum_{j=0}^{y_i - 1} \ln(j + \alpha^{-1}) \right) - \ln y_i! - (y_i + \alpha^{-1}) \ln\bigl(1 + \alpha \exp(x_i'\beta)\bigr) + y_i \ln \alpha + y_i x_i'\beta \right\}$$
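As a quick numerical check of the NB2 density, the sketch below evaluates it with the standard library only and confirms that the probabilities sum to one, that the variance is close to μ + αμ², and that the probabilities approach the Poisson ones as α shrinks. The values μ = 3 and α = 0.5 are illustrative assumptions, not taken from the slides, and the function names are hypothetical.

```python
import math


def nb2_pmf(y, mu, alpha):
    """NB2 probability of count y for mean mu and dispersion alpha > 0."""
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_p)


def poisson_pmf(y, mu):
    """Poisson probability of count y for mean mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


mu, alpha = 3.0, 0.5  # illustrative values (assumption)
probs = [nb2_pmf(y, mu, alpha) for y in range(400)]
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
print(total, mean, var)  # the variance should be close to mu + alpha * mu**2

# With a very small alpha the NB2 probabilities are nearly Poisson.
gap = max(abs(nb2_pmf(y, mu, 1e-6) - poisson_pmf(y, mu)) for y in range(20))
print(gap)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for large counts.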
Negative Binomial Regression (NB2): Example

A comparison of financial performance, organizational characteristics and management strategy among rural & urban facilities (Smith, HL., Piland, NF. & Fisher, N. J. Rural Health, 27-40, 1992).

Sample: licensed nursing facilities, n = 52

- bed = number of beds in home
- tdays = annual total patient days (in hundreds)
- pcrev = annual total patient care revenue (in $ millions)
- nsal = annual nursing salaries (in $ millions)
- fexp = annual facilities expenditures (in $ millions)
- rural = (1 = rural; 0 = nonrural)

Negative Binomial Regression (NB2): nbreg

. nbreg bed pcrev nsal fexp rural pn pf nf, exp(tdays) d(mean)

Fitting Poisson model:

Negative binomial regression                    Number of obs   =         52
                                                LR chi2(7)      =      17.60
Dispersion     = mean                           Prob > chi2     =     0.0139
Log likelihood = -223.23966                     Pseudo R2       =     0.0379

------------------------------------------------------------------------------
         bed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |  -.3868934   .1543459    -2.51   0.012    -.6894058   -.0843809
        nsal |   .1556637   .9194312     0.17   0.866    -1.646388    1.957716
        fexp |   1.429801   .5117771     2.79   0.005     .4267365    2.432866
       rural |  -.1193119   .0704735    -1.69   0.090    -.2574375    .0188137
          pn |   .3323483   .2933881     1.13   0.257    -.2426818    .9073784
          pf |   .7531993   .5164349     1.46   0.145    -.2589945    1.765393
          nf |   -4.56582    2.00498    -2.28   0.023    -8.495509   -.6361308
       _cons |  -.9103272   .1988939    -4.58   0.000    -1.300152   -.5205023
       tdays | (exposure)
-------------+----------------------------------------------------------------
    /lnalpha |  -3.505601   .2714876                     -4.037707   -2.973495
-------------+----------------------------------------------------------------
       alpha |   .0300287   .0081524                      .0176379    .0511243
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 82.36  Prob>=chibar2 = 0.000

Negative Binomial Regression (NB2): glm

. glm bed pcrev nsal fexp rural pn pf nf, exp(tdays) f(nb .0300287) l(log)

Iteration 0:   log likelihood = -223.40458
Iteration 1:   log likelihood = -223.23965
Iteration 2:   log likelihood = -223.23965

Generalized linear models                       No. of obs      =         52
Optimization     : ML                           Residual df     =         44
                                                Scale parameter =          1
Deviance         =  52.3722456                  (1/df) Deviance =   1.190278
Pearson          =  57.5930065                  (1/df) Pearson  =   1.308932

Variance function: V(u) = u+(.0300287)u^2       [Neg. Binomial]
Link function    : g(u) = ln(u)                 [Log]

                                                AIC             =   8.893833
Log likelihood   = -223.23965                   BIC             =  -121.4825

------------------------------------------------------------------------------
             |                 OIM
         bed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |  -.3868933   .1543257    -2.51   0.012    -.6893661   -.0844204
        nsal |   .1556692   .9194152     0.17   0.866    -1.646352     1.95769
        fexp |   1.429802   .5116407     2.79   0.005     .4270048    2.432599
       rural |  -.1193121   .0704696    -1.69   0.090    -.2574299    .0188057
          pn |   .3323467   .2933803     1.13   0.257    -.2426681    .9073615
          pf |   .7532008   .5163957     1.46   0.145    -.2589161    1.765318
          nf |  -4.565827   2.004979    -2.28   0.023    -8.495514   -.6361409
       _cons |  -.9103282   .1988345    -4.58   0.000    -1.300037   -.5206197
       tdays | (exposure)
------------------------------------------------------------------------------

Negative Binomial Regression (NB2): glm (Stata 11+)

. glm bed pcrev nsal fexp rural pn pf nf, exp(tdays) f(nb ml) l(log)

Iteration 0:   log likelihood = -223.40459
Iteration 1:   log likelihood = -223.23966
Iteration 2:   log likelihood = -223.23966

Generalized linear models                       No. of obs      =         52
Optimization     : ML                           Residual df     =         44
                                                Scale parameter =          1
Deviance         =  52.3722233                  (1/df) Deviance =   1.190278
Pearson          =  57.59299049                 (1/df) Pearson  =   1.308932

Variance function: V(u) = u+(.03)u^2            [Neg. Binomial]
Link function    : g(u) = ln(u)                 [Log]

                                                AIC             =   8.893833
Log likelihood   = -223.239656                  BIC             =  -121.4825

------------------------------------------------------------------------------
             |                 OIM
         bed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |  -.3868931   .1543258    -2.51   0.012    -.6893661   -.0844201
        nsal |   .1556643   .9194159     0.17   0.866    -1.646358    1.957686
        fexp |   1.429801   .5116407     2.79   0.005     .4270039    2.432599
       rural |  -.1193121   .0704696    -1.69   0.090    -.2574298    .0188059
          pn |   .3323478   .2933805     1.13   0.257    -.2426674     .907363
          pf |   .7531989   .5163961     1.46   0.145    -.2589187    1.765316
          nf |  -4.565819    2.00498    -2.28   0.023    -8.495507   -.6361303
       _cons |  -.9103275   .1988346    -4.58   0.000    -1.300036   -.5206188
   ln(tdays) |          1  (exposure)
------------------------------------------------------------------------------
Note: Negative binomial parameter estimated via ML and treated as fixed once estimated.

Negative Binomial Regression (NB2): Interpretation using the rate

Methods of interpretation based on $E(y \mid x)$:

$$\mathrm{IRR} = \frac{E(y \mid x, x_k + \delta)}{E(y \mid x, x_k)} = e^{\beta_k \delta}$$

The interpretation: for a change of $\delta$ in $x_k$, the expected count changes by a factor of $\exp(\beta_k \times \delta)$, holding all other variables constant.

For specific values of $\delta$:

- Factor change. For a unit change in $x_k$, the expected count changes by a factor of $\exp(\beta_k)$, holding all other variables constant.
- Standardized factor change. For a standard deviation change in $x_k$, the expected count changes by a factor of $\exp(\beta_k \times s_k)$, holding all other variables constant.

Negative Binomial Regression (NB2): Interpretation using percentage

Alternatively, compute the percentage change in the expected count for a $\delta$ unit change in $x_k$, holding other variables constant:

$$100 \times \frac{E(y \mid x, x_k + \delta) - E(y \mid x, x_k)}{E(y \mid x, x_k)} = 100 \times [\exp(\beta_k \times \delta) - 1]$$

The interpretation: for a unit change in $x_k$, the expected count increases (decreases) by $n\% = [\exp(\beta_k) - 1] \times 100$, holding all other variables constant.

Negative Binomial Regression (NB2): Interpretation using the rate

. nbreg bed pcrev nsal fexp rural pn pf nf, exp(tdays) d(mean) irr

Negative binomial regression                    Number of obs   =         52
                                                LR chi2(7)      =      17.60
Dispersion     = mean                           Prob > chi2     =     0.0139
Log likelihood = -223.23965                     Pseudo R2       =     0.0379

------------------------------------------------------------------------------
         bed |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |   .6791633   .1048261    -2.51   0.012     .5018741    .9190808
        nsal |   1.168439   1.074299     0.17   0.866     .1927459    7.083157
        fexp |   4.177871   2.138139     2.79   0.005      1.53225    11.39149
       rural |   .8875309   .0625474    -1.69   0.090     .7730299    1.018992
          pn |   1.394237   .4090522     1.13   0.257     .7845205    2.477814
          pf |   2.123788   1.096798     1.46   0.145     .7718291    5.843878
          nf |   .0104013   .0208543    -2.28   0.023     .0002044     .529334
       tdays | (exposure)
-------------+----------------------------------------------------------------
    /lnalpha |  -3.505601   .2714876                     -4.037707   -2.973495
-------------+----------------------------------------------------------------
       alpha |   .0300287   .0081524                      .0176379    .0511243
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 82.36  Prob>=chibar2 = 0.000
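The columns of the nbreg tables are linked by simple arithmetic, which the following sketch reproduces for the pcrev row. It assumes the pcrev coefficient is -0.3868934 with standard error 0.1543459, as read from the coefficient table, and uses the delta-method approximation (IRR times the coefficient's standard error) for the IRR standard error. The function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(b, se, z_crit=1.959964):
    """Wald z, two-sided p-value, 95% CI and IRR quantities for one coefficient."""
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    lo, hi = b - z_crit * se, b + z_crit * se
    irr = math.exp(b)
    return {
        "z": z,
        "p": p,
        "ci": (lo, hi),
        "irr": irr,
        "irr_se": irr * se,                   # delta-method standard error
        "irr_ci": (math.exp(lo), math.exp(hi)),
    }


# pcrev row, values assumed from the nbreg coefficient table
pcrev = wald_summary(-0.3868934, 0.1543459)
for name, value in pcrev.items():
    print(name, value)
```

The IRR confidence limits are the exponentiated limits of the coefficient interval, which is why the IRR interval is not symmetric around the IRR.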

Negative Binomial Regression (NB2): Interpretation using the rate

. nbreg bed pcrev nsal fexp rural pn pf nf, exp(tdays) d(mean) irr
. listcoef, help

nbreg (N=52): Factor Change in Expected Count

Observed SD: 40.852732

----------------------------------------------------------------------
         bed |       b         z     P>|z|     e^b   e^bStdX     SDofX
-------------+--------------------------------------------------------
       pcrev |  -0.38689   -2.507    0.012   0.6792   0.7635    0.6974
        nsal |   0.15567    0.169    0.866   1.1684   1.0262    0.1659
        fexp |   1.42980    2.794    0.005   4.1779   1.3214    0.1949
       rural |  -0.11931   -1.693    0.090   0.8875   0.9443    0.4804
          pn |   0.33235    1.133    0.257   1.3942   1.1790    0.4954
          pf |   0.75320    1.458    0.145   2.1238   1.3894    0.4366
          nf |  -4.56583   -2.277    0.023   0.0104   0.5918    0.1149
-------------+--------------------------------------------------------
    ln alpha |  -3.50560
       alpha |   0.03003   SE(alpha) = 0.00815
----------------------------------------------------------------------
LR test of alpha=0: 82.36   Prob>=LRX2 = 0.000

      b = raw coefficient
    e^b = exp(b) = factor change in expected count for unit increase in X
e^bStdX = exp(b*SD of X) = change in expected count for SD increase in X
  SDofX = standard deviation of X

. nbreg bed pcrev nsal fexp rural pn pf nf, exp(tdays) d(m) irr
. listcoef, help percent

nbreg (N=52): Percentage Change in Expected Count

Observed SD: 40.852732

----------------------------------------------------------------------
         bed |       b         z     P>|z|      %      %StdX     SDofX
-------------+--------------------------------------------------------
       pcrev |  -0.38689   -2.507    0.012   -32.1    -23.6     0.6974
        nsal |   0.15567    0.169    0.866    16.8      2.6     0.1659
        fexp |   1.42980    2.794    0.005   317.8     32.1     0.1949
       rural |  -0.11931   -1.693    0.090   -11.2     -5.6     0.4804
          pn |   0.33235    1.133    0.257    39.4     17.9     0.4954
          pf |   0.75320    1.458    0.145   112.4     38.9     0.4366
          nf |  -4.56583   -2.277    0.023   -99.0    -40.8     0.1149
-------------+--------------------------------------------------------
    ln alpha |  -3.50560
       alpha |   0.03003   SE(alpha) = 0.00815
----------------------------------------------------------------------
LR test of alpha=0: 82.36   Prob>=LRX2 = 0.000

      b = raw coefficient
      % = percent change in expected count for unit increase in X
  %StdX = percent change in expected count for SD increase in X
  SDofX = standard deviation of X

Negative Binomial Regression (NB2): Interpretation using the rate

Interpretation based on the incidence rate ratio: a one-unit ($1 million) increase in annual total patient care revenue changes the expected number of beds in the home by a factor of 0.6792 (a decrease), holding all other variables constant.
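The listcoef columns follow directly from the raw coefficient. As a minimal sketch, assuming the pcrev values b = -0.38689 and SDofX = 0.6974 read from the listcoef output, the factor and percentage changes can be recomputed as below. The helper names are hypothetical.

```python
import math


def factor_change(b, delta=1.0):
    """Factor change in the expected count for a change of delta in x."""
    return math.exp(b * delta)


def percent_change(b, delta=1.0):
    """Percentage change in the expected count for a change of delta in x."""
    return 100.0 * (math.exp(b * delta) - 1.0)


b, sd_x = -0.38689, 0.6974  # pcrev values assumed from the listcoef output

unit_factor = factor_change(b)          # e^b column
std_factor = factor_change(b, sd_x)     # e^bStdX column
unit_percent = percent_change(b)        # % column
std_percent = percent_change(b, sd_x)   # %StdX column
print(unit_factor, std_factor, unit_percent, std_percent)
```

Setting `delta` to the standard deviation of the predictor gives the standardized versions, so one pair of functions covers all four columns.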
Interpretation based on percentage: a one-unit ($1 million) increase in annual total patient care revenue decreases the expected number of beds in the home by 32.1%, holding all other variables constant.

Negative Binomial Regression (NB1)

The NB1 model can also be derived as a form of Poisson-gamma mixture, but with different properties, resulting in a linear variance. The NB1 model, which sets $p = 1$, is also of interest because it has the same variance function, $(1 + \alpha)\mu = \phi\mu$, as that used in the GLM approach.

The NB1 log-likelihood function is

$$\ln L(\alpha, \beta) = \sum_{i=1}^{n} \left\{ \left( \sum_{j=0}^{y_i - 1} \ln\bigl(j + \alpha^{-1} \exp(x_i'\beta)\bigr) \right) - \ln y_i! - \bigl(y_i + \alpha^{-1} \exp(x_i'\beta)\bigr) \ln(1 + \alpha) + y_i \ln \alpha \right\}$$

Negative Binomial Regression (NB1): nbreg

. nbreg bed pcrev nsal fexp rural pn pf nf, exp(tdays) d(c)

Fitting Poisson model:
Iteration 0:   log likelihood = -264.43404
Iteration 4:   log likelihood = -223.70024

Negative binomial regression                    Number of obs   =         52
                                                LR chi2(7)      =      14.50
Dispersion     = constant                       Prob > chi2     =     0.0430
Log likelihood = -223.70024                     Pseudo R2       =     0.0314

------------------------------------------------------------------------------
         bed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |  -.3177338   .1380812    -2.30   0.021    -.5883681   -.0470996
        nsal |   .2634129   .9281847     0.28   0.777    -1.555796    2.082622
        fexp |   1.345714   .5563743     2.42   0.016     .2552406    2.436188
       rural |  -.1166414   .0692708    -1.68   0.092    -.2524096    .0191268
          pn |   .2374021   .2853126     0.83   0.405    -.3218002    .7966045
          pf |   .6281851   .4937371     1.27   0.203    -.3395219    1.595892
          nf |  -4.031638   1.836357    -2.20   0.028    -7.630831   -.4324443
       _cons |  -.9878807   .2124139    -4.65   0.000    -1.404204   -.5715572
       tdays | (exposure)
-------------+----------------------------------------------------------------
    /lndelta |   1.014998   .2637996                      .4979601    1.532035
-------------+----------------------------------------------------------------
       delta |   2.759357   .7279173                      1.645361    4.627587
------------------------------------------------------------------------------
Likelihood-ratio test of delta=0:  chibar2(01) = 81.44  Prob>=chibar2 = 0.000

Negative Binomial Regression (NB2): glm, f(nb)

. glm bed pcrev nsal fexp rural pn pf nf, exp(tdays) f(nb) l(log)

Iteration 0:   log likelihood = -284.6605
Iteration 1:   log likelihood = -284.6569
Iteration 2:   log likelihood = -284.6569

Generalized linear models                       No. of obs      =         52
Optimization     : ML                           Residual df     =         44
                                                Scale parameter =          1
Deviance         =  2.219059843                 (1/df) Deviance =   .0504332
Pearson          =  2.461198                    (1/df) Pearson  =   .0559363

Variance function: V(u) = u+(1)u^2              [Neg. Binomial]
Link function    : g(u) = ln(u)                 [Log]

                                                AIC             =    11.2560
Log likelihood   = -284.656904                  BIC             =  -171.6357

------------------------------------------------------------------------------
             |                 OIM
         bed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       pcrev |  -.3972157   .7698816    -0.52   0.606    -1.906156    1.111724
        nsal |    .133111   4.408084     0.03   0.976    -8.506575    8.772797
        fexp |   1.350175   2.449433     0.55   0.581    -3.450627    6.150976
       rural |  -.1159449   .3485077    -0.33   0.739    -.7990075    .5671176
          pn |   .3367189   1.443701     0.23   0.816    -2.492884    3.166321
          pf |   .7881231   2.510264     0.31   0.754    -4.131904     5.70815
          nf |  -4.532711   9.898894    -0.46   0.647    -23.93419    14.86876
       _cons |  -.8858721   .9548721    -0.93   0.354    -2.757387    .9856429
       tdays | (exposure)
------------------------------------------------------------------------------
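The fit statistics in the glm headers can be reproduced by hand, which helps when comparing the fixed-alpha fits. The sketch below assumes the definitions AIC = (-2 lnL + 2k)/N and BIC = Deviance - df ln(N), which match the values printed for the f(nb .0300287) fit (log likelihood -223.23965, deviance 52.3722456, Pearson 57.5930065, N = 52, k = 8 parameters). The function name is hypothetical.

```python
import math


def glm_fit_stats(loglik, deviance, pearson, n_obs, n_params):
    """Header statistics of a GLM fit, under the definitions assumed above."""
    resid_df = n_obs - n_params
    return {
        "aic": (-2.0 * loglik + 2.0 * n_params) / n_obs,
        "bic": deviance - resid_df * math.log(n_obs),
        "deviance_df": deviance / resid_df,   # (1/df) Deviance
        "pearson_df": pearson / resid_df,     # (1/df) Pearson dispersion
    }


# Values assumed from the glm fit with alpha fixed at .0300287
nb2 = glm_fit_stats(-223.23965, 52.3722456, 57.5930065, n_obs=52, n_params=8)
for name, value in nb2.items():
    print(name, round(value, 6))
```

A Pearson dispersion near 1 (about 1.31 here) suggests the chosen alpha absorbs most of the overdispersion, whereas the fit with alpha fixed at 1 gives a value near 0.06 and a much lower log likelihood.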

Problem of Zero in Counts Model

Problem of zero counts: count response models with far more zeros than expected under the distributional assumptions of the Poisson and negative binomial models give incorrect and biased results.

- Incorrect parameter estimates
- Biased standard errors
- A cause of overdispersion

Zero-Inflated Poisson Regression Model

Zero-Inflated Poisson (ZIP): zero-inflated count models were first introduced by Lambert (1992) to provide another method of accounting for excessive zero counts. ZIP models are two-part models, consisting of both binary and count model sections; they model the zero counts using both a binary and a count process.

Let the response $Y_i$ denote a non-negative integer count for the $i$th observation, $i = 1, \ldots, N$.

Probability of Zero-Inflated Poisson

The probability of an excess zero is denoted by $\pi_i$, $0 \le \pi_i \le 1$. The random variable $Y_i$ follows a ZIP distribution if

$$\Pr(Y_i = y) = \begin{cases} \pi_i + (1 - \pi_i)\, e^{-\lambda_i}, & y = 0 \\ (1 - \pi_i)\, \dfrac{e^{-\lambda_i} \lambda_i^{y}}{y!}, & y = 1, 2, \ldots \end{cases}$$

$$E(Y_i) = (1 - \pi_i)\lambda_i; \qquad Var(Y_i) = (1 - \pi_i)(\lambda_i + \pi_i \lambda_i^{2})$$

Zero-Inflated Negative Binomial Model

Zero-Inflated Negative Binomial (ZINB): let the response $Y_i$ denote a non-negative integer count for the $i$th observation, $i = 1, \ldots, N$. Then the ZINB distribution is

$$\Pr(Y_i = y) = \begin{cases} \pi_i + (1 - \pi_i)(1 + \kappa\lambda_i)^{-1/\kappa}, & y = 0 \\ (1 - \pi_i)\, \dfrac{\Gamma(y + 1/\kappa)}{y!\,\Gamma(1/\kappa)}\, \dfrac{(\kappa\lambda_i)^{y}}{(1 + \kappa\lambda_i)^{y + 1/\kappa}}, & y = 1, 2, \ldots \end{cases}$$

$E(Y_i) = (1 - \pi_i)\lambda_i$ and $Var(Y_i) = (1 - \pi_i)\lambda_i\,(1 + (\kappa + \pi_i)\lambda_i)$, where $\kappa$ is an overdispersion parameter.

ZIP & ZINB: example

Example: synthetic NB2 data, Stata (Hilbe, 2011)

. tab y

-> tabulation of y

          y |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     20,596       41.19       41.19
          1 |     12,657       25.31       66.51
          2 |      7,126       14.25       80.76
          3 |      4,012        8.02       88.78
          4 |      2,270        4.54       93.32
          5 |      1,335        2.67       95.99
          6 |        781        1.56       97.55
          7 |        479        0.96       98.51
          8 |        278        0.56       99.07
          9 |        175        0.35       99.42
         10 |        106        0.21       99.63
         11 |         60        0.12       99.75
         12 |         38        0.08       99.83
         13 |         29        0.06       99.88
         14 |         20        0.04       99.92
         15 |         15        0.03       99.95
         24 |          1        0.00      100.00
------------+-----------------------------------
      Total |     50,000      100.00
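The ZIP and ZINB moment formulas can be checked numerically from the probability functions. The sketch below is a minimal standard-library illustration with assumed values λ = 2, π = 0.3 and κ = 0.75 (not taken from the slides). The negative binomial part is parameterized with mean λ and variance λ + κλ², and all function names are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def nb_pmf(y, lam, kappa):
    """Negative binomial with mean lam and variance lam + kappa * lam**2."""
    inv = 1.0 / kappa
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             - inv * math.log(1.0 + kappa * lam)
             + y * (math.log(kappa * lam) - math.log(1.0 + kappa * lam)))
    return math.exp(log_p)


def zero_inflate(pmf, pi):
    """Mix a point mass at zero (weight pi) with a count distribution."""
    return lambda y: (pi if y == 0 else 0.0) + (1.0 - pi) * pmf(y)


def mean_var(pmf, upper=500):
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


lam, pi, kappa = 2.0, 0.3, 0.75  # illustrative values (assumption)
zip_mean, zip_var = mean_var(zero_inflate(lambda y: poisson_pmf(y, lam), pi))
zinb_mean, zinb_var = mean_var(zero_inflate(lambda y: nb_pmf(y, lam, kappa), pi))

print(zip_mean, zip_var)    # compare with (1-pi)*lam and (1-pi)*(lam + pi*lam**2)
print(zinb_mean, zinb_var)  # compare with (1-pi)*lam*(1 + (kappa+pi)*lam)
```

Both models share the mean (1 − π)λ; the extra κ term in the ZINB variance is what lets it absorb overdispersion beyond that caused by the excess zeros.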

Zero-Inflated Poisson Model

Zero-Inflated Poisson example: zip

. zip y x1 x2, inflate(x1 x2)

Fitting constant-only model:
Iteration 0:   log likelihood =  -9379.43
Iteration 4:   log likelihood = -84524.083

Fitting full model:
Iteration 0:   log likelihood = -84524.083
Iteration 4:   log likelihood = -81687.514

Zero-inflated Poisson regression                Number of obs   =      50000
Inflation model = logit                         LR chi2(2)      =    5673.14
Log likelihood  = -81687.51                     Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
          x1 |   .6277332   .0160432    39.13   0.000     .5962891    .6591774
          x2 |  -1.069268   .0169615   -63.04   0.000    -1.102512   -1.036024
       _cons |   .8106343   .0120212    67.43   0.000     .7870732    .8341954
-------------+----------------------------------------------------------------
inflate      |
          x1 |  -.4551536   .0487874    -9.33   0.000    -.5507751    -.359532
          x2 |   .7108517   .0498155    14.27   0.000     .6132151    .8084883
       _cons |  -1.036955   .0364881   -28.42   0.000     -1.10847   -.9654402
------------------------------------------------------------------------------

Zero-Inflated Negative Binomial Model

Zero-Inflated Negative Binomial example: zinb

. zinb y x1 x2, inflate(x1 x2)

Zero-inflated negative binomial regression      Number of obs   =      50000
Inflation model = logit                         LR chi2(2)      =    3733.39
Log likelihood  = -78723.3                      Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
          x1 |   .7371282   .0232153    31.75   0.000     .6916272    .7826293
          x2 |  -1.254607   .0230818   -54.35   0.000    -1.299847   -1.209368
       _cons |   .5108007   .0166787    30.63   0.000      .478111    .5434904
-------------+----------------------------------------------------------------
inflate      |
          x1 |  -4.334255   3.705392    -1.17   0.242    -11.59669    2.928179
          x2 |   3.058956   2.039212     1.50   0.134    -.9378257    7.055738
       _cons |  -5.402738   1.821935    -2.97   0.003    -8.973665   -1.831811
-------------+----------------------------------------------------------------
    /lnalpha |  -.2915168   .0183391   -15.90   0.000    -.3274608   -.2555728
-------------+----------------------------------------------------------------
       alpha |   .7471295   .0137017                      .7207516    .7744728
------------------------------------------------------------------------------

Zero-Inflated Poisson Model (ZIP): Interpretation

- Interpretation based on the Poisson model: the Poisson part contains coefficients for the factor change in expected count for those in the Not Always Zero group. The coefficients can be interpreted in the same way as coefficients from the Poisson regression model.
- Interpretation based on the binary logit model: the binary logit part contains coefficients for the factor change in the odds of being in the Always Zero group compared with the Not Always Zero group. The coefficients are interpreted in the same way as coefficients for a binary logit model.

Zero-Inflated Negative Binomial Model (ZINB): Interpretation

- Interpretation based on the negative binomial model: the NB part contains coefficients for the factor change in expected count for those in the Not Always Zero group. The coefficients can be interpreted in the same way as coefficients from the negative binomial model.
- Interpretation based on the binary logit model: the binary logit part contains coefficients for the factor change in the odds of being in the Always Zero group compared with the Not Always Zero group. The coefficients are interpreted in the same way as coefficients for a binary logit model.

Zero-Inflated Poisson Model (ZIP): Example Interpretation

. zip y x1 x2, inflate(x1 x2)
. listcoef, help

zip (N=50000): Factor Change in Expected Count

Observed SD: 1.8759835

Count Equation: Factor Change in Expected Count for Those Not Always 0

----------------------------------------------------------------------
           y |       b         z     P>|z|     e^b   e^bStdX     SDofX
-------------+--------------------------------------------------------
          x1 |   0.62773   39.128    0.000   1.8734   1.1990    0.2892
          x2 |  -1.06927  -63.041    0.000   0.3433   0.7336    0.2898
----------------------------------------------------------------------
      b = raw coefficient
    e^b = exp(b) = factor change in expected count for unit increase in X
e^bStdX = exp(b*SD of X) = change in expected count for SD increase in X
  SDofX = standard deviation of X

Binary Equation: Factor Change in Odds of Always 0

----------------------------------------------------------------------
     Always0 |       b         z     P>|z|     e^b   e^bStdX     SDofX
-------------+--------------------------------------------------------
          x1 |  -0.45515   -9.329    0.000   0.6344   0.8767    0.2892
          x2 |   0.71085   14.270    0.000   2.0357   1.2287    0.2898
----------------------------------------------------------------------
      b = raw coefficient
    e^b = exp(b) = factor change in odds for unit increase in X
e^bStdX = exp(b*SD of X) = change in odds for SD increase in X
  SDofX = standard deviation of X

Zero-Inflated Poisson Model (ZIP): Example Interpretation

. listcoef, help percent

zip (N=50000): Percentage Change in Expected Count

Observed SD: 1.8759835

Count Equation: Percentage Change in Expected Count for Those Not Always 0

----------------------------------------------------------------------
           y |       b         z     P>|z|      %      %StdX     SDofX
-------------+--------------------------------------------------------
          x1 |   0.62773   39.128    0.000    87.3     19.9     0.2892
          x2 |  -1.06927  -63.041    0.000   -65.7    -26.6     0.2898
----------------------------------------------------------------------
      b = raw coefficient
      % = percent change in expected count for unit increase in X
  %StdX = percent change in expected count for SD increase in X
  SDofX = standard deviation of X

Binary Equation: Factor Change in Odds of Always 0

----------------------------------------------------------------------
     Always0 |       b         z     P>|z|      %      %StdX     SDofX
-------------+--------------------------------------------------------
          x1 |  -0.45515   -9.329    0.000   -36.6    -12.3     0.2892
          x2 |   0.71085   14.270    0.000   103.6     22.9     0.2898
----------------------------------------------------------------------
      b = raw coefficient
      % = percent change in odds for unit increase in X
  %StdX = percent change in odds for SD increase in X
  SDofX = standard deviation of X
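To see how the two parts of a ZIP fit combine, the sketch below turns a set of coefficients into the Poisson mean λ, the Always Zero probability π, the probability of observing a zero, and the expected count. The coefficient values are assumptions read from the zip output (count part: _cons 0.8106343, x1 0.6277332, x2 -1.069268; inflate part: _cons -1.036955, x1 -0.4551536, x2 0.7108517), and the function name `zip_predict` is hypothetical.

```python
import math

# Coefficients assumed from the zip output
count_b = {"_cons": 0.8106343, "x1": 0.6277332, "x2": -1.069268}
inflate_g = {"_cons": -1.036955, "x1": -0.4551536, "x2": 0.7108517}


def linear_index(coefs, x1, x2):
    return coefs["_cons"] + coefs["x1"] * x1 + coefs["x2"] * x2


def zip_predict(x1, x2):
    """Combine the count (log link) and inflate (logit link) equations."""
    lam = math.exp(linear_index(count_b, x1, x2))
    pi = 1.0 / (1.0 + math.exp(-linear_index(inflate_g, x1, x2)))
    return {
        "lam": lam,                                   # mean for Not Always Zero
        "pi": pi,                                     # Pr(Always Zero)
        "p_zero": pi + (1.0 - pi) * math.exp(-lam),   # Pr(y = 0)
        "mean": (1.0 - pi) * lam,                     # E(y)
    }


base = zip_predict(0.0, 0.0)
x1_up = zip_predict(1.0, 0.0)
x2_up = zip_predict(0.0, 1.0)

count_factor_x1 = x1_up["lam"] / base["lam"]


def odds(p):
    return p / (1.0 - p)


odds_factor_x2 = odds(x2_up["pi"]) / odds(base["pi"])
print(base)
print(count_factor_x1, odds_factor_x2)
```

The two ratios at the end correspond to the e^b columns of listcoef: one is a factor change in the expected count among the Not Always Zero group, the other a factor change in the odds of being Always Zero.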

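Zero-Inflated Negative Binomial Model (ZINB): Example Interpretation

. zinb y x1 x2, inflate(x1 x2)
. listcoef, help

zinb (N=50000): Factor Change in Expected Count

Observed SD: 1.8759835

Count Equation: Factor Change in Expected Count for Those Not Always 0

----------------------------------------------------------------------
           y |       b         z     P>|z|     e^b   e^bStdX     SDofX
-------------+--------------------------------------------------------
          x1 |   0.73713   31.752    0.000   2.0899   1.2376    0.2892
          x2 |  -1.25461  -54.355    0.000   0.2852   0.6952    0.2898
-------------+--------------------------------------------------------
    ln alpha |  -0.29152
       alpha |   0.74713   SE(alpha) = 0.01370
----------------------------------------------------------------------
      b = raw coefficient
    e^b = exp(b) = factor change in expected count for unit increase in X
e^bStdX = exp(b*SD of X) = change in expected count for SD increase in X
  SDofX = standard deviation of X

Binary Equation: Factor Change in Odds of Always 0

----------------------------------------------------------------------
     Always0 |       b         z     P>|z|     e^b   e^bStdX     SDofX
-------------+--------------------------------------------------------
          x1 |  -4.33425   -1.170    0.242   0.0131   0.2855    0.2892
          x2 |   3.05896    1.500    0.134  21.3053   2.4263    0.2898
----------------------------------------------------------------------
      b = raw coefficient
    e^b = exp(b) = factor change in odds for unit increase in X
e^bStdX = exp(b*SD of X) = change in odds for SD increase in X
  SDofX = standard deviation of X

Zero-Inflated Negative Binomial Model (ZINB): Example Interpretation

. zinb y x1 x2, inflate(x1 x2)
. listcoef, help percent

zinb (N=50000): Percentage Change in Expected Count

Observed SD: 1.8759835

Count Equation: Percentage Change in Expected Count for Those Not Always 0

----------------------------------------------------------------------
           y |       b         z     P>|z|      %      %StdX     SDofX
-------------+--------------------------------------------------------
          x1 |   0.73713   31.752    0.000   109.0     23.8     0.2892
          x2 |  -1.25461  -54.355    0.000   -71.5    -30.5     0.2898
-------------+--------------------------------------------------------
    ln alpha |  -0.29152
       alpha |   0.74713   SE(alpha) = 0.01370
----------------------------------------------------------------------
      b = raw coefficient
      % = percent change in expected count for unit increase in X
  %StdX = percent change in expected count for SD increase in X
  SDofX = standard deviation of X

Binary Equation: Factor Change in Odds of Always 0

----------------------------------------------------------------------
     Always0 |       b         z     P>|z|      %      %StdX     SDofX
-------------+--------------------------------------------------------
          x1 |  -4.33425   -1.170    0.242   -98.7    -71.4     0.2892
          x2 |   3.05896    1.500    0.134  2030.5    142.6     0.2898
----------------------------------------------------------------------
      b = raw coefficient
      % = percent change in odds for unit increase in X
  %StdX = percent change in odds for SD increase in X
  SDofX = standard deviation of X

Test of Comparative Fit

Test comparative: Vuong test. The standard fit test for ZINB is the Vuong test (Vuong, 1989).

- Comparison of standard Poisson & ZIP
- Comparison of ZINB & ZIP

$$V = \frac{\sqrt{n}\,\bar{u}}{SD(u)}, \qquad u_i = \ln\!\left[\frac{P_{ZIP}(y_i \mid x_i)}{P_{ZINB}(y_i \mid x_i)}\right]$$

where $\bar{u}$ is the mean and $SD(u)$ the standard deviation of $u$.

Comparative test: standard Poisson vs. zero-inflated Poisson (ZIP)

. zip y x1 x2, inflate(x1 x2) vuong

Fitting constant-only model:

Zero-inflated Poisson regression                Number of obs   =      50000
Inflation model = logit                         LR chi2(2)      =    5673.14
Log likelihood  = -81687.51                     Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
          x1 |   .6277332   .0160432    39.13   0.000     .5962891    .6591774
          x2 |  -1.069268   .0169615   -63.04   0.000    -1.102512   -1.036024
       _cons |   .8106343   .0120212    67.43   0.000     .7870732    .8341954
-------------+----------------------------------------------------------------
inflate      |
          x1 |  -.4551536   .0487874    -9.33   0.000    -.5507751    -.359532
          x2 |   .7108517   .0498155    14.27   0.000     .6132151    .8084883
       _cons |  -1.036955   .0364881   -28.42   0.000     -1.10847   -.9654402
------------------------------------------------------------------------------
Vuong test of zip vs. standard Poisson:            z =    39.0  Pr>z = 0.0000

Comparative test: zero-inflated negative binomial (ZINB) vs. NB

. zinb y x1 x2, inflate(x1 x2) vuong zip

Zero-inflated negative binomial regression      Number of obs   =      50000
Inflation model = logit                         LR chi2(2)      =    3733.39
Log likelihood  = -78723.3                      Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
          x1 |   .7371282   .0232153    31.75   0.000     .6916272    .7826293
          x2 |  -1.254607   .0230818   -54.35   0.000    -1.299847   -1.209368
       _cons |   .5108007   .0166787    30.63   0.000      .478111    .5434904
-------------+----------------------------------------------------------------
inflate      |
          x1 |  -4.334255   3.705392    -1.17   0.242    -11.59669    2.928179
          x2 |   3.058956   2.039212     1.50   0.134    -.9378257    7.055738
       _cons |  -5.402738   1.821935    -2.97   0.003    -8.973665   -1.831811
-------------+----------------------------------------------------------------
    /lnalpha |  -.2915168   .0183391   -15.90   0.000    -.3274608   -.2555728
-------------+----------------------------------------------------------------
       alpha |   .7471295   .0137017                      .7207516    .7744728
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 5928.42 Pr>=chibar2 = 0.0000
Vuong test of zinb vs. standard negative binomial: z =     0.86  Pr>z = 0.1954

Comparison of Models

Comparison model: graph & statistics across models

- Summary statistics across models: BIC, AIC, likelihood ratio test, Vuong test
- Graph: difference between the observed and predicted probability for the PRM, NB2, ZIP & ZINB models (Long & Freese, 2006)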
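The Vuong statistic itself is simple to compute once each model's predicted probability of the observed count is available. The sketch below is a generic illustration, not Stata's implementation: it takes two lists of per-observation log probabilities (model 1 and model 2), uses the sample standard deviation with an n − 1 divisor (an assumption), and returns V together with the one-sided normal p-value. The toy log probabilities are made up for illustration.

```python
import math


def vuong_statistic(logp_model1, logp_model2):
    """V = sqrt(n) * mean(u) / sd(u), with u_i = ln P1(y_i|x_i) - ln P2(y_i|x_i)."""
    u = [a - b for a, b in zip(logp_model1, logp_model2)]
    n = len(u)
    mean_u = sum(u) / n
    sd_u = math.sqrt(sum((ui - mean_u) ** 2 for ui in u) / (n - 1))
    return math.sqrt(n) * mean_u / sd_u


def upper_tail_p(z):
    """One-sided p-value Pr(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Toy per-observation log probabilities (hypothetical, for illustration only)
logp_a = [-1.0, -1.0, -1.0]
logp_b = [-2.0, -3.0, -4.0]

v_ab = vuong_statistic(logp_a, logp_b)
v_ba = vuong_statistic(logp_b, logp_a)
p_for_086 = upper_tail_p(0.86)
print(v_ab, v_ba, p_for_086)
```

Large positive values of V favor model 1, large negative values favor model 2, and values near zero favor neither, which is how a small z for zinb vs. standard negative binomial is read.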

Comparison model: countfit (graph & statistics across models)

- Summary statistics across models: BIC, AIC, likelihood ratio test, Vuong test
- Graph: difference between the observed and predicted probability for the PRM, NB2, ZIP & ZINB models

. countfit y x1 x2, gen(base_) inflate(x1 x2) maxcount(10) ///
      prm nbreg zip zinb nodash

Comparison of Mean Observed and Predicted Count

               Maximum       At       Mean
Model          Difference    Value    |Diff|
---------------------------------------------
Base_PRM          0.24         0      0.029
Base_NBRM        -0.04         2      0.005
Base_ZIP          0.069        1      0.06
Base_ZINB        -0.04         2      0.005

Tests and Fit Statistics

Base_PRM        BIC= -1311.572   AIC=     3.566   Prefer   Over   Evidence
---------------------------------------------------------------------------
  vs Base_NBRM  BIC= -1466.037   dif=   154.465   NBRM     PRM    Very strong
                AIC=     3.249   dif=     0.317   NBRM     PRM
                LRX2=  160.680   prob=    0.000   NBRM     PRM    p=0.000
---------------------------------------------------------------------------
  vs Base_ZIP   BIC= -1387.037   dif=    75.466   ZIP      PRM    Very strong
                AIC=     3.390   dif=     0.176   ZIP      PRM
                Vuong=   3.963   prob=    0.000   ZIP      PRM    p=0.000
---------------------------------------------------------------------------
  vs Base_ZINB  BIC= -1448.970   dif=   137.399   ZINB     PRM    Very strong
                AIC=     3.258   dif=     0.309   ZINB     PRM
---------------------------------------------------------------------------
Base_NBRM       BIC= -1466.037   AIC=     3.249   Prefer   Over   Evidence
---------------------------------------------------------------------------
  vs Base_ZIP   BIC= -1387.037   dif=   -78.999   NBRM     ZIP    Very strong
                AIC=     3.390   dif=    -0.141   NBRM     ZIP
---------------------------------------------------------------------------
  vs Base_ZINB  BIC= -1448.970   dif=   -17.067   NBRM     ZINB   Very strong
                AIC=     3.258   dif=    -0.009   NBRM     ZINB
                Vuong=   0.520   prob=    0.302   ZINB     NBRM   p=0.302
---------------------------------------------------------------------------
Base_ZIP        BIC= -1387.037   AIC=     3.390   Prefer   Over   Evidence
---------------------------------------------------------------------------
  vs Base_ZINB  BIC= -1448.970   dif=    61.933   ZINB     ZIP    Very strong
                AIC=     3.258   dif=     0.132   ZINB     ZIP
                LRX2=   68.147   prob=    0.000   ZINB     ZIP    p=0.000
---------------------------------------------------------------------------

Comparison model: zinb (Vuong test)

. zinb y x1 x2, inflate(x1 x2) vuong zip

Fitting zip model:

Zero-inflated negative binomial regression      Number of obs   =        500
                                                Nonzero obs     =        304
                                                Zero obs        =        196
Inflation model = logit                         LR chi2(2)      =       4.68
Log likelihood  = -807.458                      Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
          x1 |   .7905583   .1924543     4.11   0.000     .4133548    1.167762
          x2 |  -1.352218   .1952302    -6.93   0.000    -1.734862   -.9695734
       _cons |   .5679291   .1385531     4.10   0.000     .2963701    .8394882
-------------+----------------------------------------------------------------
inflate      |
          x1 |   24.14261   23.66368     1.02   0.308    -22.23736    70.52257
          x2 |  -18.07713   19.19718    -0.94   0.346    -55.70292    19.54865
       _cons |  -23.10625   22.25758    -1.04   0.299    -66.73031    20.51781
-------------+----------------------------------------------------------------
    /lnalpha |  -.3324529   .1445162    -2.30   0.021    -.6156994   -.0492064
-------------+----------------------------------------------------------------
       alpha |   .7171625   .1036416                      .5402629    .9519846
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 68.15  Pr>=chibar2 = 0.0000
Vuong test of zinb vs. standard negative binomial: z =     0.52  Pr>z = 0.3016

Other Count Data Models

Zero & other count data models:

- Zero-truncated Poisson & zero-truncated negative binomial
- Truncated Poisson & truncated negative binomial
- Hurdle model (Mullahy, 1986) or zero-altered model (zap & zanb)
- Censored Poisson & censored negative binomial
- Generalized Poisson regression
- Generalized negative binomial
- etc.

Reference: Negative Binomial & other Count Models

Agresti, A. (2002). Categorical Data Analysis. John Wiley & Sons. New York.
Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge University Press. New York.
Cameron, A.C. and Trivedi, P.K. (1990). Regression-based tests for overdispersion in the Poisson model. J. Econometrics, 46, 347-364.
Dean, C. B. (1992). Testing for overdispersion in Poisson and binomial regression models. J. Am. Statist. Assoc., 87, 451-457.
Dean, C. and Lawless, J. F. (1989). Tests for detecting overdispersion in Poisson regression models. J. Am. Statist. Assoc., 84, 467-472.
Fleiss, J.L., Levin, B., & Paik, M.C. (2003). Statistical Methods for Rates and Proportions. 3rd edition. John Wiley & Sons. New York.
Greene, W.H. (2003). Econometric Analysis. 5th edition. Prentice Hall. New Jersey.
Hilbe, J.M. (2007). Negative Binomial Regression. Cambridge University Press. New York.
Thanomsieng, N. (2007). overtest.ado STATA ado file: Overdispersion test. Available at http://home.kku.ac.th/nikom