Modeling Counts & ZIP: Extended Example
Carolyn J. Anderson
Department of Educational Psychology
University of Illinois at Urbana-Champaign
Outline
- A little exploratory analysis.
- Revised models.
- Zero-inflated models (something new).
The data are from: Espelage, D.L., Holt, M.K., & Henkel, R.R. (2004). Examination of peer-group contextual effects on aggression during early adolescence. Child Development, 74.

Two ways to measure bullying:
- Self report: the 9-item Illinois Bully Scale (Espelage & Holt, 2001).
- Peer nominations: kids list everyone whom they view as a bully. The total number of nominations a child receives is a measure of that child's bullying.

Peer nominations are more objective than self report, but it's getting harder to obtain IRB approval for peer nominations.

Model peer nominations (a count) with the self-report measure (bully scale) as a predictor variable... ignoring clustering...
[Figures: The Predictor Variable]
SAS for these

    data bullynom;
      input BULLYSC BULLYNM;
      datalines;
    run;
SAS for Distribution of Peer Nominations

    options nogstyle;
    ods select Quantiles MyHist;
    proc univariate data=bullynom;
      var BULLYNM;
      histogram BULLYNM / cfill=ltgray midpoints= name='MyHist';
      inset n='Sample Size' mean='Mean' std='Standard Deviation' / position=ne;
      title 'Distribution of Number of Bully Nominations';
    run;
    options gstyle;
SAS for Bully Scale

    options nogstyle;
    ods select Quantiles BullyScaleHist;
    proc univariate data=bullynom;
      var bullysc;
      histogram bullysc / lognormal gamma cfill=ltgray name='BullyScaleHist';
      inset n='Sample Size' mean='Mean' std='Standard Deviation'
            min='Minimum' max='Maximum' / position=ne;
      inset lognormal gamma / position=e;
      title 'Distribution of the Self Report Scale of Bullyness';
    run;
    options gstyle;
[Figure: Relationship Between the Measures]
We Know So Far That...
- Both variables are highly positively skewed.
- There are a lot of kids who did not receive any peer nominations.
- There does appear to be a relationship between peer nominations and scale score.
- The mean number of peer nominations (2.49) is much smaller than the variance.
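The mean-versus-variance comparison can be reproduced with a few lines of SAS. This is a minimal sketch that assumes the `bullynom` data set and the `BULLYNM` variable from the earlier data step; it is not taken from the slides.

```sas
/* Compare the sample mean and variance of the count outcome.      */
/* A variance well above the mean points to overdispersion         */
/* relative to a Poisson distribution, whose mean equals variance. */
proc means data=bullynom n mean var;
  var BULLYNM;
run;
```

If the variance is far larger than the mean, a plain Poisson model is likely to understate the variability in the counts.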
Starting Model:
- Random component: $Y_{ij}$ = the number of nominations received by kid $i$ in peer group $j$, with a Poisson distribution.
- Linear predictor: $\beta_0 + \beta_1(\text{bullysc})_{ij} = \beta_0 + \beta_1 x_{ij}$
- The link is the log, the canonical link.

The initial model is a standard Poisson regression model,
$$E(Y_{ij}) = \mu_{ij} = \exp[\beta_0 + \beta_1 x_{ij}]$$
where
$$P(Y_{ij} = y) = \frac{e^{-\mu_{ij}}\,\mu_{ij}^{y}}{y!}$$
[Figure: Fit of Poisson Regression Model (model fit and then grouped to look at fit)]
[Figure: Fit of Marginal Distribution]
To deal with the overdispersion, we'll change the random component to Negative Binomial. The GLM has
- Random component: Negative Binomial
- Linear predictor: $\beta_0 + \beta_1 x_{ij}$
- Link: log

Our next model is
$$Y_{ij} = \mu_{ij}\,\epsilon_{ij} = \underbrace{\exp[\beta_0 + \beta_1 x_{ij}]}_{\text{Poisson}}\;\underbrace{\epsilon_{ij}}_{\text{Gamma}}$$
where $E(\epsilon_{ij}) = 1$ and $\mathrm{var}(\epsilon_{ij}) = 1/\phi$ ($\phi$ is the dispersion parameter),
$$E(Y_{ij} \mid x_{ij}) = \mu_{ij} = \exp[\beta_0 + \beta_1 x_{ij}]$$
$$\mathrm{var}(Y_{ij} \mid x_{ij}) = \mu_{ij} + \mu_{ij}^{2}/\phi$$
and
$$P(Y_{ij} = y) = \frac{\Gamma(y + \phi)}{y!\,\Gamma(\phi)} \left(\frac{\phi}{\phi + \mu_{ij}}\right)^{\phi} \left(\frac{\mu_{ij}}{\phi + \mu_{ij}}\right)^{y}$$
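The variance formula follows from the mixture structure. The sketch below assumes the usual reading of the shorthand above, namely that $Y_{ij}$ given $\epsilon_{ij}$ is Poisson with mean $\mu_{ij}\epsilon_{ij}$ and that $\epsilon_{ij}$ is Gamma with mean 1 and variance $1/\phi$; the conditioning on $x_{ij}$ is left implicit.

```latex
\begin{aligned}
E(Y_{ij}) &= E\big[E(Y_{ij}\mid\epsilon_{ij})\big]
           = E(\mu_{ij}\epsilon_{ij}) = \mu_{ij}, \\
\mathrm{var}(Y_{ij}) &= E\big[\mathrm{var}(Y_{ij}\mid\epsilon_{ij})\big]
           + \mathrm{var}\big[E(Y_{ij}\mid\epsilon_{ij})\big] \\
          &= E(\mu_{ij}\epsilon_{ij}) + \mathrm{var}(\mu_{ij}\epsilon_{ij})
           = \mu_{ij} + \mu_{ij}^{2}/\phi .
\end{aligned}
```

The extra term $\mu_{ij}^{2}/\phi$ is what lets the negative binomial absorb variance beyond the Poisson's; as $\phi \to \infty$ it vanishes and the Poisson variance is recovered.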
Fit Statistics & Parameter Estimates

df = 289 for all of these

    Dist      Link    G^2    X^2    X^2/df    AIC    BIC
    Poisson   log
    NegBin    log

                 Poisson                      Negative Binomial
    Parm     est.   se   Wald   p         est.   se   Wald   p
    β0                          <.01                         <.01
    β1       0.81               <.01      1.09               <.01
    1/φ

For interpretation, exp(.81) = 2.25 and exp(1.09) = 2.98
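Why exponentiate the slope? Under the log link, a one-unit increase in the predictor multiplies the expected count by a constant factor. A short worked step, using only the model definition $\mu(x) = \exp[\beta_0 + \beta_1 x]$:

```latex
\frac{\mu(x+1)}{\mu(x)}
  = \frac{\exp[\beta_0 + \beta_1 (x+1)]}{\exp[\beta_0 + \beta_1 x]}
  = e^{\beta_1},
\qquad
e^{0.81} \approx 2.25, \quad e^{1.09} \approx 2.98 .
```

So each additional point on the bully scale multiplies the expected number of nominations by about 2.25 under the Poisson fit and about 2.98 under the negative binomial fit.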
[Figure: Fit of Negative Binomial Model to Data]
[Figure: Fit of Marginal Distribution]
Change of the Link Function

The relationship between $Y_{ij}$ and $x_{ij}$ looks like a straight line...

The new GLM:
- Random component: Negative Binomial
- Linear predictor: $\beta_0 + \beta_1 x_{ij}$
- Identity link function

This model is
$$Y_{ij} = \mu_{ij}\,\epsilon_{ij} = \underbrace{(\beta_0 + \beta_1 x_{ij})}_{\text{Poisson}}\;\underbrace{\epsilon_{ij}}_{\text{Gamma}}$$
where $E(\epsilon_{ij}) = 1$ and $\mathrm{var}(\epsilon_{ij}) = 1/\phi$,
$$E(Y_{ij} \mid x_{ij}) = \mu_{ij} = \beta_0 + \beta_1 x_{ij}$$
and
$$P(Y_{ij} = y) = \frac{\Gamma(y + \phi)}{y!\,\Gamma(\phi)} \left(\frac{\phi}{\phi + \mu_{ij}}\right)^{\phi} \left(\frac{\mu_{ij}}{\phi + \mu_{ij}}\right)^{y}$$
Fit Statistics & Parameter Estimates

df = 289 for all of these

    Dist      Link       G^2    X^2    X^2/df    AIC    BIC
    Poisson   log
    NegBin    log
    NegBin    Identity

                 Log Link                     Identity Link
    Parm     est.   se   Wald   p         est.   se   Wald   p
    β0                          <.01                         <.01
    β1       1.09               <.01      3.07               <.01
    1/φ

For interpretation, a one-unit change in the bully scale leads to exp(1.09) = 2.98 times as many nominations (log link) or 3.07 more nominations (identity link).
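The identity-link fit can be requested in PROC GENMOD by changing only the `link=` option of the log-link call. This is a sketch; the output data set name `genmodnbid` and the variable name `nbidfit` are hypothetical.

```sas
/* Negative binomial regression with an identity link (sketch). */
/* Output data set and variable names are illustrative only.    */
proc genmod data=bullynom;
  model bullynm = bullysc / link=identity dist=negbin type3;
  output out=genmodnbid pred=nbidfit;
  title1 'Negative Binomial with Identity Link';
run;
```

With an identity link nothing forces the fitted mean to stay positive, so it is worth checking the predicted values at the low end of the bully scale.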
[Figure: Fit of Negative Binomial Model w/ Identity]
[Figure: Fit of Marginal Distribution w/ Identity]
The bully scale can reasonably be used in lieu of the peer nominations. Support for this comes from
- The similarity of the marginal distributions for the two measures (both positively skewed).
- Goodness of fit of the negative binomial regression with identity link function.

Qualifications (i.e., more to be done):
- Add in other variables known to be related to bullying (e.g., gender) to try to account for extra variability (i.e., systematic vs random).
- More modeling that takes into account peer groupings (i.e., see whether there are errors or systematic differences between peer groups).
Models for situations where there might be two underlying types or groups: one group that follows the regression model and the other that just gives 0's.

Recommended supplemental reading:
- Long, J.S. (1997). Regression Models for Categorical and Limited Dependent Variables.
- Erdman, D., Jackson, L., & Sinko, A. (2008). Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure (Paper ). SAS Institute Inc., Cary, NC. PROC COUNTREG is in SAS v9.2.
- Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. NY: Chapman & Hall.
Basic Zero Inflated Model (e.g., ZIP)

The basic model is essentially a latent class type model of the form
$$P(Y_{ij} = y \mid x_{ij}) = \begin{cases} \pi + (1-\pi)\,P(0 \mid x_{ij}) & \text{for } y = 0 \\ (1-\pi)\,P(y \mid x_{ij}) & \text{for } y > 0 \end{cases}$$
where
- $\pi$ = the probability of being in the zero-only type or class.
- $P(0 \mid x_{ij})$ and $P(y \mid x_{ij})$ are based on some model, such as Poisson or Negative Binomial regression.

The ZIP model is a zero-inflated Poisson, usually with a log link:
$$P(Y_{ij} = y \mid x_{ij}) = \begin{cases} \pi + (1-\pi)\exp(-\mu_{ij}) & \text{for } y = 0 \\ (1-\pi)\,\dfrac{\exp(-\mu_{ij})\,\mu_{ij}^{y}}{y!} & \text{for } y > 0 \end{cases}$$
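To see what the mixture does to the probabilities, the ZIP formula can be tabulated in a DATA step. The values of `pi_` and `mu` below are illustrative assumptions, not estimates from the slides, and the data set name `zip_pmf` is hypothetical.

```sas
/* Tabulate ZIP probabilities next to plain Poisson probabilities. */
/* pi_ and mu are illustrative values only (assumptions).          */
data zip_pmf;
  pi_ = 0.3;   /* probability of the zero-only class   */
  mu  = 2.0;   /* Poisson mean for the remaining class */
  do y = 0 to 10;
    pois = pdf('POISSON', y, mu);             /* ordinary Poisson P(Y = y) */
    if y = 0 then p = pi_ + (1 - pi_)*pois;   /* inflated zero             */
    else p = (1 - pi_)*pois;                  /* scaled-down positives     */
    output;
  end;
run;

proc print data=zip_pmf noobs;
  var y pois p;
run;
```

Comparing the `pois` and `p` columns shows the extra mass at zero and the proportional shrinkage of every positive count.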
ZIP Model (continued)

$$P(Y_{ij} = y \mid x_{ij}) = \begin{cases} \pi + (1-\pi)\exp(-\mu_{ij}) & \text{for } y = 0 \\ (1-\pi)\,\dfrac{\exp(-\mu_{ij})\,\mu_{ij}^{y}}{y!} & \text{for } y > 0 \end{cases}$$

Mean:
$$E(Y_{ij} \mid x_{ij}) = (0 \times \pi) + \mu_{ij}(1-\pi) = \mu_{ij} - \mu_{ij}\pi$$
Variance:
$$\mathrm{var}(Y_{ij} \mid x_{ij}) = \mu_{ij}(1-\pi)(1 + \mu_{ij}\pi)$$

Note that if $\pi = 0$, we simply have a standard Poisson regression with log link.

The ZIP model can be extended by noting that class membership is dichotomous, so we can do a logistic regression (or other model for binary data) on the probability of class membership, e.g., a logit model,
$$\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = \gamma_0 + \gamma_1 z_{1ij} + \cdots + \gamma_q z_{qij}$$
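The variance expression can be checked from the second moment. A short derivation, writing $\mu$ for $\mu_{ij}$ and using the fact that a Poisson variable with mean $\mu$ has second moment $\mu + \mu^{2}$:

```latex
\begin{aligned}
E(Y^{2}) &= \pi \cdot 0 + (1-\pi)(\mu + \mu^{2}), \\
\mathrm{var}(Y) &= E(Y^{2}) - [E(Y)]^{2}
   = (1-\pi)(\mu + \mu^{2}) - (1-\pi)^{2}\mu^{2} \\
  &= (1-\pi)\,\mu\,\big[1 + \mu - (1-\pi)\mu\big]
   = \mu(1-\pi)(1 + \mu\pi).
\end{aligned}
```

Because $1 + \mu\pi > 1$ whenever $\pi > 0$, the variance exceeds the mean $\mu(1-\pi)$, so zero inflation by itself produces overdispersion.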
[Figure: ZIP and Bully Nominations]
Extending the ZIP

Since class membership is dichotomous, we can do a logistic regression (or other model for binary data) on the probability of class membership. For example,
$$\log\left(\frac{\pi_{i}}{1-\pi_{i}}\right) = \gamma_0 + \gamma_1 z_{1i} + \cdots + \gamma_q z_{qi}$$
For our bully nominations, we could try
$$\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = \gamma_0 + \gamma_1(\text{bully scale})_{ij}$$

A comparison of how well various ZIP models fit the data:

    Dist.   Link     Model for π    df    G^2    X^2    AIC    BIC
    Poi     log      none
    Poi     log      logit
    Poi     Ident    logit
ZIP Model Parameter Estimates and How to Interpret Them

                 ZIP w/o model for π          ZIP with logit model for π
    parm     est    se   Wald   p         est    se   Wald   p
    β0                          <.01                         <.01
    β1       0.60               <.01      0.60               <.01
    γ0       0.21
    γ1                                                       <.01

ZIP w/o model for π: exp(0.60) = 1.82 and
$$\hat{\pi} = \frac{\exp(0.21)}{1 + \exp(0.21)} = .55$$
ZIP with logit model for π: exp(0.60) = 1.82 and
$$\hat{\pi}_{ij} = \frac{\exp(\hat{\gamma}_0 + \hat{\gamma}_1(\text{bullysc})_{ij})}{1 + \exp(\hat{\gamma}_0 + \hat{\gamma}_1(\text{bullysc})_{ij})}$$
Note that exp(.77) =
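The two back-transformations for the ZIP without a model for π can be reproduced in a DATA step. This sketch uses only the estimates quoted on the slide (0.60 and 0.21); the variable names are hypothetical.

```sas
/* Back-transform the ZIP estimates quoted on the slide. */
data _null_;
  beta1  = 0.60;   /* slope on the log scale for the Poisson part */
  gamma0 = 0.21;   /* intercept on the logit scale for pi         */
  rate_ratio = exp(beta1);                      /* multiplicative effect */
  pi_hat     = exp(gamma0) / (1 + exp(gamma0)); /* inverse logit         */
  put rate_ratio= 5.2 pi_hat= 5.2;
run;
```

The log should show a rate ratio of 1.82 and an inflation probability of 0.55, matching the slide.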
[Figure: ZIP w/ logit model and Bully Nominations]
[Figures: Comparing all Fitted]
Mix and Match
- You can also have a zero-inflated Negative Binomial model.
- You can specify a model other than logit for the mixing probability.
- You can do all this as a multi-level (random effects) model.

In SAS:
- If you use v 9.1, to fit a ZIP you have to use PROC NLMIXED or PROC GENMOD with programming statements.
- If you use v 9.2, a ZIP can be fit easily using PROC GENMOD.
- For a zero-inflated Negative Binomial you can use PROC NLMIXED.
- v 9.2 has PROC COUNTREG in ETS; however, it doesn't appear to be in the version that I have. Documentation on it can be found at default/countreg toc.htm
These work for v 9.1 and beyond:

    /* Poisson Regression */
    proc genmod data=bullynom;
      model bullynm = bullysc / link=log dist=poi type3;
      output out=genmodpoi pred=fitpoi upper=uppoi lower=lopoi stdreschi=res_poi;
      title1 'Poisson Regression';
    run;

    /* Negative Binomial Regression */
    proc genmod data=bullynom;
      model bullynm = bullysc / link=log dist=negbin type3;
      output out=genmodnb pred=nbfit upper=nbup lower=nblo stdreschi=res_negbin;
      title1 'Negative Binomial';
    run;
These work for v 9.2 and beyond:

    /* Zero Inflated Poisson Regression w/o model for inflation probability */
    proc genmod data=bullynom;
      model bullynm = bullysc / link=log dist=zip type3 obstats;
      zeromodel / link=logit;
      output out=zip1 pred=zipfit1;
      title1 'Zero Inflated Poisson';
    run;

    /* Zero Inflated Poisson Regression w/ logit model for inflation probability */
    proc genmod data=bullynom;
      model bullynm = bullysc / link=log dist=zip type3;
      zeromodel bullysc / link=logit;
      output out=zip2 pred=zipfit2;
      title1 'ZIP w/ logit model for inflation probability';
    run;
SAS v 9.1 using NLMIXED

    proc nlmixed data=bullynom;
      /* Some starting values */
      parms beta0= beta1= a0=1;
      /* linear predictor for the inflation probability */
      linpinf = a0 + a1*bullysc;
      /* infprob = inflation probability for zeros    */
      /*         = logistic transform of the linear predictor */
      infprob = 1/(1+exp(-linpinf));
      /* Poisson mean */
      mu = exp(beta0 + beta1*bullysc);
      /* Build the ZIP log likelihood */
      if bullynm=0 then
        ll = log(infprob + (1-infprob)*exp(-mu));
      else
        ll = log(1-infprob) - mu + bullynm*log(mu) - lgamma(bullynm + 1);
      model bullynm ~ general(ll);
      title 'Zero Inflated Poisson regression';
    run;

SAS demo...
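The slides mention that a zero-inflated Negative Binomial can be fit with PROC NLMIXED but do not show code. The following is a minimal sketch, not the author's program: it assumes the same logit model for the inflation probability as the ZIP code above and the negative binomial parameterization with dispersion φ given earlier. All starting values are placeholder assumptions and may need tuning for convergence.

```sas
/* Zero-inflated negative binomial via a general log likelihood (sketch). */
/* Starting values are placeholders, not values from the slides.          */
proc nlmixed data=bullynom;
  parms beta0=0 beta1=0 a0=0 a1=0 phi=1;
  bounds phi > 0;
  /* inflation probability: logistic transform of its linear predictor */
  linpinf = a0 + a1*bullysc;
  infprob = 1/(1+exp(-linpinf));
  /* negative binomial mean with a log link */
  mu = exp(beta0 + beta1*bullysc);
  /* log of the negative binomial probability of a zero */
  logp0 = phi*log(phi/(phi+mu));
  if bullynm=0 then
    ll = log(infprob + (1-infprob)*exp(logp0));
  else
    ll = log(1-infprob)
         + lgamma(bullynm+phi) - lgamma(bullynm+1) - lgamma(phi)
         + phi*log(phi/(phi+mu)) + bullynm*log(mu/(phi+mu));
  model bullynm ~ general(ll);
  title 'Zero Inflated Negative Binomial regression (sketch)';
run;
```

The only change from the ZIP log likelihood is that the Poisson terms are replaced by the corresponding negative binomial terms, which adds one parameter, `phi`.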
More informationNon-informative Priors Multiparameter Models
Non-informative Priors Multiparameter Models Statistics 220 Spring 2005 Copyright c 2005 by Mark E. Irwin Prior Types Informative vs Non-informative There has been a desire for a prior distributions that
More informationLocal Maxima in the Estimation of the ZINB and Sample Selection models
1 Local Maxima in the Estimation of the ZINB and Sample Selection models J.M.C. Santos Silva School of Economics, University of Surrey 23rd London Stata Users Group Meeting 7 September 2017 2 1. Introduction
More informationSAS/STAT 14.1 User s Guide. The HPFMM Procedure
SAS/STAT 14.1 User s Guide The HPFMM Procedure This document is an individual chapter from SAS/STAT 14.1 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute Inc.
More informationNormal distribution Approximating binomial distribution by normal 2.10 Central Limit Theorem
1.1.2 Normal distribution 1.1.3 Approimating binomial distribution by normal 2.1 Central Limit Theorem Prof. Tesler Math 283 Fall 216 Prof. Tesler 1.1.2-3, 2.1 Normal distribution Math 283 / Fall 216 1
More informationProbits. Catalina Stefanescu, Vance W. Berger Scott Hershberger. Abstract
Probits Catalina Stefanescu, Vance W. Berger Scott Hershberger Abstract Probit models belong to the class of latent variable threshold models for analyzing binary data. They arise by assuming that the
More informationComparing Odds Ratios and Marginal Effects from Logistic Regression and Linear Probability Models
Western Kentucky University From the SelectedWorks of Matt Bogard Spring March 11, 2016 Comparing Odds Ratios and Marginal Effects from Logistic Regression and Linear Probability Models Matt Bogard Available
More informationMARGINALIZED TWO-PART MODELS FOR SEMICONTINUOUS DATA WITH APPLICATION TO MEDICAL COSTS. Valerie Anne Smith
MARGINALIZED TWO-PART MODELS FOR SEMICONTINUOUS DATA WITH APPLICATION TO MEDICAL COSTS Valerie Anne Smith A dissertation submitted to the faculty at the University of North Carolina at Chapel Hill in partial
More informationCHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES
Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical
More informationAssessment on Credit Risk of Real Estate Based on Logistic Regression Model
Assessment on Credit Risk of Real Estate Based on Logistic Regression Model Li Hongli 1, a, Song Liwei 2,b 1 Chongqing Engineering Polytechnic College, Chongqing400037, China 2 Division of Planning and
More informationCHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
More informationNon linearity issues in PD modelling. Amrita Juhi Lucas Klinkers
Non linearity issues in PD modelling Amrita Juhi Lucas Klinkers May 2017 Content Introduction Identifying non-linearity Causes of non-linearity Performance 2 Content Introduction Identifying non-linearity
More informationRandom Variables. Chapter 6: Random Variables 2/2/2014. Discrete and Continuous Random Variables. Transforming and Combining Random Variables
Chapter 6: Random Variables Section 6.3 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Random Variables 6.1 6.2 6.3 Discrete and Continuous Random Variables Transforming and Combining
More informationOutline. Review Continuation of exercises from last time
Bayesian Models II Outline Review Continuation of exercises from last time 2 Review of terms from last time Probability density function aka pdf or density Likelihood function aka likelihood Conditional
More informationQuantile Regression. By Luyang Fu, Ph. D., FCAS, State Auto Insurance Company Cheng-sheng Peter Wu, FCAS, ASA, MAAA, Deloitte Consulting
Quantile Regression By Luyang Fu, Ph. D., FCAS, State Auto Insurance Company Cheng-sheng Peter Wu, FCAS, ASA, MAAA, Deloitte Consulting Agenda Overview of Predictive Modeling for P&C Applications Quantile
More informationApplication of statistical methods in the determination of health loss distribution and health claims behaviour
Mathematical Statistics Stockholm University Application of statistical methods in the determination of health loss distribution and health claims behaviour Vasileios Keisoglou Examensarbete 2005:8 Postal
More informationLet us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.
Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are
More informationGLM III - The Matrix Reloaded
GLM III - The Matrix Reloaded Duncan Anderson, Serhat Guven 12 March 2013 2012 Towers Watson. All rights reserved. Agenda "Quadrant Saddles" The Tweedie Distribution "Emergent Interactions" Dispersion
More informationINSTITUTE AND FACULTY OF ACTUARIES. Curriculum 2019 SPECIMEN EXAMINATION
INSTITUTE AND FACULTY OF ACTUARIES Curriculum 2019 SPECIMEN EXAMINATION Subject CS1A Actuarial Statistics Time allowed: Three hours and fifteen minutes INSTRUCTIONS TO THE CANDIDATE 1. Enter all the candidate
More informationINDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -5 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.
INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Lecture -5 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Summary of the previous lecture Moments of a distribubon Measures of
More informationLecture Stat 302 Introduction to Probability - Slides 15
Lecture Stat 30 Introduction to Probability - Slides 15 AD March 010 AD () March 010 1 / 18 Continuous Random Variable Let X a (real-valued) continuous r.v.. It is characterized by its pdf f : R! [0, )
More informationLongitudinal Logistic Regression: Breastfeeding of Nepalese Children
Longitudinal Logistic Regression: Breastfeeding of Nepalese Children Scientific Question Determine whether the breastfeeding of Nepalese children varies with child age and/or sex of child. Data: Nepal
More informationDescription Remarks and examples References Also see
Title stata.com example 41g Two-level multinomial logistic regression (multilevel) Description Remarks and examples References Also see Description We demonstrate two-level multinomial logistic regression
More informationHomework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a
Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a Announcements: There are some office hour changes for Nov 5, 8, 9 on website Week 5 quiz begins after class today and ends at
More informationRandom Variables CHAPTER 6.3 BINOMIAL AND GEOMETRIC RANDOM VARIABLES
Random Variables CHAPTER 6.3 BINOMIAL AND GEOMETRIC RANDOM VARIABLES Essential Question How can I determine whether the conditions for using binomial random variables are met? Binomial Settings When the
More informationChapter 5: Statistical Inference (in General)
Chapter 5: Statistical Inference (in General) Shiwen Shen University of South Carolina 2016 Fall Section 003 1 / 17 Motivation In chapter 3, we learn the discrete probability distributions, including Bernoulli,
More informationThe University of Chicago, Booth School of Business Business 41202, Spring Quarter 2009, Mr. Ruey S. Tsay. Solutions to Final Exam
The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2009, Mr. Ruey S. Tsay Solutions to Final Exam Problem A: (42 pts) Answer briefly the following questions. 1. Questions
More information**BEGINNING OF EXAMINATION** A random sample of five observations from a population is:
**BEGINNING OF EXAMINATION** 1. You are given: (i) A random sample of five observations from a population is: 0.2 0.7 0.9 1.1 1.3 (ii) You use the Kolmogorov-Smirnov test for testing the null hypothesis,
More informationCSC 411: Lecture 08: Generative Models for Classification
CSC 411: Lecture 08: Generative Models for Classification Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 08-Generative Models 1 / 23 Today Classification
More informationChapter 6: Random Variables
Chapter 6: Random Variables Section 6.3 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Chapter 6 Random Variables 6.1 Discrete and Continuous Random Variables 6.2 Transforming and
More information9. Logit and Probit Models For Dichotomous Data
Sociology 740 John Fox Lecture Notes 9. Logit and Probit Models For Dichotomous Data Copyright 2014 by John Fox Logit and Probit Models for Dichotomous Responses 1 1. Goals: I To show how models similar
More informationStatistical Models of Stocks and Bonds. Zachary D Easterling: Department of Economics. The University of Akron
Statistical Models of Stocks and Bonds Zachary D Easterling: Department of Economics The University of Akron Abstract One of the key ideas in monetary economics is that the prices of investments tend to
More informationHidden Markov Regimes in Operational Loss Data
Hidden Markov Regimes in Operational Loss Data Georges Dionne and Samir Saissi Hassani Canada Research Chair in Risk Management HEC Montréal ABA Operational Risk Modeling Forum November 2-4, 2016 The Fairmont
More informationBivariate Birnbaum-Saunders Distribution
Department of Mathematics & Statistics Indian Institute of Technology Kanpur January 2nd. 2013 Outline 1 Collaborators 2 3 Birnbaum-Saunders Distribution: Introduction & Properties 4 5 Outline 1 Collaborators
More informationObtaining Predictive Distributions for Reserves Which Incorporate Expert Opinions R. Verrall A. Estimation of Policy Liabilities
Obtaining Predictive Distributions for Reserves Which Incorporate Expert Opinions R. Verrall A. Estimation of Policy Liabilities LEARNING OBJECTIVES 5. Describe the various sources of risk and uncertainty
More informationEconometric Methods for Valuation Analysis
Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 25 Outline We will consider econometric
More informationStatistics & Flood Frequency Chapter 3. Dr. Philip B. Bedient
Statistics & Flood Frequency Chapter 3 Dr. Philip B. Bedient Predicting FLOODS Flood Frequency Analysis n Statistical Methods to evaluate probability exceeding a particular outcome - P (X >20,000 cfs)
More informationEconometric Models of Expenditure
Econometric Models of Expenditure Benjamin M. Craig University of Arizona ISPOR Educational Teleconference October 28, 2005 1 Outline Overview of Expenditure Estimator Selection Two problems Two mistakes
More informationHigh-Frequency Data Analysis and Market Microstructure [Tsay (2005), chapter 5]
1 High-Frequency Data Analysis and Market Microstructure [Tsay (2005), chapter 5] High-frequency data have some unique characteristics that do not appear in lower frequencies. At this class we have: Nonsynchronous
More information