############################ ### toxo.r ### ############################


> toxo <- read.table(file="n:\\courses\\stat8620\\fall 08\\toxo.dat",header=T)
> #toxo <- read.table(file="c:\\documents and Settings\\dhall\\My Documents\\Dan's Work Stuff\\courses\\STAT8620\\Fall 08\\toxo.dat",header=T)
> toxo$rain1000 <- toxo$rainf/1000
> toxo$ypos <- round(toxo$ppos*toxo$n)
> toxo$yneg <- toxo$n-toxo$ypos
> toxo$samplogit <- log((toxo$ypos+0.5)/(toxo$n-toxo$ypos+0.5))
> toxo[1:3,]
  rainf  ppos  n rain1000 ypos yneg  samplogit
1  1735 0.500  4    1.735    2    2  0.0000000
2  1800 0.600  5    1.800    3    2  0.3364722
3  2050 0.292 24    2.050    7   17 -0.8472979
> plot(toxo$rain1000,toxo$ppos,main="prop positive versus rainfall (in 1000's)")
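The samplogit column computed above is the empirical log odds with 0.5 added to both counts, which keeps the value finite when a group has no positives or no negatives. A minimal sketch that recomputes it for the three rows printed by toxo[1:3,], with the counts typed in by hand rather than read from toxo.dat (the zero-count case at the end is a hypothetical illustration, not a row of the data):

```r
# Counts copied from the three rows printed by toxo[1:3,] above
ypos <- c(2, 3, 7)    # number positive
n    <- c(4, 5, 24)   # number tested

# Empirical logit with the 0.5 correction used in toxo.r
samplogit <- log((ypos + 0.5) / (n - ypos + 0.5))
print(round(samplogit, 7))

# Without the correction the log odds are infinite whenever ypos is 0 or n.
# Hypothetical group with 0 positives out of 4 (not taken from the data):
raw.logit <- log(0 / (4 - 0))
cor.logit <- log((0 + 0.5) / (4 - 0 + 0.5))
print(c(raw.logit, cor.logit))
```

The corrected value stays finite, so every point can be placed on the log-odds scatterplot.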

> plot(toxo$rain1000,toxo$samplogit,main="samp log odds positive versus rainfall (in 1000's)")
> m1 <- glm(cbind(ypos,yneg)~poly(rain1000,5),data=toxo,
+           family=binomial(link="logit"))
> summary(m1)

Call:
glm(formula = cbind(ypos, yneg) ~ poly(rain1000, 5), family = binomial(link = "logit"),
    data = toxo)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.9829  -1.2096  -0.4572   0.4160   2.8846

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)         0.02505    0.07709   0.325  0.74524
poly(rain1000, 5)1  0.24223    0.48608   0.498  0.61825
poly(rain1000, 5)2  0.23450    0.49023   0.478  0.63240
poly(rain1000, 5)3  1.46167    0.43170   3.386  0.00071 ***
poly(rain1000, 5)4  0.23823    0.47500   0.502  0.61599
poly(rain1000, 5)5  0.51553    0.46234   1.115  0.26484
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 74.212  on 33  degrees of freedom
Residual deviance: 61.196  on 28  degrees of freedom
AIC: 163.89

Number of Fisher Scoring iterations: 3

> m2 <- glm(cbind(ypos,yneg)~poly(rain1000,3),data=toxo,
+           family=binomial(link="logit"))
> summary(m2)

Call:
glm(formula = cbind(ypos, yneg) ~ poly(rain1000, 3), family = binomial(link = "logit"),
    data = toxo)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7620  -1.2166  -0.5079   0.3538   2.6204

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)         0.02427    0.07693   0.315 0.752401
poly(rain1000, 3)1  0.08606    0.45870   0.188 0.851172
poly(rain1000, 3)2  0.19269    0.46739   0.412 0.680141
poly(rain1000, 3)3  1.37875    0.41150   3.351 0.000806 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 74.212  on 33  degrees of freedom
Residual deviance: 62.635  on 30  degrees of freedom
AIC: 161.33

Number of Fisher Scoring iterations: 3

> anova(m2,m1,test="Chisq")
Analysis of Deviance Table

Model 1: cbind(ypos, yneg) ~ poly(rain1000, 3)
Model 2: cbind(ypos, yneg) ~ poly(rain1000, 5)
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        30     62.635
2        28     61.196  2    1.438     0.487
> m0 <- glm(cbind(ypos,yneg)~1,data=toxo, family=binomial(link="logit"))
> anova(m0,m2,test="Chisq")
Analysis of Deviance Table

Model 1: cbind(ypos, yneg) ~ 1
Model 2: cbind(ypos, yneg) ~ poly(rain1000, 3)
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        33     74.212
2        30     62.635  3   11.577     0.009
> #deviance of model m2 is GOF statistic:
> deviance(m2)
[1] 62.6346
> #Pearson X^2 statistic:
> sum(resid(m2,type="pearson")^2)
[1] 58.21314
> m2q <- glm(cbind(ypos,yneg)~poly(rain1000,3),data=toxo,
+            family=quasibinomial(link="logit"))
> summary(m2q)

Call:
glm(formula = cbind(ypos, yneg) ~ poly(rain1000, 3), family = quasibinomial(link = "logit"),
    data = toxo)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7620  -1.2166  -0.5079   0.3538   2.6204

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         0.02427    0.10716   0.226   0.8224
poly(rain1000, 3)1  0.08606    0.63897   0.135   0.8938
poly(rain1000, 3)2  0.19269    0.65108   0.296   0.7693
poly(rain1000, 3)3  1.37875    0.57321   2.405   0.0225 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasibinomial family taken to be 1.940446)

    Null deviance: 74.212  on 33  degrees of freedom
Residual deviance: 62.635  on 30  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 3

> r0 <- seq(from=min(toxo$rain1000),to=max(toxo$rain1000),length=100)
> expit <- function(x) {1/(1+exp(-x)) }
> pred.m2 <- predict(m2,data.frame(rain1000=r0),se.fit=T,type="link")
> L <- expit(pred.m2$fit-1.96*pred.m2$se.fit)
> U <- expit(pred.m2$fit+1.96*pred.m2$se.fit)
> plot(toxo$rain1000,toxo$ppos,type="p",xlab="rainfall/1000",
+      ylab="prop positive for toxoplasmosis",
+      main="fitted probability from model m2")
> lines(r0,expit( pred.m2$fit ))
> lines(r0,L,lty=4)
> lines(r0,U,lty=4)
> legend(locator(1),lty=c(1,4),legend=c("fitted probability","approx 95% conf. limits"))
> pred.m2q <- predict(m2q,data.frame(rain1000=r0),se.fit=T,type="link")
> L <- expit(pred.m2q$fit-1.96*pred.m2q$se.fit)
> U <- expit(pred.m2q$fit+1.96*pred.m2q$se.fit)

> plot(toxo$rain1000,toxo$ppos,type="p",xlab="rainfall/1000",
+      ylab="prop positive for toxoplasmosis",
+      main="fitted probability from model m2q")
> lines(r0,expit( pred.m2q$fit ))
> lines(r0,L,lty=4)
> lines(r0,U,lty=4)
> legend(locator(1),lty=c(1,4),legend=c("fitted probability","approx 95% conf. limits"))
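
The quasibinomial fit m2q keeps the coefficients of m2 and multiplies each standard error by the square root of the estimated dispersion, which is the Pearson X^2 statistic divided by the residual degrees of freedom. A rough check of that relationship, assuming only the numbers printed in the output above (typed in by hand, so agreement is limited by rounding):

```r
# Values copied from the printed output above
X2     <- 58.21314   # Pearson X^2 statistic for m2
df.res <- 30         # residual degrees of freedom of m2
se.m2  <- c(0.07693, 0.45870, 0.46739, 0.41150)  # Std. Error, binomial fit m2
se.m2q <- c(0.10716, 0.63897, 0.65108, 0.57321)  # Std. Error, quasibinomial fit m2q

phi <- X2 / df.res        # moment estimate of the dispersion
print(phi)                # close to the 1.940446 reported by summary(m2q)
print(se.m2 * sqrt(phi))  # compare with se.m2q
print(se.m2q / se.m2)     # each ratio should be near sqrt(phi)

# With the fitted objects in the workspace, the same quantities would be
#   sum(resid(m2, type="pearson")^2) / df.residual(m2)
#   summary(m2q)$dispersion
```

Because the coefficients are unchanged, the fitted curve in the m2q plot is the same as in the m2 plot; only the approximate 95% limits are wider, by a factor of about 1.39 on the link scale.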