Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013

Size: px
Start display at page:

Download "Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013"

Transcription

1 Ordinal Multinomial Logistic Thom M. Suhy Southern Methodist University May14th, 2013

2 GLM Generalized Linear Model (GLM) Framework for statistical analysis (Gelman and Hill, 2007, p. 135) Linear Continuous data Logistic Binary data Ordered Multinomial Logistic Unordered Multinomial Logistic

3 Logistic Dependent variable is dichotomous Yes or No Apply or Not Apply Pass or Fail Heisman or no Heisman Probability of trait (yes, apply, pass, Heisman) based on independent variables Independent variable does not need to be dichotomous Categorical Integral Dichotomous Nominal Ordinal

4 Logistic Refresher Call: glm(formula = comply ~ physrec, family = binomial(link = "logit")) Deviance Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) e-06 *** physrec e-07 *** --- Signif. codes: 0 *** ** 0.01 * (Dispersion parameter for binomial family taken to be 1) Null deviance: on 163 degrees of freedom Residual deviance: on 162 degrees of freedom AIC: Number of Fisher Scoring iterations: 4

5 Logistic Refresher Formula for logit ## logit= (2.2882*physrec) ## ## logit = for no physrec ## ## logit =.4499 for yes physrec ## Probability to comply exp( )/(1+(exp( ))) Probability of comply with no physrec =.137 or 13.7% exp(.4499)/(1+(exp(.4499))) Probability of comply with physrec =.6106 or 61%

6 Logistic Refresher factor(comply) physrec

7 Logistic vs. Ordered Multinomial How are they different? An extension of logistic regression to multiple categories (Gelman & Hill, 2007) Not binary, categorical (but ordered) Decision (Yes, Maybe, No) Order of Finish (1 st, 2 nd, 3 rd ) Likert Scale (Strongly Disagree Strongly Agree) Income ranges (0 25K, 25K-50K, 50K+) Degree (None, Bachelors, Masters, PhD) There is unordered multinomial logistic regression, but that is not for today!

8 A Little More Information Ordinal multinomial logistic regression is an extension of logistic regression using multiple categories that have a logical order. (Gelman & Hill, 2007) Ordinal data are the most frequently encountered type of data in the social sciences (Johnson & Albert, 1999, p. 126).

9 Running a Model in R 1. We are going to use a file from UCLA, but first load your libraries : > library(psych) > library(arm) 2. Now we will read in our data: > suhy<- read.dta(url(" edu/stat/r/dae/ologit.dta"))

10 Running a Model in R 3. Let s examine our data: > head(suhy) apply pared public gpa 1 very likely somewhat likely unlikely somewhat likely somewhat likely unlikely

11 Running a Model in R Defining our variables: apply = Likelihood of college juniors applying to grad school. (Self-reported)(very likely, somewhat likely, unlikely) pared = Does at least one parent have a graduate degree? (no=0, yes=1) public = Undergrad was a private or public institution. (private = 0, public = 1) gpa = Undergrad grade point average

12 Running a Model in R What are our assumptions? Data are case specific iv has a single value for each case No perfect predictors no single predicator variable, iv can determine the outcome of the dv No zero or very small quantities in a crosstab cell Sample size larger than normal OLS regression

13 Running a Model in R 4. Let us check our assumptions: >xtabs(~suhy$pared+suhy$apply) suhy$pared unlikely somewhat likely very likely > xtabs(~suhy$public+suhy$apply) suhy$public unlikely somewhat likely very likely

14 Running a Model in R 5. We are good to go, let s run the model: > summary(m1<-bayespolr(as.ordered(suhy$apply)~suhy$gpa)) Call: bayespolr(formula = as.ordered(suhy$apply) ~ suhy$gpa) Coefficients: Value Std. Error t value suhy$gpa Intercepts: Value Std. Error t value unlikely somewhat likely somewhat likely very likely Residual Deviance: AIC:

15 Running a Model in R A visual: Thank you Pooja Shivraj (2012)

16 Running a Model in R 6. Lets calculate the probabilities for the average gpa > x<-mean(suhy$gpa) x = > coef<-m1$coef > coef suhy$gpa > intercept<-m1$zeta > intercept unlikely somewhat likely somewhat likely very likely

17 Running a Model in R 6. Let s calculate the probabilities for the average gpa (cont.) Remember: > prob<-function(input){exp(input)/(1+exp(input))} > (p0<-prob(intercept[1]-coef*x)) unlikely somewhat likely OR 55% > (p1<-prob(intercept[2]-coef*x)-p0) somewhat likely very likely OR 35% > (p2<-1-(p0+p1)) very likely OR 9% p0+p1+p2 always equal 1 when using 3 categories

18 Running a Model in R 2.5 GPA > (p0<-prob(intercept[1]-coef*2.5)) 3.7 GPA > (p0<-prob(intercept[1]-coef*3.7)) unlikely somewhat likely unlikely somewhat likely > (p1<-prob(intercept[2]-coef*2.5)-p0) > (p1<-prob(intercept[2]-coef*3.7)-p0) somewhat likely very likely somewhat likely very likely > (p2<-1-(p0+p1)) > (p2<-1-(p0+p1)) very likely very likely

19 Running a Model in R Now you tell me the probability for each category if you had a 4.0 GPA. > (p0<-prob(intercept[1]-coef*4.0)) unlikely somewhat likely = 37% > (p1<-prob(intercept[2]-coef*4.0)-p0) somewhat likely very likely = 44% > (p2<-1-(p0+p1)) very likely = 18%

20 Multiple Predictors 1. Let s look at a model with multiple predictors: > summary(m2<-bayespolr(as.ordered(suhy$apply)~suhy$gpa+suhy$pared+suhy$public)) Call: bayespolr(formula = as.ordered(suhy$apply) ~ suhy$gpa + suhy$pared + suhy$public) Coefficients: Value Std. Error t value suhy$gpa suhy$pared suhy$public Intercepts: Value Std. Error t value unlikely somewhat likely somewhat likely very likely Residual Deviance: AIC:

21 Multiple Predictors 2. Let s calculate the probabilities: >(coef<- m2$coef) suhy$gpa suhy$pared suhy$public > (intercept<-m2$zeta) unlikely somewhat likely somewhat likely very likely > mean(suhy$public) [1]

22 Multiple Predictors 2. Let s calculate the probabilities: (cont.) >(x1<-cbind(0:4, 0,.1425)) [,1] [,2] [,3] [1,] [2,] [3,] [4,] [5,] > (x2<-cbind(0:4, 1,.1425)) [,1] [,2] [,3] [1,] [2,] [3,] [4,] [5,]

23 Multiple Predictors Ordered Multinomial Logistic For pared = no (x1) > prob<-function(var){exp(var)/(1+exp(var))} > (p1<-prob(intercept[1]-x1 %*% coef)) [,1] [1,] [2,] [3,] [4,] [5,] > (p2<-prob(intercept[2]-x1 %*% coef)-p1) [,1] [1,] [2,] [3,] [4,] [5,] >p3<-1-(p1+p2) >p3 [,1] [1,] [2,] [3,] [4,] [5,] For pared = yes (x2) > prob<-function(var){exp(var)/(1+exp(var))} > (p4<-prob(intercept[1]-x2 %*% coef)) [,1] [1,] [2,] [3,] [4,] [5,] > (p5<-prob(intercept[2]-x2 %*% coef)-p4) [,1] [1,] [2,] [3,] [4,] [5,] > p6<-1-(p4+p5) > p6 [,1] [1,] [2,] [3,] [4,] [5,]

24 Graphing the Results >library(lattice) >Undergrad.GPA <-0:4 >plot(undergrad.gpa, p1, >type="l", col=1, ylim=c(0,1)) >lines(0:4, p2, col=2) >lines(0:4, p3, col=3) >lines(0:4, p4, col=1, lty = 2) >lines(0:4, p5, col=2, lty = 2) >lines(0:4, p6, col=3, lty = 2) >legend(1.5, 1, >legend=c("p(unlikely)", >"P(somewhat likely)", >"P(very likely)", "Line Type >when Pared = 0", >"Line Type when Pared = 1"), >col=c(1:3,1,1), >lty=c(1,1,1,1,2)) p P(unlikely) P(somewhat likely) P(very likely) Line Type when Pared = 0 Line Type when Pared = Undergrad.GPA

25 Why Not Linear The decision is not always black and white Large categories that are equally spaced could call for a simple linear model However, you must ALWAYS check your assumptions (Gelman &Hill, 2007)

26 Why Not Linear Here is why If we run our model using a simple linear model: >apply2<-as.numeric(suhy $apply) >m3<-lm(apply2~gpa, suhy) >summary(m3) Call: lm(formula = apply2 ~ gpa, data = suhy) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) ** gpa ** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 398 degrees of freedom Multiple R-squared: , Adjusted R- squared: F-statistic: on 1 and 398 DF, p-value: This is what we see when we check our assumptions

27 Why Not Linear Residuals Residuals vs Fitted Standardized residuals Scale-Location Fitted values Fitted values Normal Q-Q Residuals vs Leverage Standardized residuals Standardized residuals Cook's distance Theoretical Quantiles Leverage You Tell me..

28 References Gelman, A. & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. NewYork: Cambridge University Press. Hoelze, B. (2009). analysis with the ordinal multinomial logistic model [PowerPoint slides]. Retrieved from ordered_multinomial.pptx Johnson, V. E. & Albert, J. H. (1999). Statistics for the social sciences and public policy: Ordinal data modeling. New York: Springer. Shivraj, P. (2011). Ordered multinomial logistic regression analysis [PowerPoint slides]. Retrieved from Ordered_ML_Shivraj.pdf. UCLA: Academic Technology Services. (n.d.). Retrieved from

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS Pooja Shivraj Southern Methodist University KINDS OF REGRESSION ANALYSES Linear Regression Logistic Regression Dichotomous dependent variable (yes/no, died/

More information

Logistic Regression. Logistic Regression Theory

Logistic Regression. Logistic Regression Theory Logistic Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Logistic Regression The linear probability model.

More information

Multiple Regression and Logistic Regression II. Dajiang 525 Apr

Multiple Regression and Logistic Regression II. Dajiang 525 Apr Multiple Regression and Logistic Regression II Dajiang Liu @PHS 525 Apr-19-2016 Materials from Last Time Multiple regression model: Include multiple predictors in the model = + + + + How to interpret the

More information

############################ ### toxo.r ### ############################

############################ ### toxo.r ### ############################ ############################ ### toxo.r ### ############################ toxo < read.table(file="n:\\courses\\stat8620\\fall 08\\toxo.dat",header=T) #toxo < read.table(file="c:\\documents and Settings\\dhall\\My

More information

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings Dummy Variables A dummy variable or binary variable is a variable that takes on a value of 0 or 1 as an indicator that the observation has some kind of characteristic. Common examples: Sex (female): FEMALE=1

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Scott Creel Wednesday, September 10, 2014 This exercise extends the prior material on using the lm() function to fit an OLS regression and test hypotheses about effects on a parameter.

More information

Stat 401XV Exam 3 Spring 2017

Stat 401XV Exam 3 Spring 2017 Stat 40XV Exam Spring 07 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

Predicting Charitable Contributions

Predicting Charitable Contributions Predicting Charitable Contributions By Lauren Meyer Executive Summary Charitable contributions depend on many factors from financial security to personal characteristics. This report will focus on demographic

More information

> budworm$samplogit < log((budworm$y+0.5)/(budworm$m budworm$y+0.5))

> budworm$samplogit < log((budworm$y+0.5)/(budworm$m budworm$y+0.5)) budworm < read.table(file="n:\\courses\\stat8620\\fall 08\\budworm.dat",header=T) #budworm < read.table(file="c:\\documents and Settings\\dhall\\My Documents\\Dan's Work Stuff\\courses\\STAT8620\\Fall

More information

boxcox() returns the values of α and their loglikelihoods,

boxcox() returns the values of α and their loglikelihoods, Solutions to Selected Computer Lab Problems and Exercises in Chapter 11 of Statistics and Data Analysis for Financial Engineering, 2nd ed. by David Ruppert and David S. Matteson c 2016 David Ruppert and

More information

CREDIT RISK MODELING IN R. Logistic regression: introduction

CREDIT RISK MODELING IN R. Logistic regression: introduction CREDIT RISK MODELING IN R Logistic regression: introduction Final data structure > str(training_set) 'data.frame': 19394 obs. of 8 variables: $ loan_status : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1

More information

Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction

Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Negative Binomial Family Example: Absenteeism from

More information

Case Study: Applying Generalized Linear Models

Case Study: Applying Generalized Linear Models Case Study: Applying Generalized Linear Models Dr. Kempthorne May 12, 2016 Contents 1 Generalized Linear Models of Semi-Quantal Biological Assay Data 2 1.1 Coal miners Pneumoconiosis Data.................

More information

Logit Analysis. Using vttown.dta. Albert Satorra, UPF

Logit Analysis. Using vttown.dta. Albert Satorra, UPF Logit Analysis Using vttown.dta Logit Regression Odds ratio The most common way of interpreting a logit is to convert it to an odds ratio using the exp() function. One can convert back using the ln()

More information

Addiction - Multinomial Model

Addiction - Multinomial Model Addiction - Multinomial Model February 8, 2012 First the addiction data are loaded and attached. > library(catdata) > data(addiction) > attach(addiction) For the multinomial logit model the function multinom

More information

Bradley-Terry Models. Stat 557 Heike Hofmann

Bradley-Terry Models. Stat 557 Heike Hofmann Bradley-Terry Models Stat 557 Heike Hofmann Outline Definition: Bradley-Terry Fitting the model Extension: Order Effects Extension: Ordinal & Nominal Response Repeated Measures Bradley-Terry Model (1952)

More information

MCMC Package Example

MCMC Package Example MCMC Package Example Charles J. Geyer April 4, 2005 This is an example of using the mcmc package in R. The problem comes from a take-home question on a (take-home) PhD qualifying exam (School of Statistics,

More information

Using R to Create Synthetic Discrete Response Regression Models

Using R to Create Synthetic Discrete Response Regression Models Arizona State University From the SelectedWorks of Joseph M Hilbe July 3, 2011 Using R to Create Synthetic Discrete Response Regression Models Joseph Hilbe, Arizona State University Available at: https://works.bepress.com/joseph_hilbe/3/

More information

An Empirical Study on Default Factors for US Sub-prime Residential Loans

An Empirical Study on Default Factors for US Sub-prime Residential Loans An Empirical Study on Default Factors for US Sub-prime Residential Loans Kai-Jiun Chang, Ph.D. Candidate, National Taiwan University, Taiwan ABSTRACT This research aims to identify the loan characteristics

More information

Chapter 8 Exercises 1. Data Analysis & Graphics Using R Solutions to Exercises (May 1, 2010)

Chapter 8 Exercises 1. Data Analysis & Graphics Using R Solutions to Exercises (May 1, 2010) Chapter 8 Exercises 1 Data Analysis & Graphics Using R Solutions to Exercises (May 1, 2010) Preliminaries > library(daag) Exercise 1 The following table shows numbers of occasions when inhibition (i.e.,

More information

Categorical Outcomes. Statistical Modelling in Stata: Categorical Outcomes. R by C Table: Example. Nominal Outcomes. Mark Lunt.

Categorical Outcomes. Statistical Modelling in Stata: Categorical Outcomes. R by C Table: Example. Nominal Outcomes. Mark Lunt. Categorical Outcomes Statistical Modelling in Stata: Categorical Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Nominal Ordinal 28/11/2017 R by C Table: Example Categorical,

More information

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set.

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set. Step 1: Load the appropriate R package. You will need two libraries: nlme and lme4. Step 2: Fit a separate mixed model for each independence claim in the basis set. For instance, in Table 2 the first basis

More information

Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta)

Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta) Getting Started in Logit and Ordered Logit Regression (ver. 3. beta Oscar Torres-Reyna Data Consultant otorres@princeton.edu http://dss.princeton.edu/training/ Logit model Use logit models whenever your

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - IIIb Henrik Madsen March 18, 2012 Henrik Madsen () Chapman & Hall March 18, 2012 1 / 32 Examples Overdispersion and Offset!

More information

Logistic Regression with R: Example One

Logistic Regression with R: Example One Logistic Regression with R: Example One math = read.table("http://www.utstat.toronto.edu/~brunner/appliedf12/data/mathcat.data") math[1:5,] hsgpa hsengl hscalc course passed outcome 1 78.0 80 Yes Mainstrm

More information

Intro to GLM Day 2: GLM and Maximum Likelihood

Intro to GLM Day 2: GLM and Maximum Likelihood Intro to GLM Day 2: GLM and Maximum Likelihood Federico Vegetti Central European University ECPR Summer School in Methods and Techniques 1 / 32 Generalized Linear Modeling 3 steps of GLM 1. Specify the

More information

Random Effects ANOVA

Random Effects ANOVA Random Effects ANOVA Grant B. Morgan Baylor University This post contains code for conducting a random effects ANOVA. Make sure the following packages are installed: foreign, lme4, lsr, lattice. library(foreign)

More information

Non-linearities in Simple Regression

Non-linearities in Simple Regression Non-linearities in Simple Regression 1. Eample: Monthly Earnings and Years of Education In this tutorial, we will focus on an eample that eplores the relationship between total monthly earnings and years

More information

Study 2: data analysis. Example analysis using R

Study 2: data analysis. Example analysis using R Study 2: data analysis Example analysis using R Steps for data analysis Install software on your computer or locate computer with software (e.g., R, systat, SPSS) Prepare data for analysis Subjects (rows)

More information

book 2014/5/6 15:21 page 261 #285

book 2014/5/6 15:21 page 261 #285 book 2014/5/6 15:21 page 261 #285 Chapter 10 Simulation Simulations provide a powerful way to answer questions and explore properties of statistical estimators and procedures. In this chapter, we will

More information

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times. Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are

More information

sociology SO5032 Quantitative Research Methods Brendan Halpin, Sociology, University of Limerick Spring 2018 SO5032 Quantitative Research Methods

sociology SO5032 Quantitative Research Methods Brendan Halpin, Sociology, University of Limerick Spring 2018 SO5032 Quantitative Research Methods 1 SO5032 Quantitative Research Methods Brendan Halpin, Sociology, University of Limerick Spring 2018 Lecture 10: Multinomial regression baseline category extension of binary What if we have multiple possible

More information

Final Exam - section 1. Thursday, December hours, 30 minutes

Final Exam - section 1. Thursday, December hours, 30 minutes Econometrics, ECON312 San Francisco State University Michael Bar Fall 2013 Final Exam - section 1 Thursday, December 19 1 hours, 30 minutes Name: Instructions 1. This is closed book, closed notes exam.

More information

Regression and Simulation

Regression and Simulation Regression and Simulation This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged. A great way to learn a new language like this is to plunge right

More information

Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta)

Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta) Getting Started in Logit and Ordered Logit Regression (ver. 3. beta Oscar Torres-Reyna Data Consultant otorres@princeton.edu http://dss.princeton.edu/training/ Logit model Use logit models whenever your

More information

> attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount")

> attach(grocery) > boxplot(sales~discount, ylab=sales,xlab=discount) Example of More than 2 Categories, and Analysis of Covariance Example > attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount") Sales 160 200 240 > tapply(sales,discount,mean) 10.00% 15.00%

More information

Model fit assessment via marginal model plots

Model fit assessment via marginal model plots The Stata Journal (2010) 10, Number 2, pp. 215 225 Model fit assessment via marginal model plots Charles Lindsey Texas A & M University Department of Statistics College Station, TX lindseyc@stat.tamu.edu

More information

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2018

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2018 Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 3, 208 [This handout draws very heavily from Regression Models for Categorical

More information

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT)

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT) Regression Review and Robust Regression Slides prepared by Elizabeth Newton (MIT) S-Plus Oil City Data Frame Monthly Excess Returns of Oil City Petroleum, Inc. Stocks and the Market SUMMARY: The oilcity

More information

6 Multiple Regression

6 Multiple Regression More than one X variable. 6 Multiple Regression Why? Might be interested in more than one marginal effect Omitted Variable Bias (OVB) 6.1 and 6.2 House prices and OVB Should I build a fireplace? The following

More information

Credit Risk Modelling

Credit Risk Modelling Credit Risk Modelling Tiziano Bellini Università di Bologna December 13, 2013 Tiziano Bellini (Università di Bologna) Credit Risk Modelling December 13, 2013 1 / 55 Outline Framework Credit Risk Modelling

More information

Lecture 21: Logit Models for Multinomial Responses Continued

Lecture 21: Logit Models for Multinomial Responses Continued Lecture 21: Logit Models for Multinomial Responses Continued Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University

More information

Catherine De Vries, Spyros Kosmidis & Andreas Murr

Catherine De Vries, Spyros Kosmidis & Andreas Murr APPLIED STATISTICS FOR POLITICAL SCIENTISTS WEEK 8: DEPENDENT CATEGORICAL VARIABLES II Catherine De Vries, Spyros Kosmidis & Andreas Murr Topic: Logistic regression. Predicted probabilities. STATA commands

More information

Multiple regression - a brief introduction

Multiple regression - a brief introduction Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis These models are appropriate when the response

More information

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Final Exam

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Final Exam Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Final Exam GSB Honor Code: I pledge my honor that I have not violated the Honor Code during this

More information

Determining Probability Estimates From Logistic Regression Results Vartanian: SW 541

Determining Probability Estimates From Logistic Regression Results Vartanian: SW 541 Determining Probability Estimates From Logistic Regression Results Vartanian: SW 541 In determining logistic regression results, you will generally be given the odds ratio in the SPSS or SAS output. However,

More information

Didacticiel - Études de cas. In this tutorial, we show how to implement a multinomial logistic regression with TANAGRA.

Didacticiel - Études de cas. In this tutorial, we show how to implement a multinomial logistic regression with TANAGRA. Subject In this tutorial, we show how to implement a multinomial logistic regression with TANAGRA. Logistic regression is a technique for maing predictions when the dependent variable is a dichotomy, and

More information

[BINARY DEPENDENT VARIABLE ESTIMATION WITH STATA]

[BINARY DEPENDENT VARIABLE ESTIMATION WITH STATA] Tutorial #3 This example uses data in the file 16.09.2011.dta under Tutorial folder. It contains 753 observations from a sample PSID data on the labor force status of married women in the U.S in 1975.

More information

Comparing Odds Ratios and Marginal Effects from Logistic Regression and Linear Probability Models

Comparing Odds Ratios and Marginal Effects from Logistic Regression and Linear Probability Models Western Kentucky University From the SelectedWorks of Matt Bogard Spring March 11, 2016 Comparing Odds Ratios and Marginal Effects from Logistic Regression and Linear Probability Models Matt Bogard Available

More information

Module 9: Single-level and Multilevel Models for Ordinal Responses. Stata Practical 1

Module 9: Single-level and Multilevel Models for Ordinal Responses. Stata Practical 1 Module 9: Single-level and Multilevel Models for Ordinal Responses Pre-requisites Modules 5, 6 and 7 Stata Practical 1 George Leckie, Tim Morris & Fiona Steele Centre for Multilevel Modelling If you find

More information

Statistics 175 Applied Statistics Generalized Linear Models Jianqing Fan

Statistics 175 Applied Statistics Generalized Linear Models Jianqing Fan Statistics 175 Applied Statistics Generalized Linear Models Jianqing Fan Example 1 (Kyhposis data): (The data set kyphosis consists of measurements on 81 children following corrective spinal surgery. Variable

More information

STA 4504/5503 Sample questions for exam True-False questions.

STA 4504/5503 Sample questions for exam True-False questions. STA 4504/5503 Sample questions for exam 2 1. True-False questions. (a) For General Social Survey data on Y = political ideology (categories liberal, moderate, conservative), X 1 = gender (1 = female, 0

More information

Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017

Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 This is adapted heavily from Menard s Applied Logistic Regression

More information

SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA

SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA 1. CELL PHONES AND PROTEST The Afrobarometer survey asks whether respondents

More information

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Midterm GSB Honor Code: I pledge my honor that I have not violated the Honor Code during this examination.

More information

Modelling the potential human capital on the labor market using logistic regression in R

Modelling the potential human capital on the labor market using logistic regression in R Modelling the potential human capital on the labor market using logistic regression in R Ana-Maria Ciuhu (dobre.anamaria@hotmail.com) Institute of National Economy, Romanian Academy; National Institute

More information

AIC = Log likelihood = BIC =

AIC = Log likelihood = BIC = - log: /mnt/ide1/home/sschulh1/apc/apc_examplelog log type: text opened on: 21 Jul 2006, 18:08:20 *replicate table 5 and cols 7-9 of table 3 in Yang, Fu and Land (2004) *Stata can maximize GLM objective

More information

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017 Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 0, 207 [This handout draws very heavily from Regression Models for Categorical

More information

Projects for Bayesian Computation with R

Projects for Bayesian Computation with R Projects for Bayesian Computation with R Laura Vana & Kurt Hornik Winter Semeter 2018/2019 1 S&P Rating Data On the homepage of this course you can find a time series for Standard & Poors default data

More information

Generalized Multilevel Regression Example for a Binary Outcome

Generalized Multilevel Regression Example for a Binary Outcome Psy 510/610 Multilevel Regression, Spring 2017 1 HLM Generalized Multilevel Regression Example for a Binary Outcome Specifications for this Bernoulli HLM2 run Problem Title: no title The data source for

More information

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Seasonal Time Series: TS with periodic patterns and useful in predicting quarterly earnings pricing weather-related derivatives

More information

Milestone2. Zillow House Price Prediciton. Group: Lingzi Hong and Pranali Shetty

Milestone2. Zillow House Price Prediciton. Group: Lingzi Hong and Pranali Shetty Milestone2 Zillow House Price Prediciton Group Lingzi Hong and Pranali Shetty MILESTONE 2 REPORT Data Collection The following additional features were added 1. Population, Number of College Graduates

More information

Lapse Modeling for the Post-Level Period

Lapse Modeling for the Post-Level Period Lapse Modeling for the Post-Level Period A Practical Application of Predictive Modeling JANUARY 2015 SPONSORED BY Committee on Finance Research PREPARED BY Richard Xu, FSA, Ph.D. Dihui Lai, Ph.D. Minyu

More information

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS Answer all parts. Closed book, calculators allowed. It is important to show all working,

More information

Ordinal and categorical variables

Ordinal and categorical variables Ordinal and categorical variables Ben Bolker October 29, 2018 Licensed under the Creative Commons attribution-noncommercial license (http: //creativecommons.org/licenses/by-nc/3.0/). Please share & remix

More information

West Coast Stata Users Group Meeting, October 25, 2007

West Coast Stata Users Group Meeting, October 25, 2007 Estimating Heterogeneous Choice Models with Stata Richard Williams, Notre Dame Sociology, rwilliam@nd.edu oglm support page: http://www.nd.edu/~rwilliam/oglm/index.html West Coast Stata Users Group Meeting,

More information

Hierarchical Generalized Linear Models. Measurement Incorporated Hierarchical Linear Models Workshop

Hierarchical Generalized Linear Models. Measurement Incorporated Hierarchical Linear Models Workshop Hierarchical Generalized Linear Models Measurement Incorporated Hierarchical Linear Models Workshop Hierarchical Generalized Linear Models So now we are moving on to the more advanced type topics. To begin

More information

MODEL SELECTION CRITERIA IN R:

MODEL SELECTION CRITERIA IN R: 1. R 2 statistics We may use MODEL SELECTION CRITERIA IN R R 2 = SS R SS T = 1 SS Res SS T or R 2 Adj = 1 SS Res/(n p) SS T /(n 1) = 1 ( ) n 1 (1 R 2 ). n p where p is the total number of parameters. R

More information

9. Logit and Probit Models For Dichotomous Data

9. Logit and Probit Models For Dichotomous Data Sociology 740 John Fox Lecture Notes 9. Logit and Probit Models For Dichotomous Data Copyright 2014 by John Fox Logit and Probit Models for Dichotomous Responses 1 1. Goals: I To show how models similar

More information

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION XLSTAT makes accessible to anyone a powerful, complete and user-friendly data analysis and statistical solution. Accessibility to

More information

Logistic Regression Analysis

Logistic Regression Analysis Revised July 2018 Logistic Regression Analysis This set of notes shows how to use Stata to estimate a logistic regression equation. It assumes that you have set Stata up on your computer (see the Getting

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Ordinal Logistic Regression Dr. Tackett 11.27.2018 1 / 26 Announcements HW 8 due Thursday, 11/29 Lab 10 due Sunday, 12/2 Exam II, Thursday 12/6 2 / 26 Packages library(knitr)

More information

Session 5. Predictive Modeling in Life Insurance

Session 5. Predictive Modeling in Life Insurance SOA Predictive Analytics Seminar Hong Kong 29 Aug. 2018 Hong Kong Session 5 Predictive Modeling in Life Insurance Jingyi Zhang, Ph.D Predictive Modeling in Life Insurance JINGYI ZHANG PhD Scientist Global

More information

Lecture 10: Alternatives to OLS with limited dependent variables, part 1. PEA vs APE Logit/Probit

Lecture 10: Alternatives to OLS with limited dependent variables, part 1. PEA vs APE Logit/Probit Lecture 10: Alternatives to OLS with limited dependent variables, part 1 PEA vs APE Logit/Probit PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

The SAS System 11:03 Monday, November 11,

The SAS System 11:03 Monday, November 11, The SAS System 11:3 Monday, November 11, 213 1 The CONTENTS Procedure Data Set Name BIO.AUTO_PREMIUMS Observations 5 Member Type DATA Variables 3 Engine V9 Indexes Created Monday, November 11, 213 11:4:19

More information

Sociology Exam 3 Answer Key - DRAFT May 8, 2007

Sociology Exam 3 Answer Key - DRAFT May 8, 2007 Sociology 63993 Exam 3 Answer Key - DRAFT May 8, 2007 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. The odds of an event occurring

More information

NHY examples. Bernt Arne Ødegaard. 23 November Estimating dividend growth in Norsk Hydro 8

NHY examples. Bernt Arne Ødegaard. 23 November Estimating dividend growth in Norsk Hydro 8 NHY examples Bernt Arne Ødegaard 23 November 2017 Abstract Finance examples using equity data for Norsk Hydro (NHY) Contents 1 Calculating Beta 4 2 Cost of Capital 7 3 Estimating dividend growth in Norsk

More information

Using New SAS 9.4 Features for Cumulative Logit Models with Partial Proportional Odds Paul J. Hilliard, Educational Testing Service (ETS)

Using New SAS 9.4 Features for Cumulative Logit Models with Partial Proportional Odds Paul J. Hilliard, Educational Testing Service (ETS) Using New SAS 9.4 Features for Cumulative Logit Models with Partial Proportional Odds Using New SAS 9.4 Features for Cumulative Logit Models with Partial Proportional Odds INTRODUCTION Multicategory Logit

More information

proc genmod; model malform/total = alcohol / dist=bin link=identity obstats; title 'Table 2.7'; title2 'Identity Link';

proc genmod; model malform/total = alcohol / dist=bin link=identity obstats; title 'Table 2.7'; title2 'Identity Link'; BIOS 6244 Analysis of Categorical Data Assignment 5 s 1. Consider Exercise 4.4, p. 98. (i) Write the SAS code, including the DATA step, to fit the linear probability model and the logit model to the data

More information

A Comparison of Univariate Probit and Logit. Models Using Simulation

A Comparison of Univariate Probit and Logit. Models Using Simulation Applied Mathematical Sciences, Vol. 12, 2018, no. 4, 185-204 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ams.2018.818 A Comparison of Univariate Probit and Logit Models Using Simulation Abeer

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #6 EPSY 905: Maximum Likelihood In This Lecture The basics of maximum likelihood estimation Ø The engine that

More information

Keywords Akiake Information criterion, Automobile, Bonus-Malus, Exponential family, Linear regression, Residuals, Scaled deviance. I.

Keywords Akiake Information criterion, Automobile, Bonus-Malus, Exponential family, Linear regression, Residuals, Scaled deviance. I. Application of the Generalized Linear Models in Actuarial Framework BY MURWAN H. M. A. SIDDIG School of Mathematics, Faculty of Engineering Physical Science, The University of Manchester, Oxford Road,

More information

David Tenenbaum GEOG 090 UNC-CH Spring 2005

David Tenenbaum GEOG 090 UNC-CH Spring 2005 Simple Descriptive Statistics Review and Examples You will likely make use of all three measures of central tendency (mode, median, and mean), as well as some key measures of dispersion (standard deviation,

More information

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998 Economics 312 Sample Project Report Jeffrey Parker Introduction This project is based on Exercise 2.12 on page 81 of the Hill, Griffiths, and Lim text. It examines how the sale price of houses in Stockton,

More information

Lecture Note: Analysis of Financial Time Series Spring 2008, Ruey S. Tsay. Seasonal Time Series: TS with periodic patterns and useful in

Lecture Note: Analysis of Financial Time Series Spring 2008, Ruey S. Tsay. Seasonal Time Series: TS with periodic patterns and useful in Lecture Note: Analysis of Financial Time Series Spring 2008, Ruey S. Tsay Seasonal Time Series: TS with periodic patterns and useful in predicting quarterly earnings pricing weather-related derivatives

More information

Master Degree Exit Interview Computer Science

Master Degree Exit Interview Computer Science Polytechnic University of Puerto Rico Master Degree Exit Interview Computer Science 2014 Graduate School, Coop Program and Institutional Research Office 1 2 Sample Sample 2014 Population 11 Sample 8 Percent

More information

Duration Models: Parametric Models

Duration Models: Parametric Models Duration Models: Parametric Models Brad 1 1 Department of Political Science University of California, Davis January 28, 2011 Parametric Models Some Motivation for Parametrics Consider the hazard rate:

More information

Stat 328, Summer 2005

Stat 328, Summer 2005 Stat 328, Summer 2005 Exam #2, 6/18/05 Name (print) UnivID I have neither given nor received any unauthorized aid in completing this exam. Signed Answer each question completely showing your work where

More information

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2013, Mr. Ruey S. Tsay. Final Exam

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2013, Mr. Ruey S. Tsay. Final Exam The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2013, Mr. Ruey S. Tsay Final Exam Booth Honor Code: I pledge my honor that I have not violated the Honor Code during this

More information

Calculating the Probabilities of Member Engagement

Calculating the Probabilities of Member Engagement Calculating the Probabilities of Member Engagement by Larry J. Seibert, Ph.D. Binary logistic regression is a regression technique that is used to calculate the probability of an outcome when there are

More information

Building and Checking Survival Models

Building and Checking Survival Models Building and Checking Survival Models David M. Rocke May 23, 2017 David M. Rocke Building and Checking Survival Models May 23, 2017 1 / 53 hodg Lymphoma Data Set from KMsurv This data set consists of information

More information

Statistical Models of Stocks and Bonds. Zachary D Easterling: Department of Economics. The University of Akron

Statistical Models of Stocks and Bonds. Zachary D Easterling: Department of Economics. The University of Akron Statistical Models of Stocks and Bonds Zachary D Easterling: Department of Economics The University of Akron Abstract One of the key ideas in monetary economics is that the prices of investments tend to

More information

The Multivariate Regression Model

The Multivariate Regression Model The Multivariate Regression Model Example Determinants of College GPA Sample of 4 Freshman Collect data on College GPA (4.0 scale) Look at importance of ACT Consider the following model CGPA ACT i 0 i

More information

Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases.

Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases. Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases. Goal: Find unusual cases that might be mistakes, or that might

More information

Girma Tefera*, Legesse Negash and Solomon Buke. Department of Statistics, College of Natural Science, Jimma University. Ethiopia.

Girma Tefera*, Legesse Negash and Solomon Buke. Department of Statistics, College of Natural Science, Jimma University. Ethiopia. Vol. 5(2), pp. 15-21, July, 2014 DOI: 10.5897/IJSTER2013.0227 Article Number: C81977845738 ISSN 2141-6559 Copyright 2014 Author(s) retain the copyright of this article http://www.academicjournals.org/ijster

More information

Econ 371 Problem Set #4 Answer Sheet. 6.2 This question asks you to use the results from column (1) in the table on page 213.

Econ 371 Problem Set #4 Answer Sheet. 6.2 This question asks you to use the results from column (1) in the table on page 213. Econ 371 Problem Set #4 Answer Sheet 6.2 This question asks you to use the results from column (1) in the table on page 213. a. The first part of this question asks whether workers with college degrees

More information

2016 FACULTY SALARY EQUITY ANALYSIS

2016 FACULTY SALARY EQUITY ANALYSIS 2016 FACULTY SALARY EQUITY ANALYSIS UNIVERSITY OF CALIFORNIA, SANTA BARBARA OFFICE OF THE EXECUTIVE VICE CHANCELLOR & THE FACULTY SALARY EQUITY STUDY COMMITTEE APRIL 2017 INTRODUCTION This report contains

More information

tm / / / / / / / / / / / / Statistics/Data Analysis User: Klick Project: Limited Dependent Variables{space -6}

tm / / / / / / / / / / / / Statistics/Data Analysis User: Klick Project: Limited Dependent Variables{space -6} PS 4 Monday August 16 01:00:42 2010 Page 1 tm / / / / / / / / / / / / Statistics/Data Analysis User: Klick Project: Limited Dependent Variables{space -6} log: C:\web\PS4log.smcl log type: smcl opened on:

More information

Statistics & Statistical Tests: Assumptions & Conclusions

Statistics & Statistical Tests: Assumptions & Conclusions Degrees of Freedom Statistics & Statistical Tests: Assumptions & Conclusions Kinds of degrees of freedom Kinds of Distributions Kinds of Statistics & assumptions required to perform each Normal Distributions

More information