Jenn Selensky gathered data from students in an introduction to psychology course. The data are weights, sex/gender, and whether or not the student worked out in the gym. Here is the output from a 2 x 2 Sex x Gym ANOVA and other related analyses.

Proc Freq; Tables Sex*Gym / CHISQ nopercent nocol; run;

The FREQ Procedure

Table of Sex by Gym

Sex(Sex)      Gym(Gym)
Frequency
Row Pct           No      Yes    Total
Male              23       28       51
               45.10    54.90
Female            87       62      149
               58.39    41.61
Total            110       90      200

Statistics for Table of Sex by Gym

Statistic          DF     Value     Prob
Chi-Square          1    2.7119   0.0996
Phi Coefficient         -0.1164

Sample Size = 200

Chi-square and phi do not equal zero, so the cell sizes are not proportional and the ANOVA will be nonorthogonal. Men attended the gym at a higher rate (54.9%) than did women (41.6%).

Omnibus Analysis and Simple Main Effects Using Pooled Error

PROC GLM data=gym;
  CLASS Sex Gym;
  MODEL Weight=Sex|Gym / SS3 EFFECTSIZE alpha=0.1;
  means Sex|Gym;
  LSMEANS Sex*Gym / SLICE=Sex;
  LSMEANS Gym;
  LSMEANS Sex;
  title 'Omnibus Analysis and Simple Main Effects Using Pooled Error';
run; quit;

The GLM Procedure

Class Level Information
Class    Levels    Values
Sex           2    Female Male
Gym           2    No Yes

Number of Observations Read    200
Number of Observations Used    200
The GLM Procedure

Dependent Variable: Weight

Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                3        39240.8884     13080.2961      18.27    <.0001
Error              196       140350.5284       716.0741
Corrected Total    199       179591.4168

R-Square    Coeff Var    Root MSE    Weight Mean
0.218501     18.49364    26.75956       144.6960

Proportion of Variation Accounted for
Eta-Square                0.22
Omega-Square              0.21
90% Confidence Limits     (0.13,0.29)

Statistics associated with the total model (the combined effect of sex and gym) are generally ignored in factorial ANOVA.

Source     DF    Type III SS    Mean Square    F Value    Pr > F
Sex         1    30650.58425    30650.58425      42.80    <.0001
Gym         1     4645.08243     4645.08243       6.49    0.0116
Sex*Gym     1     3632.69319     3632.69319       5.07    0.0254

Total Variation Accounted For
Source     Eta-Square    Omega-Square    Conservative 90% Confidence Limits
Sex            0.1707          0.1660    0.0975    0.2472
Gym            0.0259          0.0218    0.0019    0.0722
Sex*Gym        0.0202          0.0162    0.0003    0.0632

Partial Variation Accounted For
Source     Partial Eta-Square    Partial Omega-Square    90% Confidence Limits
Sex                    0.1792                  0.1729    0.1034    0.2547
Gym                    0.0320                  0.0267    0.0039    0.0811
Sex*Gym                0.0252                  0.0200    0.0016    0.0708

All three effects are significant. Converting Cohen's benchmarks for r into proportions of variance (r²): .01 = small, .09 = medium, .25 = large.
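The eta-square values above can be checked by hand from the sums of squares. The DATA step below is a minimal sketch of that arithmetic. It assumes the usual definitions (eta-square = effect SS / corrected total SS; partial eta-square = effect SS / (effect SS + error SS)). The variable and array names are illustrative and not part of the original analysis.

```sas
/* Sketch: recompute eta-square and partial eta-square from the printed SS. */
/* Names (ss_total, ss_error, effect, ...) are illustrative assumptions.    */
data _null_;
   ss_total = 179591.4168;    /* corrected total sum of squares */
   ss_error = 140350.5284;    /* error sum of squares           */
   array ss{3} _temporary_ (30650.58425 4645.08243 3632.69319);
   array effect{3} $ 8 _temporary_ ('Sex' 'Gym' 'Sex*Gym');
   do i = 1 to 3;
      name         = effect{i};
      eta2         = ss{i} / ss_total;            /* share of all variation in weight   */
      partial_eta2 = ss{i} / (ss{i} + ss_error);  /* other effects dropped from the base */
      put name= eta2= 6.4 partial_eta2= 6.4;
   end;
run;
```

The values written to the log should agree with the Eta-Square and Partial Eta-Square columns to four decimals. The partial values are larger because their denominator excludes variation attributed to the other effects.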
The interaction is monotonic. The effect of gym is larger in men than in women.

The schematic plot above calls into doubt the normality assumption. Observation 133 has weight = 265 pounds. Warning: SAS observation numbers will not match SPSS case numbers. Furthermore, the SAS observation numbers may vary from one importation of the SPSS data to another.

Level of Sex      N    Weight Mean    Weight Std Dev
Female          149     137.110738        26.3248191
Male             51     166.856863        29.4648282

The men weigh significantly more (29.7 pound difference) than the women, d = 1.10.
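The value d = 1.10 can be reproduced from the means and standard deviations in the table above. The sketch below assumes that d was computed as the difference in observed means divided by the pooled within-sex standard deviation. The handout does not state which standardizer was used, so treat that as an assumption. Variable names are illustrative.

```sas
/* Sketch: Cohen's d for the sex difference, using a pooled within-group SD. */
/* Assumption: the handout's d uses this standardizer; names are made up.    */
data _null_;
   n_f = 149;  mean_f = 137.110738;  sd_f = 26.3248191;   /* women */
   n_m = 51;   mean_m = 166.856863;  sd_m = 29.4648282;   /* men   */
   pooled_sd = sqrt( ((n_f - 1)*sd_f**2 + (n_m - 1)*sd_m**2) / (n_f + n_m - 2) );
   d = (mean_m - mean_f) / pooled_sd;
   put pooled_sd= 8.3 d= 6.2;
run;
```

Under that assumption the pooled standard deviation is about 27.2 pounds and d rounds to 1.10, which matches the value reported above.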
Level of Gym      N    Weight Mean    Weight Std Dev
No              110     140.497273        29.2741626
Yes              90     149.827778        30.3299423

Those who attend gym weigh significantly more than those who do not, d = .36.

Level of Sex    Level of Gym     N    Weight Mean    Weight Std Dev
Female          No              87     136.574713        28.7853452
Female          Yes             62     137.862903        22.6317942
Male            No              23     155.334783        26.7556894
Male            Yes             28     176.321429        28.6085374
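With unequal cell sizes, the observed marginal means weight each cell by its n. The least squares (adjusted) marginal means for a two-factor model with interaction are instead the unweighted averages of the cell means. The DATA step below is a small sketch of that calculation using the four cell means above. The variable names are illustrative.

```sas
/* Sketch: least squares marginal means as unweighted averages of cell means. */
/* Cell means are copied from the Sex by Gym table; names are illustrative.   */
data _null_;
   f_no = 136.574713;   f_yes = 137.862903;   /* women: no gym, gym */
   m_no = 155.334783;   m_yes = 176.321429;   /* men:   no gym, gym */
   lsmean_female = mean(f_no, f_yes);
   lsmean_male   = mean(m_no, m_yes);
   lsmean_no     = mean(f_no, m_no);
   lsmean_yes    = mean(f_yes, m_yes);
   put lsmean_female= 10.4 lsmean_male= 10.4;
   put lsmean_no= 10.4 lsmean_yes= 10.4;
run;
```

Because women outnumber men roughly three to one, the unweighted averages give the men's cells more influence than they have in the observed marginal means for gym.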
The GLM Procedure
Least Squares Means

Sex       DF    Sum of Squares    Mean Square    F Value    Pr > F
Female     1         60.073774      60.073774       0.08    0.7724
Male       1       5561.625781    5561.625781       7.77    0.0058

The simple main effect of gym attendance is significant for the men but not for the women.

Total Variation Accounted For
Sex       Eta-Square    Omega-Square    Conservative 90% Confidence Limits
Female        0.0003         -0.0036    0.0000    0.0141
Male          0.0310          0.0269    0.0036    0.0800

Partial Variation Accounted For
Sex       Partial Eta-Square    Partial Omega-Square    90% Confidence Limits
Female                0.0004                 -0.0046    0.0000    0.0153
Male                  0.0381                  0.0327    0.0063    0.0899

The proportions of variance here have as the denominator all of the variance in weights, for both men and women. I think it makes more sense to use a denominator that is the within-sex total variance, which I compute below.

Least Squares Means
Gym    Weight LSMEAN    Observed Mean
No        145.954748       140.497273
Yes       157.092166       149.827778

Notice that the adjusted means differ by more (11.1) than do the observed means (9.3).

Least Squares Means
Sex       Weight LSMEAN    Observed Mean
Female       137.218808       137.110738
Male         165.828106       166.856863

Notice that the adjusted means differ by less (28.6) than do the observed means (29.7). The tests for the main effects here test the null that the adjusted means are equal, not the null that the unadjusted means are equal. A simple t test would test the null that the unadjusted means are equal.

Proc Sort; By Sex; run;
Proc GLM;
  Class Gym;
  Model Weight=Gym / SS1 EFFECTSIZE alpha=0.1;
  By Sex;
  title 'simple main effects, individual error';
run; quit;

simple main effects, individual error

The GLM Procedure

Sex=Male

Class Level Information
Class    Levels    Values
Gym           2    No Yes
Number of Observations Read    51
Number of Observations Used    51

Dependent Variable: Weight

Sex=Male

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1        5561.62578     5561.62578       7.20    0.0099
Error              49       37847.17932      772.39141
Corrected Total    50       43408.80510

R-Square    Coeff Var    Root MSE    Weight Mean
0.128122     16.65615    27.79193       166.8569

Source    DF      Type I SS    Mean Square    F Value    Pr > F
Gym        1    5561.625781    5561.625781       7.20    0.0099

Total Variation Accounted For
Source    Eta-Square    Omega-Square    Conservative 90% Confidence Limits
Gym           0.1281          0.1084    0.0179    0.2728

In the men, about 11% of the variance in weights is associated with gym usage (omega-square = .108; eta-square = .128).

Sex=Female

Class Level Information
Class    Levels    Values
Gym           2    No Yes

Number of Observations Read    149
Number of Observations Used    149

Dependent Variable: Weight

Sex=Female

Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                1           60.0738        60.0738       0.09    0.7695
Error              147       102503.3490       697.3017
Corrected Total    148       102563.4228

R-Square    Coeff Var    Root MSE    Weight Mean
0.000586     19.25923    26.40647       137.1107
Source    DF      Type I SS    Mean Square    F Value    Pr > F
Gym        1    60.07377356    60.07377356       0.09    0.7695

Total Variation Accounted For
Source    Eta-Square    Omega-Square    Conservative 90% Confidence Limits
Gym           0.0006         -0.0062    0.0000    0.0207

As noted earlier, the boxplots called into question the normality assumption. Let's see how badly skewed the within-cell distributions are.

proc sort; by sex gym;
proc means skewness kurtosis; var weight; by sex gym; run;
Analysis Variable : Weight

Sex=Male Gym=No
 Skewness      Kurtosis
0.8300845     0.7785344

Sex=Male Gym=Yes
 Skewness      Kurtosis
0.9175307    -0.1647111

Sex=Female Gym=No
 Skewness      Kurtosis
1.7153873     4.8202045

Sex=Female Gym=Yes
 Skewness      Kurtosis
0.9875084     0.3203789

We really ought to try a nonparametric analysis here.

proc npar1way wilcoxon; class gym; by sex; run;

The NPAR1WAY Procedure

Sex=Male

Wilcoxon Scores (Rank Sums) for Variable Weight
Classified by Variable Gym

Gym      N    Sum of Scores    Expected Under H0    Std Dev Under H0    Mean Score
No      23           454.50                598.0           52.758592     19.760870
Yes     28           871.50                728.0           52.758592     31.125000

Average scores were used for ties.
Wilcoxon Two-Sample Test
Statistic               454.5000
Normal Approximation
Z                        -2.7105
One-Sided Pr < Z          0.0034
Two-Sided Pr > |Z|        0.0067

Gym attendance is significantly associated with weight in the men.

Sex=Female

Wilcoxon Scores (Rank Sums) for Variable Age
Classified by Variable Gym

Gym      N    Sum of Scores    Expected Under H0    Std Dev Under H0    Mean Score
No      87          6415.50               6525.0          240.504738     73.741379
Yes     62          4759.50               4650.0          240.504738     76.766129

Average scores were used for ties.

Wilcoxon Two-Sample Test
Statistic              4759.5000
Normal Approximation
Z                         0.4532
One-Sided Pr > Z          0.3252
Two-Sided Pr > |Z|        0.6504

Gym attendance is not significantly related to weight in the women.
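The normal-approximation Z values can be recovered from the rank sums. The sketch below assumes Z is the statistic minus its expected value under H0, moved 0.5 toward zero as a continuity correction, and divided by the tie-corrected standard deviation under H0. That correction is an assumption about how the output was produced. Variable names are illustrative.

```sas
/* Sketch: normal approximation for the Wilcoxon rank-sum statistic.          */
/* Assumption: a 0.5 continuity correction toward zero; names are made up.    */
data _null_;
   /* Men: statistic is the rank sum for the smaller group (No) */
   s = 454.5;   e = 598.0;   sd = 52.758592;
   z = (s - e - 0.5*sign(s - e)) / sd;
   p = 2 * probnorm(-abs(z));          /* two-sided p value */
   put 'Men:   ' z= 8.4 p= 6.4;

   /* Women: statistic is the rank sum for the smaller group (Yes) */
   s = 4759.5;  e = 4650.0;  sd = 240.504738;
   z = (s - e - 0.5*sign(s - e)) / sd;
   p = 2 * probnorm(-abs(z));
   put 'Women: ' z= 8.4 p= 6.4;
run;
```

With these inputs the log should show Z and two-sided p values that match the two Wilcoxon tables above to four decimals.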