CHAPTER 7

MULTIPLE REGRESSION

ANSWERS TO PROBLEMS AND CASES

5. $\hat{Y} = 7.52 + 3(20) - 12.2(7) = -17.88$

6. a. A correlation matrix displays the correlation coefficients between every possible pair of variables in the analysis.
   b. The percentage of Y's variability that can be explained by the predictor variables is measured by $R^2$, also referred to as the coefficient of multiple determination.
   c. Collinearity results when predictor variables are too highly correlated among themselves.
   d. A residual is the difference between an actual Y value and the value predicted using the sample regression plane.
   e. A dummy variable in a regression analysis is a qualitative or categorical variable that is used as a predictor variable.
   f. The stepwise regression technique enters variables into the regression equation one at a time until all have been analyzed.

9. a. Both predictor variables are significantly related to the dependent variable. The predictor variables are also highly correlated with each other, indicating potential multicollinearity.
   b. When income increases by one thousand dollars with family size held constant, annual food expenditures increase by an average of 8 dollars. When family size increases by one person with income held constant, annual food expenditures decrease by an average of 41 dollars. Since family size is positively related to food expenditures (r = .737), a decrease does not make sense.
   c. Multicollinearity is a problem, as indicated by VIFs of 4.0. Use only one of the predictor variables.

12. Correlations:

                 Sales  Outlets
    Outlets      0.739
    Auto         0.548    0.670

    The regression equation is
    Sales = 10.1 + 0.0110 Outlets + 0.195 Auto

    Predictor       Coef     StDev     T      P
    Constant      10.109     7.220  1.40  0.199
    Outlets     0.010989  0.005200  2.11  0.068
    Auto          0.1947    0.6398  0.30  0.769

    S = 10.31   R-Sq = 55.1%   R-Sq(adj) = 43.9%

    Analysis of Variance
    Source       DF      SS     MS     F      P
    Regression    2  1043.7  521.8  4.91  0.041
    Error         8   849.6  106.2
    Total        10  1893.2

    Source    DF  Seq SS
    Outlets    1  1033.8
    Auto       1     9.8

      Fit  StDev Fit        95.0% CI        95.0% PI
    41.51       5.08  (29.80, 53.22)  (15.01, 68.02)

    a. Number of retail outlets is positively related to annual sales, $r_{12} = .74$, and is potentially a good predictor variable. Number of automobiles registered is moderately related to annual sales, $r_{13} = .55$, but because of collinearity, $r_{23} = .67$, it will not be a good predictor variable when used in conjunction with number of retail outlets.
    b. $\hat{Y} = 10.11 + .011(2011) + .195(24.6) = 36.997$ (million)
       $e = (Y - \hat{Y}) = (52.3 - 36.997) = 15.303$
    c. $\hat{Y} = 10.11 + .011(2500) + .195(20.2) = 41.549$ (million)
    d. The standard error of estimate is 10.3, which is quite large. The equation explains only 55% of the variance in sales. The prediction in part (c) is not very accurate.
    e. $s_{y \cdot x} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - k}} = \sqrt{\frac{849.565}{11 - 3}} = \sqrt{106.195} = 10.3$
    f. If one retail outlet is added while the number of automobiles registered remains constant, sales will increase by an average of .011 million, or $11,000. If one million more automobiles are registered while the number of retail outlets remains constant, sales will increase by an average of .195 million, or $195,000. These regression coefficients are not valid due to collinearity.
    g. New predictor variables should be tried.

14. a. Reject $H_0: \beta_2 = 0$ if $t > 3.1$ or if $t < -3.1$.
       $t = .65 / .05 = 13$
       Reject $H_0$ and conclude that the regression coefficient for the aptitude test variable is significantly different from zero in the population.
       Reject $H_0: \beta_3 = 0$ if $t > 3.1$ or if $t < -3.1$.
       $t = 20.6 / 1.69 = 12.2$
       Reject $H_0$ and conclude that the regression coefficient for the effort index variable is significantly different from zero in the population.
    b. If the effort index increases one point while the aptitude test score remains constant, sales performance increases by an average of $20.600.
    c. $\hat{Y} = 16.57 + .65(75) + 20.6(.5) = 75.62$
    d. $s_{y \cdot x}^2 (n - 3) = (3.56)^2 (14 - 3) = 139.4$
    e. $s_y^2 (n - 1) = (16.57)^2 (14 - 1) = 3569.3$
    f. $R^2 = 1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2} = 1 - \frac{139.4}{3569.3} = 1 - .039 = .961$
       We can explain 96.1% of the variance in sales performance through our knowledge of the relationships between sales, aptitude, and effort index.
    g. $\bar{R}^2 = 1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)} = 1 - \frac{134.90/11}{3569.3/13} = .955$
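The arithmetic in Problems 12 and 14 can be checked with a short script. The following is a minimal Python sketch, not part of the original solutions. It assumes the full-precision coefficients from the printed regression output (10.109, 0.010989, 0.1947), and it assumes inputs of 2011 outlets, 24.6 million registered automobiles, and actual sales of 52.3 million for part 12(b), values inferred from the printed results. The name `predict_sales` is illustrative only.

```python
import math

# Problem 12: coefficients as printed in the regression output (assumed full precision).
B0, B_OUTLETS, B_AUTO = 10.109, 0.010989, 0.1947


def predict_sales(outlets, autos):
    """Predicted annual sales (millions) from the fitted regression plane."""
    return B0 + B_OUTLETS * outlets + B_AUTO * autos


# Part 12(b): inputs and actual sales are assumptions inferred from the printed arithmetic.
y_hat = predict_sales(2011, 24.6)
residual = 52.3 - y_hat

# Part 12(e): standard error of estimate from the error sum of squares,
# with n = 11 observations and k = 3 variables in total.
sse, n, k = 849.565, 11, 3
s_yx = math.sqrt(sse / (n - k))

# Problem 14(d)-(f): rebuild SSE and SST from the standard deviations, then R-squared.
sse14 = 3.56 ** 2 * (14 - 3)
sst14 = 16.57 ** 2 * (14 - 1)
r_squared = 1 - sse14 / sst14

print("12(b) prediction and residual:", round(y_hat, 3), round(residual, 3))
print("12(e) standard error of estimate:", round(s_yx, 1))
print("14(d)-(f) SSE, SST, R-squared:", round(sse14, 1), round(sst14, 1), round(r_squared, 3))
```

With the full-precision coefficients, the part (b) prediction agrees with the printed 36.997. Using the rounded coefficients shown in the equation (10.11, .011, .195) gives a slightly larger value, so small differences of this kind are rounding effects rather than errors in the method.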
CHAPTER 8

REGRESSION WITH TIME SERIES DATA

ANSWERS TO PROBLEMS AND CASES

1. a. The best model uses earnings per share and payout ratio to forecast common shareholders. The $R^2$ for this model is 54.3%. A model that uses only dividends per share explains 52.44 percent of the variance in common shareholders.
   b. Serial correlation is a serious problem with both models.
   c. A new predictor variable might be the answer. Otherwise, some of the standard approaches to eliminating serial correlation need to be attempted.

   Correlations:

                Share  Earnings  Dividend
    Earnings    0.568
    Dividend    0.724      0.71
    Payout      0.441    -0.049      0.66

    The regression equation is
    Share = 5107 + 6536 Earnings + 170 Payout

    Predictor     Coef  StDev     T      P
    Constant      5107   5654  0.90  0.375
    Earnings      6536   1496  4.37  0.000
    Payout      169.79  48.86  3.48  0.002

    S = 3795   R-Sq = 54.3%   R-Sq(adj) = 50.7%

    Analysis of Variance
    Source       DF         SS         MS      F      P
    Regression    2  428445044  214222522  14.87  0.000
    Error        25  360081348   14403254
    Total        27  788526392

    Source     DF     Seq SS
    Earnings    1  254506714
    Payout      1  173938329

    Durbin-Watson statistic = 0.7

14. a. The best model lags permits by 2 quarters: $\hat{Y} = 0.36 + 9.33X$, $r^2 = .90$
    b. D.W. = 1.47. No autocorrelation.
    c. $\hat{Y} = 16.6 + 8.8X_2 + 30X_3$, $r^2 = .90$
    d. No. The simple regression equation explains almost as much variance as the multiple regression equation.
    e. No.
    f. 2004 forecasts:
         1st quarter  177
         2nd quarter  11
       Forecasts for the 3rd and 4th quarters can be made using several different approaches. This is best left to the student and makes for an excellent discussion of why a particular method was used. One method that will definitely be suggested is to average the past values for the 1st and 2nd quarters and use the average in the model. This results in the forecasts:
         3rd quarter  509
         4th quarter  38
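The Durbin-Watson statistics quoted in Problems 1 and 14 are computed from a model's residuals. As a minimal sketch (not the software output used in the text), the statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The function name `durbin_watson` and the residual values below are hypothetical and chosen only for illustration; they are not data from the problems.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order.

    Numerator: sum of squared differences between successive residuals.
    Denominator: sum of squared residuals.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Hypothetical residuals (not from the text) with long runs of the same sign,
# the pattern that positive serial correlation produces.
example_residuals = [2.0, 1.5, 1.0, 0.5, -0.5, -1.0, -1.5, -2.0]
dw = durbin_watson(example_residuals)
print("Durbin-Watson statistic:", round(dw, 2))
```

Values near 2 are consistent with no first-order serial correlation, while values well below 2 point toward positive serial correlation. A formal conclusion still requires comparing the statistic with the tabled lower and upper bounds for the sample size and number of predictors.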