Module 13: Autocorrelation Problem
Module 15: Autocorrelation Problem (Contd.)

Rudra P. Pradhan
Vinod Gupta School of Management
Indian Institute of Technology Kharagpur, India
Email: rudrap@vgsom.iitkgp.ernet

Now, two more tests of autocorrelation can be done. One is the graphical representation and the other is the runs test.

First, the graph, which is presented below:

[Figure: scatter plot of u_i against u_{i-1}]

We can see there is a strong positive relation between $u_i$ and $u_{i-1}$, suggesting autocorrelation of the first order, i.e.,

$$u_i = \rho\, u_{i-1} + \varepsilon_i$$

For the runs test, in Column VI of the above table, the total number of positive values ($N_1$) and the total number of negative values ($N_2$) are counted. We also count that $u_i$ changed sign (from positive to negative, or from negative to positive) $R = 80$ times. So, for the runs test, we have $N_1 = 677$, $N_2 = 720$, $N = N_1 + N_2 = 1397$, and $R = 80$. So,

$$E(R) = \frac{2 N_1 N_2}{N} + 1 = \frac{2 \times 677 \times 720}{1397} + 1 = 698.84$$

$$\mathrm{Var}(R) = \sigma_R^2 = \frac{2 N_1 N_2 \,(2 N_1 N_2 - N)}{N^2 (N - 1)} = \frac{2 \times 677 \times 720 \,(2 \times 677 \times 720 - 1397)}{1397^2 \times 1396} = 348.34$$

Then, for a 95% confidence level (or equivalently, a 5% significance level), if the observed $R$ falls within $E(R) \pm 1.96\,\sigma_R$, the hypothesis of no autocorrelation is not rejected; if it falls outside, it can be said that autocorrelation exists. Now,

$$E(R) + 1.96\,\sigma_R = 698.84 + 1.96 \times \sqrt{348.34} = 735.42$$

$$E(R) - 1.96\,\sigma_R = 698.84 - 1.96 \times \sqrt{348.34} = 662.26$$
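The runs test arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `runs_test` and its return layout are illustrative choices, not part of the original module, and the counts are the ones quoted in the text.

```python
import math

def runs_test(n_pos, n_neg, runs, z=1.96):
    """Return E(R), Var(R), the acceptance interval and a rejection flag."""
    n = n_pos + n_neg
    two_n1n2 = 2 * n_pos * n_neg
    expected = two_n1n2 / n + 1
    variance = two_n1n2 * (two_n1n2 - n) / (n ** 2 * (n - 1))
    half_width = z * math.sqrt(variance)      # z times the standard deviation of R
    lower, upper = expected - half_width, expected + half_width
    reject = not (lower <= runs <= upper)     # True means R lies outside the interval
    return expected, variance, lower, upper, reject

# Counts quoted in the text: N1 = 677, N2 = 720, R = 80
expected, variance, lower, upper, reject = runs_test(677, 720, 80)
print(round(expected, 2), round(variance, 2))   # 698.84 348.34
print(round(lower, 2), round(upper, 2))         # 662.26 735.42
print(reject)
```

Note that the interval uses the standard deviation, the square root of Var(R), not the variance itself.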

But the observed R is 80, and it lies outside the interval [662.26, 735.42]. So, again we infer that there is autocorrelation.

So now, we can say that we cannot rely on the significance of the $\beta_i$'s that we got from OLS. We first have to remove the autocorrelation and then test significance, in order to draw correct conclusions about which factors affect gold prices. That is, our detection of autocorrelation is over, and we can now move to the next step, that is, removal of autocorrelation.

There are 4 methods to remove autocorrelation:
1. First difference method (applicable here, since $d < R^2$)
2. Find $\rho$ from the Durbin-Watson d statistic, and use Generalised Least Squares regression
3. Find $\rho$ from Eqn. 2, and use Generalised Least Squares regression
4. Use iterative methods like Cochrane-Orcutt (software is needed for this)

Method 1: First Difference Method

Write the model at time $t$ and at time $t-1$:

$$\text{Gold}_t = \beta_1 + \beta_2\,\text{USD}_t + \beta_3\,\text{Sensex}_t + \beta_4\,\text{Oil}_t$$

$$\text{Gold}_{t-1} = \beta_1 + \beta_2\,\text{USD}_{t-1} + \beta_3\,\text{Sensex}_{t-1} + \beta_4\,\text{Oil}_{t-1}$$

Subtracting the second equation from the first, the intercept cancels:

$$(\text{Gold}_t - \text{Gold}_{t-1}) = \beta_2 (\text{USD}_t - \text{USD}_{t-1}) + \beta_3 (\text{Sensex}_t - \text{Sensex}_{t-1}) + \beta_4 (\text{Oil}_t - \text{Oil}_{t-1})$$

$$\Delta\text{Gold}_t = \beta_2\,\Delta\text{USD}_t + \beta_3\,\Delta\text{Sensex}_t + \beta_4\,\Delta\text{Oil}_t$$

where $\Delta\text{Gold}_t = \text{Gold}_t - \text{Gold}_{t-1}$, $\Delta\text{USD}_t = \text{USD}_t - \text{USD}_{t-1}$, $\Delta\text{Sensex}_t = \text{Sensex}_t - \text{Sensex}_{t-1}$ and $\Delta\text{Oil}_t = \text{Oil}_t - \text{Oil}_{t-1}$. Here Gold is the gold price, USD the USD exchange rate and Oil the oil price.

The results from the OLS are as follows:

$$\Delta\text{Gold}_t = 25.293\,\Delta\text{USD}_t + (-0.182)\,\Delta\text{Sensex}_t + 0.165\,\Delta\text{Oil}_t$$

| | ΔUSD | ΔSensex | ΔOil |
|---|---|---|---|
| Coefficient | 25.293 | -0.182 | 0.165 |
| Std Error | 19.1136 | 0.027703 | 0.023547 |
| t | 1.323284 | -6.56367 | 6.995892 |

R² = 0.036757, Durbin-Watson d = 2.061111; d_L = 1.645 and d_U = 1.692 (from Durbin-Watson tables).
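The mechanics of Method 1 can be sketched in Python with NumPy. The gold price data set is not reproduced here, so the sketch runs on synthetic data with random-walk errors (an assumption made purely for illustration). The helper names `first_difference`, `ols_no_intercept`, `durbin_watson` and `dw_decision` are hypothetical. Only the cut-offs d_L = 1.645, d_U = 1.692 and the value d = 2.061111 come from the results above.

```python
import numpy as np

def first_difference(series):
    """Return x_t - x_{t-1}; the result is one observation shorter."""
    s = np.asarray(series, dtype=float)
    return s[1:] - s[:-1]

def ols_no_intercept(y, X):
    """OLS through the origin (the intercept cancels after differencing)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def durbin_watson(resid):
    """d = sum of squared successive differences / sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

def dw_decision(d, d_lower, d_upper):
    """Classify d using the usual five Durbin-Watson zones."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Synthetic illustration only: y = 5 + 2x + u, with u a random walk (rho = 1)
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.normal(size=n))
u = np.cumsum(rng.normal(size=n))
y = 5.0 + 2.0 * x + u

dy, dx = first_difference(y), first_difference(x)
beta, resid = ols_no_intercept(dy, dx.reshape(-1, 1))
d_synthetic = durbin_watson(resid)
print(beta, d_synthetic)

# Decision for the d statistic reported in the text
print(dw_decision(2.061111, 1.645, 1.692))
```

The regression is run without an intercept because the constant drops out when the lagged equation is subtracted.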

We see that the Durbin-Watson d statistic shows that there is no autocorrelation in the first difference equation.

[Figure: Durbin-Watson decision scale with d_L = 1.645, d_U = 1.692, 4 - d_U = 2.308, 4 - d_L = 2.355, and calculated d = 2.061]

Now we can believe the values of the $\beta_i$'s. The t statistic for $\beta_2$ is less than 2 in absolute value, so it is statistically insignificant => USD exchange rate does not affect gold price significantly; the t statistic for $\beta_3$ is more than 2 in absolute value, so it is statistically significant => Sensex affects gold price significantly and negatively; and the t statistic for $\beta_4$ is more than 2, so it is statistically significant => oil price affects gold price significantly and positively.

Method 2: Find $\rho$ from the Durbin-Watson d statistic, and use Generalised Least Squares Regression

From the original OLS, we got the Durbin-Watson d statistic as 0.039. It has been established that for large samples, $d \approx 2(1 - \hat\rho)$. Then,

$$\hat\rho \approx 1 - \frac{d}{2} = 1 - \frac{0.039}{2} = 0.98$$

Write the model at time $t$, and the model at time $t-1$ multiplied by $\rho$:

$$\text{Gold}_t = \beta_1 + \beta_2\,\text{USD}_t + \beta_3\,\text{Sensex}_t + \beta_4\,\text{Oil}_t$$

$$\rho\,\text{Gold}_{t-1} = \rho\beta_1 + \rho\beta_2\,\text{USD}_{t-1} + \rho\beta_3\,\text{Sensex}_{t-1} + \rho\beta_4\,\text{Oil}_{t-1}$$

Subtracting:

$$(\text{Gold}_t - \rho\,\text{Gold}_{t-1}) = (\beta_1 - \rho\beta_1) + \beta_2 (\text{USD}_t - \rho\,\text{USD}_{t-1}) + \beta_3 (\text{Sensex}_t - \rho\,\text{Sensex}_{t-1}) + \beta_4 (\text{Oil}_t - \rho\,\text{Oil}_{t-1})$$

$$\text{Gold}^*_t = \beta_1 (1 - \rho) + \beta_2\,\text{USD}^*_t + \beta_3\,\text{Sensex}^*_t + \beta_4\,\text{Oil}^*_t$$

where $\text{Gold}^*_t = \text{Gold}_t - \rho\,\text{Gold}_{t-1}$, $\text{USD}^*_t = \text{USD}_t - \rho\,\text{USD}_{t-1}$, $\text{Sensex}^*_t = \text{Sensex}_t - \rho\,\text{Sensex}_{t-1}$ and $\text{Oil}^*_t = \text{Oil}_t - \rho\,\text{Oil}_{t-1}$.

We compute these transformed values for every observation [Table: transformed values Gold*, USD*, Sensex*, Oil*] and then run OLS on them.
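A short sketch of the two steps in Method 2: estimating ρ from d, then forming the generalised differences. The function names `rho_from_dw` and `quasi_difference` and the four-point example series are illustrative assumptions. Only d = 0.039 is taken from the text.

```python
import numpy as np

def rho_from_dw(d):
    """Large-sample approximation: d is roughly 2(1 - rho), so rho is roughly 1 - d/2."""
    return 1.0 - d / 2.0

def quasi_difference(series, rho):
    """Return x_t - rho * x_{t-1}; the first observation is lost."""
    s = np.asarray(series, dtype=float)
    return s[1:] - rho * s[:-1]

rho_hat = rho_from_dw(0.039)
print(round(rho_hat, 2))   # 0.98

# Hypothetical four-point series, only to show the shape of the transform
example = [100.0, 102.0, 101.0, 105.0]
print(quasi_difference(example, rho_hat))
```

With ρ = 1 the same transform reduces to the first difference of Method 1. Because the constant term becomes β₁(1 − ρ) after transformation, the original intercept is recovered by dividing the estimated constant by (1 − ρ̂).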

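Estimated generalised difference equation, where each starred variable is the generalised difference of the original variable (for example, $\text{Gold}^*_t = \text{Gold}_t - \hat\rho\,\text{Gold}_{t-1}$):

$$\text{Gold}^*_t = 25.293\,\text{USD}^*_t + (-0.182)\,\text{Sensex}^*_t + 0.165\,\text{Oil}^*_t$$

| | USD* | Sensex* | Oil* |
|---|---|---|---|
| Coefficient | 25.293 | -0.182 | 0.165 |
| Std Error | 19.1136 | 0.027703 | 0.023547 |
| t | 1.323284 | -6.56367 | 6.995892 |

R² = 0.036757, Durbin-Watson d = 1.775142; d_L = 1.645 and d_U = 1.692 (from Durbin-Watson tables).

Here also the Durbin-Watson d statistic shows that there is no autocorrelation in the generalised equation.

[Figure: Durbin-Watson decision scale with d_L = 1.645, d_U = 1.692, 4 - d_U = 2.308, 4 - d_L = 2.355, and calculated d = 1.775]

Now we can believe the values of the $\beta_i$'s. The t statistic for $\beta_2$ is less than 2 in absolute value, so it is statistically insignificant => USD exchange rate does not affect gold price significantly; the t statistic for $\beta_3$ is more than 2 in absolute value, so it is statistically significant => Sensex affects gold price significantly and negatively; and the t statistic for $\beta_4$ is more than 2, so it is statistically significant => oil price affects gold price significantly and positively.

Method 3: Find $\rho$ from Eqn. 2, and use Generalised Least Squares Regression

Using the values of $u_t$ and $u_{t-1}$ which were computed in the first table (Page 3), we estimate Eqn. 2 using OLS, and get the following:

$$u_t = 0.979031\,u_{t-1} + \varepsilon_t, \qquad R^2 = 0.961$$

That is, $\hat\rho = 0.979 \approx 0.98$, the same as in Method 2. The results of the Generalised Least Squares regression are therefore also the same.

Method 4: Iterative Method, specifically the Cochrane-Orcutt procedure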
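In outline, the procedure repeats the steps of Methods 3 and 2: run OLS on the levels, estimate ρ by regressing the residuals on their own lag, re-estimate the coefficients on the generalised differences, recompute the residuals in levels, and continue until ρ̂ stops changing. The following is a minimal sketch under those assumptions, run on synthetic AR(1) data rather than the gold price series. The function names, tolerance and iteration cap are illustrative, and this is not the routine that any particular software package uses.

```python
import numpy as np

def estimate_rho(resid):
    """Slope of a no-intercept regression of u_t on u_{t-1}."""
    return float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))

def cochrane_orcutt(y, X, tol=1e-6, max_iter=100):
    """y: shape (n,). X: shape (n, k), regressors without a constant column."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    Z = np.column_stack([np.ones(len(y)), X])        # add the intercept column
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)     # initial OLS on levels
    rho = 0.0
    for _ in range(max_iter):
        resid = y - Z @ beta                         # residuals in original levels
        rho_new = estimate_rho(resid)
        y_star = y[1:] - rho_new * y[:-1]            # generalised differences
        Z_star = Z[1:] - rho_new * Z[:-1]            # constant column becomes (1 - rho)
        beta, *_ = np.linalg.lstsq(Z_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho

# Synthetic illustration only: y = 1 + 2x + u, with u_t = 0.7 u_{t-1} + e_t
rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=n)
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
y = 1.0 + 2.0 * x + u

beta_co, rho_co = cochrane_orcutt(y, x.reshape(-1, 1))
print(beta_co, rho_co)
```

Because the transformed constant column equals (1 − ρ̂), the first element of `beta_co` is already the intercept of the original levels equation.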

The algebra here is more involved, so I shall not write about it. In fact, this method is normally available in any software that handles time series analysis. Here I give the output:

$$\text{Gold price} = 3645.848 + 25.801 \times \text{USD Exch. Rate} + (-0.181) \times \text{Sensex} + 0.165 \times \text{Oil price}$$

| | Intercept | USD Exch. Rate | Sensex | Oil price |
|---|---|---|---|---|
| Coefficient | 3645.848 | 25.801 | -0.181 | 0.165 |
| Std Error | 3267.98069 | 19.119 | 0.028 | 0.024 |
| t | 1.116 | 1.349 | -6.554 | 6.994 |

R² = 0.037, Durbin-Watson d = 2.04; d_L = 1.645 and d_U = 1.692 (from Durbin-Watson tables).

Here also the Durbin-Watson d statistic shows that there is no autocorrelation in the generalised equation. Now we can believe the values of the $\beta_i$'s. The t statistic for $\beta_2$ is less than 2 in absolute value, so it is statistically insignificant => USD exchange rate does not affect gold price significantly; the t statistics for $\beta_3$ and $\beta_4$ are more than 2 in absolute value, so they are statistically significant => Sensex and oil price affect gold price significantly.

Final comment: From the OLS of Eqn 1, all coefficients were highly significant. But autocorrelation was detected and removed, after which only 2 coefficients remained significant. I hope this clarifies why it is necessary to detect and remove autocorrelation.

References for Further Reading:

Dielman, Terry E.: Applied Regression Analysis for Business and Economics, PWS-Kent, Boston, 1991.
Draper, N. R., and H. Smith: Applied Regression Analysis, 3d ed., John Wiley & Sons, New York, 1998.
Frank, C. R., Jr.: Statistics and Econometrics, Holt, Rinehart and Winston, New York, 1971.
Goldberger, Arthur S.: Introductory Econometrics, Harvard University Press, 1998.
Graybill, F. A.: An Introduction to Linear Statistical Models, vol. 1, McGraw-Hill, New York, 1961.
Greene, William H.: Econometric Analysis, 4th ed., Prentice Hall, Englewood Cliffs, N.J., 2000.
Griffiths, William E., R. Carter Hill, and George G. Judge: Learning and Practicing Econometrics, John Wiley & Sons, New York, 1993.
Gujarati, Damodar N.: Essentials of Econometrics, 2d ed., McGraw-Hill, New York, 1999.
Hill, Carter, William Griffiths, and George Judge: Undergraduate Econometrics, John Wiley & Sons, New York, 2001.
Hu, Teh-Wei: Econometrics: An Introductory Analysis, University Park Press, Baltimore, 1973.
Johnston, J.: Econometric Methods, 3d ed., McGraw-Hill, New York, 1984.

Katz, David A.: Econometric Theory and Applications, Prentice Hall, Englewood Cliffs, N.J., 1982.
Klein, Lawrence R.: An Introduction to Econometrics, Prentice Hall, Englewood Cliffs, N.J., 1962.
Koop, Gary: Analysis of Economic Data, John Wiley & Sons, New York, 2000.
Koutsoyiannis, A.: Theory of Econometrics, Harper & Row, New York, 1973.
Maddala, G. S.: Introduction to Econometrics, 3d ed., John Wiley & Sons, New York, 2001.
Mills, T. C.: The Econometric Modelling of Financial Time Series, Cambridge University Press, 1993.
Mittelhammer, Ron C., George G. Judge, and Douglas J. Miller: Econometric Foundations, Cambridge University Press, New York, 2000.
Mukherjee, Chandan, Howard White, and Marc Wuyts: Econometrics and Data Analysis for Developing Countries, Routledge, New York, 1998.
Pindyck, R. S., and D. L. Rubinfeld: Econometric Models and Econometric Forecasts, 4th ed., McGraw-Hill, New York, 1990.
Verbeek, Marno: A Guide to Modern Econometrics, John Wiley & Sons, New York, 2000.
Walters, A. A.: An Introduction to Econometrics, Macmillan, London, 1968.
Wooldridge, Jeffrey M.: Introductory Econometrics, South-Western College Publishing, 2000.

FAQs (Frequently Asked Questions):

1. Estimation using OLS on autocorrelated data results in the parameter estimates being
a) Biased
b) Inconsistent
c) Asymptotically normally distributed
d) Inefficient
e) Efficient

2. For the null hypothesis H_0 of no autocorrelation not to be rejected, the Durbin-Watson d statistic should be
a) Equal to 2
b) Equal to zero
c) Equal to 4
d) Equal to 1
e) Equal to 10

3. One of the easiest ways of detecting autocorrelation is the graphical method, where we
a) Plot the error terms against their standardized values
b) Plot the error terms against each X variable
c) Plot the error terms against each Y variable
d) Plot the error terms against time

4. Which of the following is not an important assumption for computing the d statistic?
a) The regression model includes an intercept term
b) The explanatory variables are fixed in repeated sampling
c) The error terms are generated by the first-order autoregressive scheme
d) The error terms are not correlated with each other

Self Evaluation Tests/Quizzes

1. In a study of the determination of prices of final output at factor cost in the United Kingdom, the following results were obtained on the basis of annual data for the period 1951-1969:

$$PF_t = 2.033 + 0.273\,W_t - 0.521\,X_t + 0.256\,M_t + 0.028\,M_{t-1} + 0.121\,PF_{t-1}$$

se = (0.992) (0.127) (0.099) (0.024) (0.039) (0.119)
R² = 0.984, d = 2.54

where PF = prices of final output at factor cost, W = wages and salaries per employee, X = gross domestic product per person employed, M = import prices, $M_{t-1}$ = import prices lagged 1 year, and $PF_{t-1}$ = prices of final output at factor cost in the previous year.

"Since for 18 observations and 5 explanatory variables, the 5 percent lower and upper d values are 0.71 and 2.06, the estimated d value of 2.54 indicates that there is no positive autocorrelation." Comment.

2. Given a sample of 50 observations and 4 explanatory variables, what can you say about autocorrelation if
(a) d = 1.05?
(b) d = 1.40?
(c) d = 2.50?
(d) d = 3.97?

3. Give circumstances under which each of the following methods of estimating the first-order coefficient of autocorrelation ρ may be appropriate:
a) First-difference regression
b) Moving average regression
c) Theil-Nagar transform
d) Cochrane and Orcutt iterative procedure
e) Hildreth-Lu scanning procedure
f) Durbin two-step procedure

4. State whether the following statements are true or false. Briefly justify your answer.
a) When autocorrelation is present, OLS estimators are biased as well as inefficient.
b) The Durbin-Watson d test assumes that the variance of the error term $u_t$ is homoscedastic.
c) The first-difference transformation to eliminate autocorrelation assumes that the coefficient of autocorrelation ρ is 1.
d) The R² values of two models, one involving regression in the first-difference form and another in the level form, are not directly comparable.
e) A significant Durbin-Watson d does not necessarily mean there is autocorrelation of the first order.
f) In the presence of autocorrelation, the conventionally computed variances and standard errors of forecast values are inefficient.
g) The exclusion of an important variable(s) from a regression model may give a significant d value.