Homework Assignments for BusAdm 713: Business Forecasting Methods. Assignment 1: Introduction to forecasting, Review of regression

Note: Problem points are in parentheses.

1. (3) Complete the exercises listed on pages 57, 59, 61, and 62 of my online probability and statistics tutorial, www4.uwm.edu/people/haas/learnprobstat
2. (1) List the ways in which the sales, marketing, and finance departments within a firm often apply judgmental overrides to the demand forecast, and explain their political motives for doing so.
3. (1) In the real world, how reliable are forecasts based on a simple exponential smoother?
4. (1) In Chase, Jr.'s opinion, what is the best way to form a demand forecast?
5. (1) Say that on days 1-5, you observe stock A's price to be 14, 19, 20, 12, and 15, respectively. On these same days, you observe stock B's price to be 7, 9, 22, 14, and 17. Compute by hand the sample covariance and the sample correlation between these two time series. Now use proc corr to check your results.
6. (2) Using the relationships on BD, p. 10 (including $d = 2(1 - \rho)$), compare the approximate critical value found with lines 6 and 7 to the value in a Durbin-Watson table (1 predictor). At what n does the approximation give a critical value that differs from the correct one by more than 10%?
7. (1) Rerun the Treasury Bills example on p. 24 with proc autoreg. Select NLAG so that the Durbin-Watson test of the final residuals does not reject.
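The by-hand calculation in problem 5 can be cross-checked outside SAS before running proc corr. The following is a minimal Python sketch, assuming the usual n - 1 divisor for the sample covariance. The function names sample_cov and sample_corr are illustrative and are not part of the course materials.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor (an assumption stated above)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation: covariance divided by the product of standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# Prices from problem 5, days 1-5.
stock_a = [14, 19, 20, 12, 15]
stock_b = [7, 9, 22, 14, 17]

cov_ab = sample_cov(stock_a, stock_b)
corr_ab = sample_corr(stock_a, stock_b)
print("sample covariance:", cov_ab)
print("sample correlation:", round(corr_ab, 4))
```

The sample variance of a series is obtained here as sample_cov(x, x), which keeps the divisor consistent between the covariance and the correlation.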

Assignment 2: Demand-driven forecasting, Regression-based forecasting

1. (1) Create a sequential, step-by-step procedure for Harley-Davidson to follow to create a weekly demand forecast for their new Dyna Switchback motorcycle. For each step, indicate (a) the responsible department, (b) needed data, (c) needed software, and/or (d) reports to be produced.
2. (1) The following data set consists of product demand observed at 5 time points:

   Time  Demand
   1     5
   2     8
   3     9
   4     12
   5     13

   Let demand at time t be denoted $Y_t$. After regressing $Y_t$ onto $\ln t$, forecast demand at t = 7. Give a 95% prediction interval for this forecast. Use the Durbin-Watson statistic to test for autocorrelation of the error term.
3. (NOT ASSIGNED but an instructive exercise) Verify the values on lines 4, 5, and 7 in BD, p. 30. The first line of this page is: "and the forecast error variance".
4. (2) Show how the equation that is the third line from the bottom of p. 40 was obtained. The second to last line of this page is: "Thus, you can write $Y_t$ as".
5. (1) Show how equation 2.7, p. 41 was obtained.
6. (2) Show how line 1, p. 43 was obtained. This line begins with: "These can be solved with".
7. (2) Why does the absolute value of M have to be greater than 1 in line 6, p. 44? The first line of this page begins with: "Surprisingly,".
8. (1) Refit the silver data with proc autoreg where NLAG is set to 4. Compare your output to the output on p. 47, "Output 2.5... (continued)". What is different? Why?
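The steps in problem 2 (regress, forecast, prediction interval, Durbin-Watson statistic) can be sketched in plain Python as a cross-check on SAS output. This is a minimal sketch assuming ordinary least squares with normally distributed errors. The constant T_CRIT = 3.182 is the 0.975 quantile of Student's t with n - 2 = 3 degrees of freedom, and all variable names are illustrative.

```python
from math import log, sqrt

# Data from problem 2.
times = [1, 2, 3, 4, 5]
demand = [5, 8, 9, 12, 13]

x = [log(t) for t in times]          # predictor: ln t
n = len(demand)
x_bar = sum(x) / n
y_bar = sum(demand) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, demand))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, demand)]
sse = sum(e * e for e in residuals)
s = sqrt(sse / (n - 2))              # residual standard error

# Point forecast and 95% prediction interval at t = 7.
x_new = log(7)
forecast = intercept + slope * x_new
T_CRIT = 3.182                       # t quantile, 0.975, 3 df (assumption stated above)
half_width = T_CRIT * s * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
lower, upper = forecast - half_width, forecast + half_width

# Durbin-Watson statistic: squared successive differences over SSE.
dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sse

print("forecast at t = 7:", round(forecast, 3))
print("95% prediction interval:", (round(lower, 3), round(upper, 3)))
print("Durbin-Watson d:", round(dw, 3))
```

The sketch only computes the Durbin-Watson statistic; the test itself still requires the tabulated bounds for n = 5 and one predictor.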

Assignment 3: Forecasting methods overview, Autoregression-based forecasting

1. (2) Using Chase Jr., pp. 35-36, explain how a survey could be used to replace a judgmental override when creating a demand forecast for the nearly-new product of primer-plus-finish-coat house paint.
2. (2) Compute and plot the ACF for the noiseless data set {1 2 3 4 5 6 7 8 9}. Why is the function a negatively sloping line in the time lag?
3. (3) Simulate a sample of 15 independent normal deviates having a mean of 2 and a variance of 14. Compute and plot the ACF. Now simulate a sample of 5000 deviates with the same distribution and plot their ACF. Why do the two plots have different autocorrelation values at identical lags? Why are the confidence bands different between the two plots? Which plot, if either, indicates autocorrelation? Why?
4. (3) Read Jagpal et al. (1982) and look at g:/haas T/busadm713/trnsloghints.pdf. I have written SAS code (g:/haas T/busadm713/pcreg.sas) that does the following.
   (a) Reads in the four columns of Lydia Pinkham Company data available at g:/haas T/jagpal/pinkham.dat. Column 1 is the month, column 2 is the year, column 3 is advertising expenditures (in $100s), and column 4 is sales (also in $100s).
   (b) Creates all necessary variables and then uses principal components regression to fit the 2-period translog model to this data.
   (c) Creates new variables as necessary that are simply the values of other variables lagged by the number of time steps j. For example, if 5 9 3 7 contains observations on advertising at times 1, 2, 3, and 4, then a new variable composed of this original variable lagged back 1 time step would be . 5 9 3, where "." denotes a missing value (since there is no value earlier in time than the first observation).
   (d) Computes the marginal productivity, $MP_i(t)$, for advertising at lags i = 0, 1, and 2 for each month t. Also computes the total yield, $S_t^c = \sum_{i=0}^{2} MP_i(t+i)$, for each month. Finally, a plot of $S_t^c$ as a function of t is drawn.

   Questions:
   (a) What are the advantages of using translog functions to model the relationship between advertising and sales?
   (b) Why is it difficult to justify a negative marginal productivity?
   (c) Insert the two lines "proc print; run;" into the code just after Step 7's data step command set, and just after the proc transpose command set. Exactly what modifications have been made to data set a?
   (d) Run the code and produce the plot. Are the total yield values from the PC regression similar to those computed from the ridge regression parameter estimates?
   (e) What patterns of advertising budget for the current month, the next month, and two months in the future should management adopt, i.e., what patterns of advertising expenditure give the highest total yield ($S_t^c$)?
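The lagging step in problem 4(c) and the total-yield sum in 4(d) can be illustrated with a short Python sketch. This is not a translation of pcreg.sas: the function names lag and total_yield and the marginal-productivity numbers are hypothetical, and None stands in for the SAS missing value ".".

```python
def lag(values, j):
    """Return values lagged back j time steps, padding the start with None."""
    if j == 0:
        return list(values)
    padding = [None] * min(j, len(values))
    return padding + list(values[:-j])

def total_yield(mp, t):
    """Sum MP_i(t + i) for i = 0, 1, 2, using 0-based month index t.

    mp[i][t] holds the marginal productivity at lag i for month t.
    Returns None when month t + 2 lies beyond the end of the data.
    """
    n_months = len(mp[0])
    if t + 2 >= n_months:
        return None
    return sum(mp[i][t + i] for i in range(3))

# The advertising example from problem 4(c).
advertising = [5, 9, 3, 7]
print(lag(advertising, 1))

# Hypothetical marginal productivities for four months at lags 0, 1, 2.
mp_demo = [
    [1.0, 2.0, 3.0, 4.0],   # lag 0
    [0.5, 0.5, 0.5, 0.5],   # lag 1
    [0.1, 0.2, 0.3, 0.4],   # lag 2
]
print(total_yield(mp_demo, 0))
print(total_yield(mp_demo, 2))
```

The total yield for month t needs marginal productivities through month t + 2, so the last two months of any data set have no total yield.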

Assignment 4: Forecast performance

1. (5) Perform a residual analysis of the regression model fitted in Assignment 3, problem 4. Create all residual plots. Compute the PRESS and MAPE statistics. Compute the Durbin-Watson and Modified Box-Pierce (Ljung-Box) tests for zero autocorrelation. Also, compute the Lagrange Multiplier test for homoscedasticity (equal variances across time) for the time predictor only. Use proc autoreg to perform these tests.
2. (3) Compute the last three one-step-ahead prediction errors of the regression-based predictor of Assignment 2, problem 2 and plot them by time. This means that you predict the value at time 3 using the data at times 1 and 2; then predict the value at time 4 using the data at times 1, 2, and 3; and then predict the value at time 5 using the data at times 1, 2, 3, and 4. You will need to run proc reg three times to compute these forecasts. Do not use proc arima.
3. (2) Repeat problem 2 but use the Naive (Simple) Forecast (each prediction equals the value at the current time). This forecasting method involves no calculation, so compute these forecasts by hand. Which of these two forecast methods has a lower one-step-ahead prediction error?
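The expanding-window scheme described in problems 2 and 3 can be sketched in Python to check the three proc reg runs. This is a minimal sketch assuming the Assignment 2, problem 2 model (demand regressed on ln t, fitted by least squares); the helper fit_line and the variable names are illustrative.

```python
from math import log

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Data from Assignment 2, problem 2.
times = [1, 2, 3, 4, 5]
demand = [5, 8, 9, 12, 13]

regression_errors = []
naive_errors = []
# k is the 0-based index of the time being predicted: times 3, 4, and 5.
for k in range(2, len(times)):
    # Fit using only observations strictly earlier than the target time.
    x_train = [log(t) for t in times[:k]]
    intercept, slope = fit_line(x_train, demand[:k])
    prediction = intercept + slope * log(times[k])
    regression_errors.append(demand[k] - prediction)
    # Naive forecast: the prediction equals the most recent observed value.
    naive_errors.append(demand[k] - demand[k - 1])

for t, e_reg, e_naive in zip(times[2:], regression_errors, naive_errors):
    print(f"time {t}: regression error {e_reg:.3f}, naive error {e_naive}")
```

Each error is computed as the observed value minus the prediction, so a positive error means the method under-forecast demand.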

Assignment 5: Smoothing methods, Introduction to ARIMA modeling

1. (1) Write the moving average smoother, $\frac{1}{4}[1, 1, 1, 1]$, in backshift notation (B).
2. (2) Consider the noiseless time series 2 4 6 8 10 12 14 16 18 20. Compute a period 11 forecast using (a) single exponential smoothing, (b) double exponential smoothing, and (c) Holt's method of linear exponential smoothing. Which of these methods is most appropriate? Why? What value of $\alpha$ did you use in method (a)? How can you explain it in light of the relationship
   $F_{t+1} = \alpha X_t + (1 - \alpha) F_t = F_t + \alpha (X_t - F_t)$
   where $F_t$ is the forecasted value at time t, and $X_t$ is the observed time series at time t? Explain how you selected values for the constants in methods (b) and (c).
3. (2) Apply the Holt-Winters smoother to the following demand data and forecast demand for the next three seasons. Find parameter values by minimizing the MAPE.

   Year  Winter  Spring  Summer  Fall
   1980   8.5    10.4     7.5    11.8
   1981   9.5    12.2     8.8    13.6
   1982  10.4    13.5     9.7    13.1
   1983   9.5    11.7     8.4    12.9
   1984  10.9    13.7    10.1    15.0

4. (1) Show that the AR(1) model can be written as
   $Y_t = \phi^3 Y_{t-3} + \phi^2 \epsilon_{t-2} + \phi \epsilon_{t-1} + \epsilon_t$.
5. (1) Show that the MA(1) model can be written as
   $Y_t = -\theta Y_{t-1} - \theta^2 Y_{t-2} - \theta^3 Y_{t-3} - \theta^4 \epsilon_{t-4} + \epsilon_t$.
6. (3) Rewrite $B Y_t$ and $3 B^2 Y_t$ in terms of $Y_{t-j}$, $j = 0, 1, 2, \ldots$.
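Two of the relationships in this assignment can be checked numerically. The Python sketch below applies the single exponential smoothing recursion from problem 2 and verifies the AR(1) expansion from problem 4 on simulated data. It rests on assumptions not stated in the assignment: the smoother is initialized with $F_1 = X_1$, the AR(1) model is taken as zero-mean, $Y_t = \phi Y_{t-1} + \epsilon_t$, and $\phi = 0.6$ is an arbitrary illustrative value.

```python
import random

def ses_forecast(series, alpha):
    """Forecast for the period after the last observation.

    Uses F_{t+1} = F_t + alpha * (X_t - F_t), with F_1 = X_1 (an assumption).
    """
    forecast = series[0]
    for x in series:
        forecast = forecast + alpha * (x - forecast)
    return forecast

# Noiseless series from problem 2; the result is the period 11 forecast.
series = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
for alpha in (0.2, 0.5, 1.0):
    print(f"alpha = {alpha}: period 11 forecast = {ses_forecast(series, alpha):.3f}")

# Numerical check of the AR(1) expansion in problem 4.
random.seed(0)
phi = 0.6                                   # hypothetical coefficient
eps = [random.gauss(0.0, 1.0) for _ in range(50)]
y = [eps[0]]
for t in range(1, len(eps)):
    y.append(phi * y[t - 1] + eps[t])       # zero-mean AR(1) recursion

max_gap = max(
    abs(y[t] - (phi**3 * y[t - 3] + phi**2 * eps[t - 2] + phi * eps[t - 1] + eps[t]))
    for t in range(3, len(y))
)
print("largest discrepancy in the AR(1) expansion:", max_gap)
```

A numerical check of this kind confirms that an identity holds for one simulated path; it does not replace the algebraic derivation the problem asks for.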

Assignment 6: Causal modeling with ARIMA models

1. (4) The data set g:/haas t/busadm713/occup.dat contains data on occupancy (number of rooms) of a hotel per week. The first column is number of rooms and the second column is week number. Do the following:
   (a) Regress occupancy onto week number and compute the raw residuals ($y - \hat{y}$).
   (b) Compute the ACF and PACF of these residuals. If there is strong autocorrelation at lags greater than about 10 weeks, recompute the ACF and PACF on the lag-one differenced residuals.
   (c) Use either the raw residual or differenced residual plots to select values for p and q and fit an ARIMA model to these residuals. If the raw residuals are used, specify d = 0. If the differenced residuals are used, specify d = 1.
   (d) Predict room occupancy for the week that follows the last week in the data set by adding the regression model prediction for that week to the ARIMA model's predicted value of the regression residual for that week.
   (e) Create all relevant plots.
2. (2) Use the result $(1 - X)^{-1} = 1 + X + X^2 + X^3 + \ldots$ (BD:40) to derive the equation that is just before the line "The pattern of weights..." on BD:183.
3. (2) Give the transfer function expression to use in the input statement of proc arima for the model
   $(Y_t - \mu_Y) - \delta_1 (Y_{t-1} - \mu_Y) - \delta_2 (Y_{t-2} - \mu_Y) = C((X_{t-2} - \mu_X) - \phi (X_{t-5} - \mu_X)) + \eta_t$
4. (2) What are the rules for using cross-correlations to specify the transfer function?

Assignment 7: Nonconstant Variance in ARIMA Models

1. (6) Find a GARCH model for the IBM price data such that the time series residuals pass the Q and LM tests for ARCH disturbances. See g:/haas t/busadm713/ibmgarch.sas for the data. To do this:
   (a) Fit a model, say model price = t.
   (b) Let r be the time series residuals, sr be the standardized residuals, and v be the conditional variances. Use a data step to compute $sr = r/\sqrt{v}$ after using an output statement to store the r and v variables in a SAS data set.
   (c) Fit a model to sr and request nonconstant error variance tests with archtest. Experiment with different GARCH models (see the TYPE option in proc autoreg's model statement) for the residuals of price until the nonconstant error variance tests on sr's residuals do not reject the hypothesis of constant variance.
   (d) Among the models that successfully model the nonconstant error variance, which one do you select? Why?
2. (4) Write down the mathematical form of every GARCH, EGARCH, IGARCH, and GARCH-M model for nonconstant error variance that is available in proc autoreg. In each case, identify what parameters need to be estimated.

Assignment 8: Multivariate ARIMA Models

1. (10) Use proc varmax to fit a trivariate model to sales, mort, and starts in the build.dat data set (see g:/haas t/busadm713/build.dat). Select a VARMAX model by minimizing the true one-step-ahead sales MAPE, i.e., for each forecast, only data older than the forecast time is used to fit the VARMAX model (see varmpe.sas on the g: drive). Now, fit a univariate ARIMA model to sales, also by minimizing the true one-step-ahead MAPE. Is the multivariate model's one-step-ahead sales MAPE smaller than the univariate model's?

Assignment 9: Combining Forecasts, State Space Models

1. (5) Use the inter-bank loan data in BD to form a composite forecast with regression-derived weights from a naive forecast, a proc autoreg forecast, and a proc reg forecast.
2. (5) Construct a data set of three observations such that the unweighted composite forecast composed of a naive forecast and an SES forecast has a MAPE that is larger than the naive forecast but smaller than the SES forecast. What value of $\alpha$ in the SES forecast did you use to achieve this result?
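The mechanics behind problem 2 (naive forecast, SES forecast, unweighted composite, and MAPE) can be sketched in Python. This is a minimal sketch, not an answer to the problem: the four-point demo series and $\alpha = 0.5$ are hypothetical, the SES recursion is assumed to start from $F_1 = X_1$, and the function names are illustrative.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

def naive_forecasts(series):
    """One-step-ahead naive forecasts for series[1:]: the previous value."""
    return list(series[:-1])

def ses_forecasts(series, alpha):
    """One-step-ahead SES forecasts for series[1:], with F_1 = X_1 (assumption)."""
    level = series[0]
    forecasts = []
    for x in series[:-1]:
        level = level + alpha * (x - level)   # F_{t+1} = F_t + alpha (X_t - F_t)
        forecasts.append(level)
    return forecasts

def unweighted_composite(f1, f2):
    """Equal-weight average of two forecast series."""
    return [(a + b) / 2 for a, b in zip(f1, f2)]

# Hypothetical demo data, not a solution to the problem.
demo = [10, 12, 11, 13]
actual = demo[1:]
naive = naive_forecasts(demo)
ses = ses_forecasts(demo, alpha=0.5)
composite = unweighted_composite(naive, ses)

mape_naive = mape(actual, naive)
mape_ses = mape(actual, ses)
mape_composite = mape(actual, composite)
print("naive MAPE:", round(mape_naive, 3))
print("SES MAPE:", round(mape_ses, 3))
print("composite MAPE:", round(mape_composite, 3))
```

Because the absolute value is convex, the unweighted composite's MAPE can never exceed the average of the two component MAPEs, which is a useful sanity check when searching for a data set that meets the problem's condition.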

Assignment 10: Sensing, Shaping, and Linking Demand to Supply

1. (1) In addition to price, identify one more management-controllable variable that can be used to predict consumer demand for the Harley-Davidson Dyna Switchback motorcycle.
2. (1) Other than demand, identify two management-controllable variables that could be used to predict the supply of these motorcycles. Also identify a plausible fixed value for the cost to build one such motorcycle.
3. (2) Simulate 24 months of observations on demand and supply under different settings of these five predictor variables. Use a plausible value for $l_d$ (consumer demand forward lag). Justify this value. When simulating, make the predictor variables affect demand and supply in appropriate directions and magnitudes.
4. (2) Fit the two-tiered MTCA model by first fitting the demand model and then computing $l_d$-step-ahead forecasts at each month for use as input to the supply model. Now, fit the supply model.
5. (2) Perform a what-if analysis on the fitted demand model and prepare a short paragraph of advice to management based on this analysis. This is a shaping activity.
6. (2) By trying a few different strategies, find a price-volume combination that is projected to maximize profit over the next six months.

Assignment 11: Maximizing Profit

Consider the problem of how to maximize an automotive repair shop's profit described in worksheet 9.

1. (10) Find optimal values on the controllable variables via stochastic programming. Consider only one time period into the future. Express demand forecasts as three value-probability pairs as discussed in optlec.pdf. Note that because specials are integer-valued, this is a mixed integer linear programming (MILP) problem. See hw11hint.sas.