Operations Research Models and Methods
Paul A. Jensen and Jonathan F. Bard

Time Series and Forecasting

S1. Time Series Models

An example of a time series for 25 periods is plotted in Fig. 1 from the numerical data in Table 1 below. The data might represent the weekly demand for some product. We use x to indicate an observation and t to represent the index of the time period. For the case of weekly demand the time period is measured in weeks. The observed demand for time t is specifically designated x_t. The lines connecting the observations in the figure are provided only to clarify the picture and otherwise have no meaning.

Table 1. A time series of weekly demand

  Time     Observations
  1-10     4  16  12  25  13  12   4   8   9  14
  11-20    3  14  14  20   7   9   6  11   3  11
  21-25    8   7   2   8   8

[Figure 1. A time series of weekly demand: the observations x_t plotted against the time period t, for t = 1 through 25.]
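
To make the numerical checks in this section easy to reproduce, the sketch below loads the Table 1 data into Python. This is an editorial convenience, not part of the original text; the variable names are invented here, and the values for periods 21 through 25 are an assumed transcription of the table, included only for completeness since the statistician in the example cannot see them.

```python
# Demand observations x_1 .. x_25 from Table 1 (assumed transcription).
demand = [4, 16, 12, 25, 13, 12, 4, 8, 9, 14,   # t = 1..10
          3, 14, 14, 20, 7, 9, 6, 11, 3, 11,    # t = 11..20
          8, 7, 2, 8, 8]                        # t = 21..25

def x(t):
    """Observation for time period t (1-indexed, matching the text)."""
    return demand[t - 1]

print(len(demand), x(1), x(20))   # 25 observations; x_1 = 4, x_20 = 11
```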

Mathematical Model

Our goal is to determine a model that explains the observed data and allows extrapolation into the future to provide a forecast. The simplest model suggests that the time series is a constant b with variations about the constant value determined by a random variable ε_t:

X_t = b + ε_t.    (1)

The upper case X_t represents the random variable that is the unknown demand at time t, while the lower case x_t is a value that has actually been observed. The random variation ε_t about the mean value is called the noise. The noise is assumed to have a mean value of zero and a given variance, and the variations in two different time periods are assumed to be independent. Specifically,

E(ε_t) = 0,  Var(ε_t) = σ²,  E(ε_t ε_w) = 0 for t ≠ w.

A more complex model includes a linear trend for the data:

X_t = b_0 + b_1 t + ε_t.    (2)

Of course, (1) and (2) are special cases of a polynomial model,

X_t = b_0 + b_1 t + b_2 t² + … + b_n t^n + ε_t.

A model for seasonal variation might include transcendental functions. The cycle of the model below is 4; the model might be used to represent data for the four seasons of the year:

X_t = b_0 + b_1 sin(2πt/4) + b_2 cos(2πt/4) + ε_t.

In every model considered here, the time series is a function only of time and the parameters of the model. We can write

X_t = f(b_0, b_1, b_2, …, b_n, t) + ε_t.

Since for any given time the value of f is a constant and the expected value of ε_t is zero,

E(X_t) = f(b_0, b_1, b_2, …, b_n, t)  and  Var(X_t) = Var(ε_t) = σ².

The model supposes that there are two components of variability for the time series: the mean value varies with time, and the difference from the mean varies randomly. Time is the only factor affecting the mean value, while all other factors are described by the noise component. Of course, these assumptions may not in fact be true, but this chapter is devoted to cases that can be abstracted to this simple form with reasonable accuracy.

One of the problems of time series analysis is to find the best form of the model for a particular situation. In this introductory discussion we are primarily concerned with the simple constant and trend models. We leave the problem of choosing the best model to a more advanced text.
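
The noise assumptions are easy to see in a simulation. The following is a minimal sketch, not taken from the original text; the parameter values (b = 10, b_0 = 5, b_1 = 0.5, σ = 5) are illustrative assumptions, with σ = 5 borrowed from the footnote near the end of this section.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def constant_model(b=10.0, sigma=5.0, periods=25):
    """Model (1): X_t = b + eps_t with Normal(0, sigma) noise."""
    return [b + random.gauss(0.0, sigma) for _ in range(periods)]

def trend_model(b0=5.0, b1=0.5, sigma=5.0, periods=25):
    """Model (2): X_t = b0 + b1*t + eps_t."""
    return [b0 + b1 * t + random.gauss(0.0, sigma)
            for t in range(1, periods + 1)]

print([round(v, 1) for v in constant_model()[:5]])
print([round(v, 1) for v in trend_model()[:5]])
```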

In the following paragraphs we describe methods for fitting the model, forecasting from the model, measuring the accuracy of the forecast, and forecasting ranges. We illustrate the discussion of this section with the moving average forecasting method. Several other methods are described later in the chapter.

Fitting Parameters of the Model

Once a model is selected and data is available, it is the job of the statistician to estimate its parameters, that is, to find parameter values that best fit the historical data. We can only hope that the resulting model will provide good predictions of future observations.

Statisticians usually assume all values in a given sample are equally valid. For time series, however, most methods recognize that data from recent times are more representative of current conditions than data from times well in the past. Influences governing the data probably change with time, so a method should have the capability of neglecting old data while favoring the new. A model estimate should be able to change over time to reflect changing conditions.

In the following, the time series model includes one or more parameters. We identify the estimated values of these parameters with hats on the parameter notation, for instance b̂_1, b̂_2, …, b̂_n. The procedures also provide an estimate of the standard deviation of the noise σ; again, the estimate is indicated with a hat, σ̂. We will see that there are several approaches available for estimating σ.

To illustrate these concepts, consider the data in Table 1. Say that the statistician has just observed the demand in period 20. She also has available the demands for periods 1 through 19. She cannot know the future, so the information shown for periods 21 through 25 is not available. The statistician thinks that the factors that influence demand are changing very slowly, if at all, and proposes the simple constant model for the demand,

X_t = b + ε_t.    (1)

With the assumed model, the values of demand are random variables drawn from a population with mean value b. The best estimator of b is the average of the observed data. Using all 20 points, the estimate is

b̂ = (1/20) Σ_{t=1}^{20} x_t = 10.75.

This is the best estimate based on all 20 data points; however, we note that x_1 is given the same weight as x_20 in the computation.
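
A one-line check of this estimate, reusing the Table 1 data (an editorial sketch, not part of the original text):

```python
demand_20 = [4, 16, 12, 25, 13, 12, 4, 8, 9, 14,
             3, 14, 14, 20, 7, 9, 6, 11, 3, 11]   # x_1 .. x_20 from Table 1

b_hat = sum(demand_20) / len(demand_20)   # equal weight on every observation
print(b_hat)                              # 10.75
```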

If we think that the model is actually changing over time, perhaps it is better to use a method that gives less weight to old data and more to the new. One possibility is to include only later data in the estimate. Using the last ten observations and the last five, we obtain

b̂ = (1/10) Σ_{t=11}^{20} x_t = 9.8  and  b̂ = (1/5) Σ_{t=16}^{20} x_t = 8.0.

The latter two estimates are called moving averages. Which is the better estimate for the application? We really can't tell at this point. The estimator that uses all data points will certainly be the best if the time series follows the assumed model; however, if the model is only approximate and the situation is actually changing, perhaps the estimator with only five data points is better.

In general, the moving average estimator is the average of the last m observations,

b̂ = (1/m) Σ_{i=k}^{t} x_i,  where k = t − m + 1.

The quantity m is the time range and is the parameter of the method.
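
A minimal sketch of the moving-average estimator just defined, applied at t = 20 (again reusing the Table 1 data; the function and variable names are invented for illustration):

```python
demand_20 = [4, 16, 12, 25, 13, 12, 4, 8, 9, 14,
             3, 14, 14, 20, 7, 9, 6, 11, 3, 11]   # x_1 .. x_20 from Table 1

def moving_average(x, t, m):
    """b_hat = (1/m) * sum of x_i for i = t-m+1 .. t (1-indexed periods)."""
    return sum(x[t - m : t]) / m

print(moving_average(demand_20, 20, 20))   # 10.75 (all twenty points)
print(moving_average(demand_20, 20, 10))   # 9.8   (last ten)
print(moving_average(demand_20, 20, 5))    # 8.0   (last five)
```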

Forecasting from the Model

The purpose of modeling a time series is usually to make forecasts of the future. The forecasts are used directly for making decisions, such as ordering replenishments for an inventory or staffing workers for production. They might also be used as part of a mathematical model for a more complex decision analysis.

The current time is T, and the data for the actual demands for times 1 through T are known. Say we are attempting to forecast the demand at time T + τ. The unknown demand is the random variable X_{T+τ}, and its ultimate realization is x_{T+τ}. Our forecast of the realization is x̂_{T+τ}. Of course, the best that we can hope to do is estimate the mean value of X_{T+τ}. Even if the time series actually follows the assumed model, the future value of the noise is unknowable. Assuming the model is correct,

X_{T+τ} = E(X_{T+τ}) + ε_{T+τ},  where  E(X_{T+τ}) = f(b_0, b_1, b_2, …, b_n, T+τ).

When we estimate the parameters from the data for times 1 through T, we obtain an estimate of the expected value of the random variable as a function of τ. This is our forecast,

x̂_{T+τ} = f(b̂_0, b̂_1, b̂_2, …, b̂_n, T+τ).

Using a specific value of τ in this formula provides the forecast for time T+τ. When we look at the last T observations as only one of the possible time series that could have been obtained from the model, the forecast is a random variable. We should be able to describe its probability distribution, including its mean and variance.

For the moving average example, the statistician adopts the model X_t = b + ε_t. Assuming T is 20 and using the moving average with ten periods, the estimated parameter is b̂ = 9.8. Since this model has a constant expected value over time, the forecast is the same for all future periods:

x̂_{T+τ} = b̂ = 9.8 for τ = 1, 2, ….

Assuming the model is correct, the forecast is the average of m observations, all with the same mean b and standard deviation σ. Since the noise is Normally distributed, the forecast is also Normally distributed with mean b and standard deviation σ/√m.
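
In code, the constant-model forecast and its sampling standard deviation look like the sketch below; σ = 5 is taken from the simulation footnote at the end of this section and would be unknown in practice.

```python
import math

demand_20 = [4, 16, 12, 25, 13, 12, 4, 8, 9, 14,
             3, 14, 14, 20, 7, 9, 6, 11, 3, 11]   # x_1 .. x_20

m = 10
b_hat = sum(demand_20[-m:]) / m      # moving-average estimate at T = 20

for tau in (1, 2, 3):                # same forecast for every lead time
    print(f"forecast for T + {tau}: {b_hat}")   # 9.8

sigma = 5.0                          # known only because the data were simulated
print(sigma / math.sqrt(m))          # std. dev. of the forecast, about 1.58
```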

Measuring the Accuracy of the Forecast

The error in a forecast is the difference between the realization and the forecast,

e_τ = x_{T+τ} − x̂_{T+τ}.

Assuming the model is correct,

e_τ = E(X_{T+τ}) + ε_{T+τ} − x̂_{T+τ}.

We investigate the probability distribution of the error by computing its mean and variance. One desirable characteristic of the forecast x̂_{T+τ} is that it be unbiased. For an unbiased estimate, the expected value of the forecast is the same as the expected value of the time series. Since ε_t is assumed to have a mean of zero, for an unbiased forecast

E(e_τ) = 0.

Because the noise at any given time is independent of the noise at any other time, the variance of the error is

Var(e_τ) = Var[E(X_{T+τ}) − x̂_{T+τ}] + Var(ε_{T+τ}),

σ_e²(τ) = σ_E²(τ) + σ².

The variance of the error has two parts: that due to the variance in the estimate of the mean, σ_E²(τ), and that due to the variance of the noise, σ². Due to the inherent inaccuracy of the statistical methods used to estimate the model parameters, and the possibility that the model is not exactly correct, the variance in the estimate of the mean is an increasing function of τ.

For the example of the moving average,

σ_e²(τ) = σ²/m + σ² = σ²[1 + (1/m)].

The error variance is a function of m and decreases as m increases. Obviously the smallest error comes when m is as large as possible, if the model is correct. Unfortunately, we cannot be sure that the model is correct, so we set m to smaller values to reduce the error due to errors in the model.

Using the same forecasting method over a number of periods allows the analyst to compute measures of quality for the forecast for given values of τ. The forecast error, e_t, is the difference between the observed value and the forecast. For time t,

e_t = x_t − x̂_t.

Table 2 shows a series of forecasts for periods 11 through 20 using the data from Table 1. The forecasts are obtained with a moving average using m equal to 10 and τ equal to 1. We make a forecast at time t for period t + 1 with the calculation

x̂_{t+1} = (1/10) Σ_{k=t−9}^{t} x_k.

Although in practice one might round the result to an integer, we keep fractions here to observe better statistical properties.
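
The following sketch reproduces the forecast and error rows of Table 2 below from the Table 1 data (an editorial illustration; the names are invented):

```python
demand_20 = [4, 16, 12, 25, 13, 12, 4, 8, 9, 14,
             3, 14, 14, 20, 7, 9, 6, 11, 3, 11]   # x_1 .. x_20

m = 10
for t in range(11, 21):                    # periods 11 .. 20
    window = demand_20[t - 1 - m : t - 1]  # the m observations before period t
    forecast = sum(window) / m             # x_hat_t, made at time t - 1
    error = demand_20[t - 1] - forecast    # e_t = x_t - x_hat_t
    print(t, round(forecast, 1), round(error, 1))
```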

Table 2. Forecast Error for a Moving Average

  Time          11    12    13    14    15    16    17    18    19    20
  Observation    3    14    14    20     7     9     6    11     3    11
  Forecast     11.7  11.6  11.4  11.6  11.1  10.5  10.2  10.4  10.7  10.1
  Error        -8.7   2.4   2.6   8.4  -4.1  -1.5  -4.2   0.6  -7.7   0.9

One common measure of forecasting error is the mean absolute deviation, MAD,

MAD = (Σ_{i=1}^{n} |e_i|) / n,

where n error observations are used to compute the mean. The sample standard deviation of the error is also a useful measure,

s_e = √[ Σ_{i=1}^{n} (e_i − ē)² / (n − p) ] = √[ (Σ_{i=1}^{n} e_i² − n ē²) / (n − p) ],

where ē is the average error and p is the number of parameters estimated for the model. As n grows, the MAD provides a reasonable estimate of the sample standard deviation,

s_e ≈ 1.25 MAD.

From the example data we compute the MAD for the ten observations,

MAD = (8.7 + 2.4 + … + 0.9)/10 = 4.11.

To compute the sample error standard deviation,

ē = (−8.7 + 2.4 − … + 0.9)/10 = −1.13,

s_e² = [(−8.7)² + (2.4)² + … + (0.9)² − 10(−1.13)²] / 9 = 27.02,

s_e = 5.198.

We see that 1.25(MAD) = 5.138 is approximately equal to the sample standard deviation. Since it is easier to compute the MAD, this measure is used in our examples.¹

¹ The time series used as an example is simulated with a constant mean. Deviations from the mean are Normally distributed with mean zero and standard deviation 5. One would expect an error standard deviation of 5√(1 + 1/10) = 5.244. The observed statistics are not far from this value. Of course, a different realization of the simulation will yield different statistical values.
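
These statistics are easy to verify in a few lines (a sketch; the error list is the Error row of Table 2):

```python
import math

errors = [-8.7, 2.4, 2.6, 8.4, -4.1, -1.5, -4.2, 0.6, -7.7, 0.9]
n, p = len(errors), 1            # ten errors; p = 1 parameter (the constant b)

mad = sum(abs(e) for e in errors) / n
e_bar = sum(errors) / n
s_e = math.sqrt((sum(e * e for e in errors) - n * e_bar ** 2) / (n - p))

print(round(mad, 2))             # 4.11
print(round(e_bar, 2))           # -1.13
print(round(s_e, 3))             # 5.198
print(round(1.25 * mad, 2))      # 5.14, close to s_e
```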

The value of s_e² for a given value of τ is an estimate of the error variance σ_e²(τ). It includes the combined effects of errors in the model and the noise. If one assumes that the random noise comes from a Normal distribution, an interval estimate of the forecast can be computed using the Student's t distribution,

x̂_{T+τ} ± t_{α/2} s_e(τ),

where t_{α/2} is found in a Student's t distribution table with n − p degrees of freedom.
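
As a final illustration, here is a sketch of this interval for the example (95% level; scipy is used only to look up the t quantile in place of a printed table, and the inputs follow from the calculations above):

```python
from scipy.stats import t   # assumes scipy is installed

x_hat = 9.8                 # constant-model forecast (m = 10, T = 20)
s_e = 5.198                 # one-step error standard deviation from Table 2
n, p = 10, 1
alpha = 0.05                # 95% interval

t_crit = t.ppf(1 - alpha / 2, df=n - p)     # about 2.262 with 9 dof
print(f"{x_hat} +/- {t_crit * s_e:.1f}")    # 9.8 +/- 11.8
```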