What Practitioners Need to Know... About Regressions


by Mark Kritzman

How can we predict uncertain outcomes? We could study the relations between the uncertain variable to be predicted and some known variable. Suppose, for example, that we had to predict the change in profits for the airline industry. We might expect to find a relation between GNP growth in the current period and airline profits in the subsequent period, because economic growth usually foreshadows business travel as well as personal travel. We can quantify this relation through a technique known as regression analysis.

Regression analysis can be traced to Sir Francis Galton (1822-1911), an English scientist and anthropologist who was interested in determining whether or not a son's height corresponded to his father's height. To answer this question, Galton measured a sample of fathers and computed their average height. He then measured their sons and computed their average height. He found that fathers of above-average height had sons whose heights tended to exceed the average. Galton termed this phenomenon "regression toward the mean."

Simple Linear Regression

To measure the relation between a single independent variable (GNP growth, in our earlier example) and a dependent variable (subsequent change in airline profits), we can begin by gathering some data on each variable: for example, actual GNP growth in each quarter of a given sample period and the change in the airline industry's profit over each subsequent quarter. We can then plot the intersects of these observations. The result is a scatter diagram such as the one shown in Figure A.

The horizontal axis represents a quarter's GNP growth and the vertical axis represents the percentage change in profits for the airline industry in the subsequent quarter. The plotted points in the figure indicate the actual percentage change in airline profits associated with a given level of GNP growth.
They suggest a positive relation; that is, as GNP increases so do airline profits. The straight line sloping upward from left to right measures this relation. This straight line is called the regression line. It has been fitted to the data in such a way that the sum of the squared differences of the observed airline profits from the values along the line is minimized. The values along the regression line corresponding to the vertical axis represent the predicted change in airline profits given the corresponding prior quarter's GNP growth along the horizontal axis. The difference between the actual change in airline profits and the value predicted by the regression line is the error, or the residual.

Given a particular value for GNP growth, we can predict airline profits in the subsequent quarter by multiplying the GNP growth value by the slope of the regression line and adding to this value the intercept of the line with the vertical axis. The equation is:

$$\hat{Y}_i = \alpha + \beta X_i$$

Here $\hat{Y}_i$ equals the predicted percentage change in airline profits, $\alpha$ equals the intercept of the regression line with the vertical axis, $\beta$ equals the slope of the regression line and $X_i$ equals the prior quarter's growth in GNP.

We can write the equation for the actual percentage change in airline profits, given our observation of the prior quarter's GNP growth, by adding the error to the prediction equation:

$$Y_i = \alpha + \beta X_i + e_i$$

Here $Y_i$ equals the actual percentage change in airline profits and $e_i$ equals the error associated with the predicted value. Positive errors indicate that the regression equation underestimated the dependent variable (airline profits) for a particular value of the independent variable (GNP growth), while negative errors indicate that the regression equation overestimated the dependent variable. Figure B illustrates these notions.
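The fitting and prediction steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are hypothetical, and the names `fit_line`, `gnp_growth` and `profit_change` are invented here and do not come from the article.

```python
from statistics import mean


def fit_line(x, y):
    """Return (alpha, beta) of the line that minimizes the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx                  # slope of the regression line
    alpha = y_bar - beta * x_bar      # intercept with the vertical axis
    return alpha, beta


# Hypothetical observations (per cent): prior-quarter GNP growth and the
# subsequent change in airline profits. These numbers are made up.
gnp_growth = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
profit_change = [1.0, 2.5, 2.0, 4.0, 4.5, 6.0]

alpha, beta = fit_line(gnp_growth, profit_change)

# Predicted value: slope times GNP growth, plus the intercept.
predicted = [alpha + beta * xi for xi in gnp_growth]

# Residual: actual minus predicted, so a positive residual means the
# regression equation underestimated the dependent variable.
residuals = [yi - pi for yi, pi in zip(profit_change, predicted)]

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print("residuals:", [round(r, 4) for r in residuals])
```

A useful sanity check on any least-squares fit with an intercept is that the residuals sum to zero, up to rounding.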
[Figure A: Scatter Diagram. Horizontal axis: GNP Growth (t-1).]

[Figure B: Regression Model. Horizontal axis: GNP Growth (t-1).]

FINANCIAL ANALYSTS JOURNAL / MAY-JUNE 1991

Analysis of Variance

To determine whether or not our regression equation is a good predictor of the dependent variable, we can start by performing an analysis of variance. This involves dividing the variation in the dependent variable (change in airline profits) into two parts: that explained by variation in the independent variable (prior quarter's GNP growth) and that attributable to error.

In order to proceed, we must first calculate three values: the total sum of the squares, the sum of the squares due to regression and the sum of the squares due to error. The total sum of the squares is calculated as the sum of the squared differences between the observed values for the dependent variable and the average of those observations. The sum of the squares due to regression is calculated as the sum of the squared differences between the predicted values for the dependent variable and the average of the observed values for the dependent variable. Finally, the sum of the squares due to error is calculated as the sum of the squared differences between the observed values for the dependent variable and the predicted values for the dependent variable.

The ratio of the sum of the squares due to regression to the total sum of the squares equals the fraction of variation in the dependent variable that can be explained by variation in the independent variable. It is referred to as R-squared ($R^2$), or the coefficient of determination. It ranges in value from 0 to 1. A high value for R-squared indicates a strong relation between the dependent and independent variables, whereas a low value for R-squared indicates a weak relation.[1]

The square root of R-squared is called the correlation coefficient. It measures the strength of the association between the dependent and independent variables. In the case of an inverse relation (that is, where the slope of the regression line is negative), we must adjust the sign of the correlation coefficient to accord with the slope of the regression line. The correlation coefficient ranges in value from -1 to +1.
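The three sums of squares, R-squared and the sign-adjusted correlation coefficient can be computed directly from their definitions. The sketch below is illustrative only: the function name `analysis_of_variance` and the data are hypothetical, chosen to make the decomposition easy to check.

```python
from math import sqrt
from statistics import mean


def analysis_of_variance(x, y):
    """Return (sst, ssr, sse, r_squared, correlation) for a simple linear regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    predicted = [alpha + beta * xi for xi in x]

    # Total: observed values versus their average.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    # Due to regression: predicted values versus the average of the observed values.
    ssr = sum((pi - y_bar) ** 2 for pi in predicted)
    # Due to error: observed values versus predicted values.
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

    r_squared = ssr / sst
    # The square root is non-negative, so give it the sign of the slope.
    correlation = sqrt(r_squared) if beta >= 0 else -sqrt(r_squared)
    return sst, ssr, sse, r_squared, correlation


# Hypothetical data (per cent); not taken from the article.
gnp_growth = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
profit_change = [1.0, 2.5, 2.0, 4.0, 4.5, 6.0]

sst, ssr, sse, r_squared, correlation = analysis_of_variance(gnp_growth, profit_change)
print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"R-squared = {r_squared:.4f}, correlation = {correlation:.4f}")
```

For a least-squares line with an intercept, the total sum of the squares equals the sum due to regression plus the sum due to error, which is why the ratio lies between 0 and 1.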
Residual Analysis

R-squared is only a first approximation of the validity of the relation between the dependent and independent variables. Its validity rests on several assumptions: (1) the independent variable (GNP growth in the example) must be measured without error; (2) the relation between the dependent and independent variables must be linear (as indicated by the regression line); (3) the errors, or residuals, must have constant variance (that is, they must not increase or decrease with the level of the independent variable); (4) the residuals must be independent of each other; and (5) the residuals must be normally distributed. Unless these assumptions are true, the measured relation between the dependent and independent variables, even if it has a high R-squared, may be spurious.

The importance of the first assumption is self-evident and should not require elaboration. The importance of some of the remaining assumptions may require some elaboration. In order to analyze the residuals, it is convenient to standardize each residual by dividing it by the standard error.[2] We can then plot the residuals to determine whether or not the above assumptions are satisfied.

Figure C shows a plot of standardized residuals. These seem to trace a convex curve. The errors associated with low values of the independent variable are positive; but they become increasingly negative with higher levels of the independent variable and then become positive again as the independent variable increases still more. In this case, it is apparent that the relation between the dependent and independent variables violates the assumption of linearity. The dependent variable increases with the independent variable but at a decreasing rate. That is to say, the independent variable has less and less effect on the dependent variable. This pattern is characteristic of the relation between the level of advertising expenditures and sales, for example.
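The standardization step described above can be sketched as follows. The sketch assumes the article's own definition of the standard error (footnote 2), which divides the sum of squared residuals by the number of observations less one; many textbooks divide by the number of observations less two for a simple regression, so the divisor is exposed as a parameter. The function name, the parameter `dof_lost` and all data values are hypothetical.

```python
from math import sqrt


def standardized_residuals(observed, predicted, dof_lost=1):
    """Divide each residual by the standard error of the regression.

    dof_lost=1 follows the article's footnote (observations less one);
    pass dof_lost=2 for the common textbook convention in simple regression.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)
    std_error = sqrt(sse / (len(residuals) - dof_lost))
    return [r / std_error for r in residuals]


# Hypothetical observed values and hypothetical fitted values from some regression.
observed = [1.0, 2.5, 2.0, 4.0, 4.5, 6.0]
predicted = [0.98, 1.92, 2.86, 3.80, 4.75, 5.69]

z = standardized_residuals(observed, predicted)
for level, value in enumerate(z, start=1):
    print(f"observation {level}: standardized residual = {value:+.3f}")
```

Plotting these values against the independent variable, in whatever plotting tool is at hand, is what produces diagrams of the kind shown in Figures C, D and E.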
Suppose a company distributes a product in several regions, and it varies the level of advertising expenditures across these regions to measure advertising's effect. The company will likely observe higher sales in a region where it advertises a little than in a region where it does not advertise at all. And as advertising increases from region to region, corresponding sales should also increase. At some level of sales, however, a region will start to become saturated with the product; additional advertising expenditures will have less and less impact on sales.

[Figure C: Non-Linearity]

(Footnotes appear at the end of the article.)

The obvious problem with using a linear model when the independent variable has a diminishing effect on the dependent variable is that it will overestimate the dependent variable at high levels of the independent variable. In many instances, we can correct this problem by transforming the values of the independent variable into their reciprocals and then performing a linear regression of the dependent variable on these reciprocals.

Figure D illustrates a case in which the absolute values of the standardized residuals increase as the values for the independent variable increase. In this case, the errors involved in predicting the dependent variable will grow larger and larger, the higher the value of the independent variable. Our predictions are subject to larger and larger errors. This problem is known as heteroscedasticity. It can often be ameliorated by transforming the independent variables into their logarithmic values.

[Figure D: Heteroscedasticity]

Figure E shows a plot in which all the standardized residuals are positive with the exception of a single very large negative residual. This large negative residual is called an outlier, and it usually indicates a specious observation or an event that is not likely to recur. If we had included GNP growth in the last quarter of 1990 as one of the observations used to predict airline profitability, for example, we would have grossly overestimated airline profits in the first quarter of 1991; both business and personal air travel dropped precipitously in early 1991 because of the threat of terrorism stemming from the Gulf War. In this case, we would simply eliminate the outlying observation and rerun the regression with the remaining data.

In all these examples, the residuals are in violation of the independence assumption. That is, the plotted points in Figures C, D and E form patterns, rather than random distributions. This suggests that the residuals are not independent of one another but are correlated with one another, or autocorrelated.

Without examining the residuals explicitly, we can test for first-order autocorrelation (correlation between successive residuals) by calculating a Durbin-Watson statistic. The Durbin-Watson statistic is approximately equal to $2(1 - R)$, where $R$ equals the correlation coefficient measuring the association between successive residuals. As the Durbin-Watson statistic approaches 2, we should become more confident that the residuals are independent of each other (at least successively). Depending on the number of variables and number of observations, we can determine our level of confidence specifically. With economic and financial data, it is often useful to transform the data into percentage changes, or first differences. This reduces autocorrelation.

Multiple Linear Regression

We have so far focused on simple linear regressions, that is, regressions between a dependent variable and a single independent variable. In many instances, variation in a dependent variable can be explained by variation in several independent variables. Returning to our example of airline profits, we may wish to include changes in energy prices as a second independent variable, given the relatively high operating leverage associated with the airline industry. We can express this multiple regression equation as follows:

$$Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2}$$

Here $X_{i1}$ and $X_{i2}$ equal the two independent variables (GNP growth and changes in energy prices) and $\beta_1$ and $\beta_2$ equal their coefficients. It seems reasonable to expect that as fuel prices rise, profit margins in the airline industry will fall and vice versa. This would mean a negative relation between airline profits and energy prices. Thus $\beta_2$ would be a negative value.
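A minimal sketch of the two-variable equation follows, assuming hypothetical data built around coefficients of 0.7 and -0.15 with small made-up errors. Because the residual checks described earlier apply equally to a multiple regression, the sketch also computes a Durbin-Watson statistic on the fitted residuals. All function and variable names are invented for illustration, and the fit solves the centered normal equations by hand rather than calling a statistics package.

```python
from statistics import mean


def fit_two_variables(x1, x2, y):
    """Least-squares fit of y = alpha + beta1*x1 + beta2*x2 (centered normal equations)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12   # zero when the two regressors are perfectly collinear
    beta1 = (s22 * s1y - s12 * s2y) / det
    beta2 = (s11 * s2y - s12 * s1y) / det
    alpha = my - beta1 * m1 - beta2 * m2
    return alpha, beta1, beta2


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical data (per cent). The errors are small and sum to zero.
gnp_growth = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.0, 2.0]
energy_change = [2.0, -1.0, 0.0, 3.0, 1.0, -2.0, 4.0, -3.0]
errors = [0.05, -0.04, 0.03, -0.05, 0.04, -0.03, 0.02, -0.02]
profit_change = [1.0 + 0.7 * g - 0.15 * e + u
                 for g, e, u in zip(gnp_growth, energy_change, errors)]

alpha, beta1, beta2 = fit_two_variables(gnp_growth, energy_change, profit_change)
residuals = [y - (alpha + beta1 * g + beta2 * e)
             for y, g, e in zip(profit_change, gnp_growth, energy_change)]

dw = durbin_watson(residuals)
# Lag-one autocorrelation of the residuals (the R in the 2(1 - R) approximation).
sse = sum(e * e for e in residuals)
lag1 = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals))) / sse

print(f"alpha = {alpha:.4f}, beta1 = {beta1:.4f}, beta2 = {beta2:.4f}")
print(f"Durbin-Watson = {dw:.4f}, 2(1 - R) = {2 * (1 - lag1):.4f}")
```

The two printed diagnostics differ slightly because the $2(1 - R)$ form ignores the first and last squared residuals; with a longer series the gap shrinks. With only eight observations the statistic is shown for the mechanics, not as a meaningful test.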
But an increase in economic activity could increase demand for energy and contribute to a rise in energy prices. Thus the two independent variables, GNP growth and changes in energy prices, may not be independent of each other. This problem is known as multicollinearity.

[Figure E: Outlier]

Suppose we run two simple linear regressions using two independent variables. If the variables are independent of each other, then the sum of the R-squareds from the two regressions will equal the R-squared from a multiple linear regression combining the two variables. To the extent that the independent variables are correlated with each other, however, the R-squared from the multiple regression will be less than the sum from the two simple regressions.

When the independent variables in a multiple regression are collinear, we must take care in interpreting their coefficients. The coefficients $\beta_1$ and $\beta_2$ in the above equation represent the marginal sensitivity of a change in airline profits to a one-unit change in GNP growth and to a one-unit change in energy prices in the prior quarter. If $\beta_1$ equals 0.7 per cent and $\beta_2$ equals -0.15 per cent, for example, we would expect airline profits to increase by 0.7 per cent if GNP grew 1 per cent in the prior quarter and energy prices remained constant. If energy prices increased by 1 per cent in the prior quarter and GNP remained constant, we would expect airline profits to decrease by 0.15 per cent. To the extent there is multicollinearity between the independent variables, these responses would not equal the sensitivity of airline profits to the same independent variables as measured by simple linear regressions.

Regression analysis is a powerful tool for the financial analyst. But, as we have attempted to demonstrate, the summary statistics from regression analysis can be misleading.

Footnotes

1. As part of their output, most regression packages include measures of statistical significance such as an F-value and a t-statistic. The F-value is computed as the ratio of the sum of the squares due to regression (adjusted by the degrees of freedom) to the sum of the squares due to error (also adjusted by the degrees of freedom). Its significance depends on the number of variables and observations. The t-statistic measures the significance of the coefficients of the independent variables.
It is computed as the ratio of the coefficient to the standard error of the coefficient. The F-test and the t-test are the same for simple linear regressions, but not necessarily for multiple linear regressions.

2. The standard error measures the dispersion of the residuals around the regression line. It is calculated as the square root of the average squared differences of the observed values from the values predicted by the regression line. To estimate the average of the squared differences, we divide the sum of the squared differences by the number of observations less one.

From the Board (concluded from page 8)

toward shared risk-reward pension models are more persuasive explanations for the persistence of 60/40 pension fund investment policies.

Congress and the FASB

Bodie is on the mark when he points to the central roles Congress and the Financial Accounting Standards Board (FASB) have played in promoting fuzzy thinking about pension finance and investments. The ABO now plays a central role in Congress' funding rules and in the FASB's financial disclosure rules. No doubt these developments have given this liability measure a great deal more respectability than it deserves.

Though he didn't, Bodie could also have fingered the enshrinement of pension assets in ERISA as trust assets through the "exclusive benefit rule." Both this rule and the ABO concept pull pension thinking away from legitimate pension models (e.g., the integrated finance, pure defined benefit model or some form of a fair shared risk-reward model). Instead, they push pension fiduciaries to contemplate perverse, unstable pension models such as the one described by Bodie (i.e., unshared risk-shared reward arrangements, which lead to stakeholder gaming).

Three Important Lessons

In the end, there are three important lessons in this tale of the two investment policies and their origins. These lessons apply as much to public-sector pension funds as they do to corporate funds.

1. There can be no clearly defined investment policies without clearly defined pension deals.
2. Perverse legal and accounting pension rules promote perverse pension deals.
3. Perverse pension deals in turn promote perverse investment policies.