MgtOp 215 Chapter 13 Dr. Ahn


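Consider two random variables $X$ and $Y$ with $E(X)=\mu_X$, $\mathrm{Var}(X)=\sigma_X^2$, $E(Y)=\mu_Y$, $\mathrm{Var}(Y)=\sigma_Y^2$. In order to study the relationship between the two random variables, we need a numerical measure that describes the relationship. The covariance between two random variables which are observed in pairs is such a measure, and is defined as follows:

$$\sigma_{XY} = \mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)].$$

Note that $\sigma_{XY}>0$ implies, on average, $(X-\mu_X)(Y-\mu_Y)>0$, that is, values of X larger (smaller) than its mean tend to be associated with values of Y larger (smaller) than its mean. Also $\sigma_{XY}<0$ implies, on average, $(X-\mu_X)(Y-\mu_Y)<0$, that is, values of X larger (smaller) than its mean tend to be associated with values of Y smaller (larger) than its mean. Therefore, a positive covariance implies a positive relation between two random variables, and a negative covariance a negative relation.

The sample covariance is

$$s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) = \frac{1}{n-1}\left(\sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}\right)$$

when a random sample consists of n pairs of observations $(X_i, Y_i)$, for i = 1, 2, ..., n.

Example 1. Find the sample covariance between the square footage in thousands (X) and the annual sales in millions of dollars (Y).

Obs      X      Y       XY
1       1.7    3.7     6.29
2       1.6    3.9     6.24
3       2.8    6.7    18.76
4       5.6    9.5    53.20
5       1.3    3.4     4.42
6       2.2    5.6    12.32
7       1.3    3.7     4.81
8       1.1    2.7     2.97
9       3.2    5.5    17.60
10      1.5    2.9     4.35
11      5.2   10.7    55.64
12      4.6    7.6    34.96
13      5.8   11.8    68.44
14      3.0    4.1    12.30
Total  40.9   81.8   302.30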
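The calculation asked for in Example 1 can be sketched in a few lines of Python. This is only an illustration: the lists `x` and `y` are the Example 1 observations as reconstructed from the table (an assumption where the extracted text is damaged), and the function name `sample_cov` is hypothetical.

```python
# Example 1 data (assumed reconstruction): square footage in thousands (x)
# and annual sales in millions of dollars (y) for 14 stores.
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]


def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator (definition form)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


n = len(x)
s_xy = sample_cov(x, y)

# Shortcut form built from the column totals of the table.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
s_xy_shortcut = (sum_xy - sum(x) * sum(y) / n) / (n - 1)

print(round(s_xy, 4), round(s_xy_shortcut, 4))
```

Both forms should give the same positive value, which matches the upward pattern one would expect between store size and sales.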

C Homework: Do the following problem (You may use MS Excel).

Problem 1. Using the data in the above example, obtain the covariance between the square footage and the annual sales in Euros and compare it with the covariance obtained in Example 1.

To compute the sample covariance using MS Excel, do Tools > Data Analysis > Covariance, and enter the range of x- and y-variables. For the data in Example 1 we get

        X           Y
X    2.708827
Y    4.523367    8.353878

Note: Unfortunately, this version of MS Excel used n (sample size) as the denominator instead of n − 1. Therefore, we need to multiply the numbers by n (in this example 14) and divide by n − 1 in order to get the correct covariance and variances:

        X           Y
X    2.917198
Y    4.871319    8.996484

There are two Excel functions, COVARIANCE.P and COVARIANCE.S, for the population and the sample covariance, respectively.

To make a scatter plot of the above data, highlight the data and choose Chart Wizard > XY (Scatter).

[Figure: scatter plot of Y against X for the data in Example 1]

As you will see in Problem 1, the magnitude of the covariance depends on the units in which the variables are quoted. Therefore, the magnitude of the covariance cannot effectively represent the strength of the relationship.
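The two points above (the n versus n − 1 denominator and the dependence on units) can be checked with a short Python sketch. The data are the Example 1 values as reconstructed from the table, and the exchange rate `rate` is a purely hypothetical number chosen for illustration, not a value from the notes.

```python
# Example 1 data (assumed reconstruction of the table).
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]


def cov(x, y, ddof):
    """Covariance with denominator n - ddof (ddof=0: population, ddof=1: sample)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s / (n - ddof)


n = len(x)
cov_pop = cov(x, y, 0)             # what the Data Analysis tool reported
cov_sample = cov(x, y, 1)          # the sample covariance
corrected = cov_pop * n / (n - 1)  # multiply by n, divide by n - 1

# Changing units rescales the covariance by the same factor.
rate = 0.9                         # hypothetical euros per dollar
y_eur = [rate * yi for yi in y]
cov_sample_eur = cov(x, y_eur, 1)

print(round(cov_pop, 6), round(cov_sample, 6), round(cov_sample_eur, 6))
```

The covariance in euros is simply `rate` times the covariance in dollars, which is why its magnitude alone says little about the strength of the relationship.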

The correlation coefficient between two random variables X and Y is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X\,\sigma_Y}, \qquad -1 \le \rho_{XY} \le 1.$$

It turns out the correlation coefficient is the covariance between the standardized variable of X and the standardized variable of Y. Since the standardized variables are unit free, so is the correlation coefficient. The sample correlation coefficient is

$$r = \frac{s_{XY}}{s_X\, s_Y}.$$

Example 2. Using the data in Example 1, find the sample correlation coefficient.

To compute the sample correlation using MS Excel, do Tools > Data Analysis > Correlation, and enter the range of x- and y-variables. For the above data we get

        X           Y
X    1
Y    0.950883    1

You may use the Excel function CORREL.

C Homework: 346 on p 37 and do the following problem (You may use MS Excel).

Problem 2. Using the data in Example 1, obtain the correlation between the square footage and the annual sales in Euros and compare it with the correlation obtained in Example 2.

MSL Homework: 344 on p 36 (You may use MS Excel)

Properties of the correlation coefficient
1. $-1 \le r \le 1$.
2. $r = 1$ represents a perfect positive linear association.
3. $r = -1$ represents a perfect negative linear association.
4. $|r|$ closer to 1 represents stronger linear association.
5. $r = 0$ represents no linear association, but the variables can have some other relation, such as a quadratic relation.

MSL Homework: 3, 3, 33

When two variables are correlated, we are often interested in finding the precise relationship using a mathematical model and in predicting one variable of main interest, which is called the dependent variable or response variable and denoted by Y, using the other variable, which is called the independent variable or predictor variable and denoted by X. There are two types of relationship considered in this chapter.

Deterministic relationship: each value of X is paired with one and only one value of Y, and Y can be predicted with certainty for a given value of X.

Stochastic (statistical) relationship: each value of X is associated with a whole probability distribution of values of Y, and Y cannot be predicted with certainty for a given value of X, but the knowledge of a value of X helps in predicting Y.

A simple and popular mathematical model to describe a stochastic relationship is the following Simple Linear Regression Model,

$$Y = \beta_0 + \beta_1 X + \varepsilon,$$

where Y is the dependent variable and X is the independent variable, and the random variable $\varepsilon$ is called the error, satisfying $E(\varepsilon)=0$ and $\mathrm{Var}(\varepsilon)=\sigma^2$.

The purposes of regression analysis include 1) to better understand the relationship between the dependent variable and the independent variable(s) through mathematical models; and 2) to predict the values of the dependent variable with given values of the independent variables.

The assumptions about $\varepsilon$ in turn yield $E(Y \mid X) = \beta_0 + \beta_1 X$, that is, the mean of Y is a linear function of X, and thus the knowledge about the value of X is useful in predicting the mean value of Y (as well as an individual value of Y). The line equation $E(Y \mid X) = \beta_0 + \beta_1 X$ is called the regression line (and in general the regression function). They also yield $\mathrm{Var}(Y \mid X) = \sigma^2$, which means the variability of Y is constant regardless of the level of X.

For regression analysis we have a random sample of n pairs of observations $(X_i, Y_i)$, for i = 1, ..., n, and for these observations we have $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where the $\varepsilon_i$ are independent.

In the above simple linear regression model we have three (unknown) parameters that need to be estimated. They are $\beta_0$ and $\beta_1$, which are called the model parameters, and $\sigma^2$, which is the variance of the error. Estimation of these parameters is done by the method of least squares, which, roughly speaking, finds a fitted line that goes through the points on a scatter plot such that the line is as "close" as possible to all the points. The least squares estimators of $\beta_0$ and $\beta_1$, denoted by $b_0$ and $b_1$, respectively, are

$$b_1 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2}, \qquad b_0 = \bar{Y} - b_1\bar{X}.$$

Note that $b_1 = s_{XY}/s_X^2$ and $b_1 = r\, s_Y/s_X$. If we replace the unknown parameters with their estimates, we obtain the estimated regression line, also called the fitted line, $\hat{Y} = b_0 + b_1 X$. Technically, $b_0$ and $b_1$ are chosen to minimize $\sum_{i=1}^{n}(Y_i - b_0 - b_1 X_i)^2$, the residual sum of squares.
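The least squares formulas can be tried directly on the Example 1 data. The sketch below is a plain-Python illustration under the assumption that `x` and `y` are the Example 1 values as reconstructed from the table; it also computes the sample correlation and checks the standard identity $b_1 = r\,s_Y/s_X$. All variable names are illustrative.

```python
import math

# Example 1 data (assumed reconstruction of the table).
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sums of squares and cross-products about the means.
ssx = sum((xi - x_bar) ** 2 for xi in x)
ssy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates of the slope and intercept.
b1 = sxy / ssx
b0 = y_bar - b1 * x_bar

# Sample correlation and sample standard deviations.
r = sxy / math.sqrt(ssx * ssy)
s_x = math.sqrt(ssx / (n - 1))
s_y = math.sqrt(ssy / (n - 1))

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, r = {r:.4f}")
print(f"r * s_y / s_x = {r * s_y / s_x:.4f}")  # should equal b1
```

The slope and the correlation always share the same sign, since both are the cross-product sum divided by a positive quantity.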

MSL Homework: 35

C Homework: 39 (Use Excel/PHStat)

For the i-th observation $(X_i, Y_i)$, we can obtain the corresponding fitted value $\hat{Y}_i = b_0 + b_1 X_i$. The difference between the observed value $Y_i$ and the fitted value $\hat{Y}_i$, that is, $Y_i - \hat{Y}_i$, is called the residual, $e_i = Y_i - \hat{Y}_i$. One of the properties of the residuals is

$$\sum_{i=1}^{n} e_i = 0.$$

The least squares estimator of $\sigma^2$, denoted by $\hat{\sigma}^2$, is

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2,$$

which is also called the mean squared error.

$R^2 = \mathrm{SSReg}/\mathrm{SST}$: This measures the proportion of the total variation in Y (SST) explained by the regression model (SSReg), and is an overall measure of goodness of fit.

For regression analysis using MS Excel, do Tools > Data Analysis > Regression, and enter the range of y- and x-variables. If you want confidence intervals other than 95%, check off Confidence Level and enter the confidence level. For the data in Example 1, we get the following output.
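The residual property, the mean squared error, and the goodness-of-fit measure can be verified numerically. The following Python sketch assumes the Example 1 data as reconstructed from the table; the names `sse`, `ssreg`, and `sst` are illustrative labels for the residual, regression, and total sums of squares.

```python
import math

# Example 1 data (assumed reconstruction of the table).
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ssx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / ssx
b0 = y_bar - b1 * x_bar

# Fitted values and residuals.
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Sums of squares: total, residual, and regression.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum(e ** 2 for e in resid)
ssreg = sst - sse

mse = sse / (n - 2)          # estimate of the error variance
sigma_hat = math.sqrt(mse)   # "Standard Error" in the Excel output
r_squared = ssreg / sst

print(f"sum of residuals = {sum(resid):.10f}")
print(f"MSE = {mse:.4f}, sigma_hat = {sigma_hat:.4f}, R^2 = {r_squared:.4f}")
```

The residual sum is zero only up to floating-point rounding, so the printed value may show a tiny nonzero number.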

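SUMMARY OUTPUT

Regression Statistics
Multiple R           0.950883
R Square             0.904179
Adjusted R Square    0.896194
Standard Error       0.966380
Observations         14

ANOVA
             df    SS          MS          F           Significance F
Regression    1    105.7476    105.7476    113.2335    1.82E-07
Residual     12    11.20668    0.933890
Total        13    116.9543

             Coefficients   Standard Error   t Stat      P-value     Lower 95%    Upper 95%   Lower 90.0%
Intercept    0.964474       0.526193         1.832927    0.091727    -0.182003    2.110951    0.026646
X            1.669862       0.156925         10.64112    1.82E-07    1.327951     2.011773    1.390177

MSL Homework: 3, 3, 37

C Homework: 3 (Use Excel/PHStat)

$100(1-\alpha)\%$ confidence interval for $\beta_0$: $b_0 \pm t_{\alpha/2,\,n-2}\; s.e.(b_0)$

$100(1-\alpha)\%$ confidence interval for $\beta_1$: $b_1 \pm t_{\alpha/2,\,n-2}\; s.e.(b_1)$

Testing hypotheses about the regression slope coefficient $\beta_1$, where $\beta_{1,0}$ denotes the hypothesized value:

(a) $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \ne \beta_{1,0}$
(b) $H_0: \beta_1 \le \beta_{1,0}$ vs. $H_1: \beta_1 > \beta_{1,0}$
(c) $H_0: \beta_1 \ge \beta_{1,0}$ vs. $H_1: \beta_1 < \beta_{1,0}$

Test statistic:

$$T = \frac{b_1 - \beta_{1,0}}{s.e.(b_1)}$$

Reject $H_0$ if (a) $|t| > t_{\alpha/2,\,n-2}$; (b) $t > t_{\alpha,\,n-2}$; (c) $t < -t_{\alpha,\,n-2}$.

p-value: (a) $P(|T| > |t|)$; (b) $P(T > t)$; (c) $P(T < t)$.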
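The slope row of the output and the two-sided test can be reproduced with a short Python sketch. Several things here are assumptions rather than statements from the notes: the data are the Example 1 values as reconstructed from the table, the standard error formulas $s.e.(b_1) = \hat{\sigma}/\sqrt{SSX}$ and $s.e.(b_0) = \hat{\sigma}\sqrt{1/n + \bar{X}^2/SSX}$ are the usual textbook ones, and the critical value $t_{0.025,12} = 2.1788$ is hard-coded from a t table instead of being computed.

```python
import math

# Example 1 data (assumed reconstruction of the table).
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ssx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / ssx
b0 = y_bar - b1 * x_bar

sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma_hat = math.sqrt(sse / (n - 2))

# Standard errors of the estimates (usual textbook formulas, assumed here).
se_b1 = sigma_hat / math.sqrt(ssx)
se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / ssx)

t_crit = 2.1788  # t_{0.025, 12}, taken from a t table (n - 2 = 12 df)

# 95% confidence interval for the slope.
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# Two-sided test of H0: beta1 = 0 against H1: beta1 != 0.
beta1_null = 0.0
t_stat = (b1 - beta1_null) / se_b1
reject_h0 = abs(t_stat) > t_crit

print(f"se(b0) = {se_b0:.4f}, se(b1) = {se_b1:.4f}")
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
print(f"95% CI for beta1: ({ci_b1[0]:.4f}, {ci_b1[1]:.4f})")
```

Setting `beta1_null` to another value and comparing `t_stat` with a one-sided critical value gives the one-sided versions of the test.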

Testing hypotheses about the regression intercept coefficient $\beta_0$: Replace $\beta_1$ with $\beta_0$ (and $b_1$ with $b_0$) in the procedure for the slope coefficient.

Note that the degrees of freedom in regression analysis is the number of observations minus the number of parameters in the model.

MSL Homework: 34, 34, 343, 349

C Homework: 347 (Use Excel/PHStat)

One of the purposes of regression analysis is to predict the mean value of Y, that is, E(Y), given a value of X, say x. A point estimate of E(Y) given a value x is $\hat{Y} = b_0 + b_1 x$ with standard error $\hat{\sigma}\sqrt{h}$, where

$$h = \frac{1}{n} + \frac{(x-\bar{x})^2}{(n-1)\,s_X^2}$$

is called the leverage. Therefore the $100(1-\alpha)\%$ confidence interval for E(Y) is

$$\hat{Y} \pm t_{\alpha/2,\,n-2}\;\hat{\sigma}\sqrt{h}.$$

Note $(n-1)\,s_X^2 = SSX$.

Another purpose of regression analysis is to predict an individual value of Y given a value of X, say x. Then a point estimate is, again, $\hat{Y} = b_0 + b_1 x$ with standard error $\hat{\sigma}\sqrt{1+h}$. Therefore the $100(1-\alpha)\%$ confidence interval for an individual value of Y, which is often called the prediction interval, is

$$\hat{Y} \pm t_{\alpha/2,\,n-2}\;\hat{\sigma}\sqrt{1+h}.$$

Example 3. Suppose in Example 1 you want to estimate the mean annual sales of all stores with the size of 5 thousand square feet. Then it is 0.9645 + 1.6699(5) = 9.314 (million dollars). Noting that the sample mean and standard deviation of the X-variable are 2.9214 and 1.708, respectively (which are also computed from MS Excel), we calculate the standard error for $\hat{Y}$ as

$$0.9664\sqrt{0.1854} = 0.4161,$$

noting that the leverage is

$$h = \frac{1}{14} + \frac{(5-2.9214)^2}{(14-1)(1.708)^2} = 0.1854.$$

And the standard error for the individual value is

$$0.9664\sqrt{1+0.1854} = 1.0522.$$

Since $t_{0.025,\,12} = 2.1788$, to get the 95% confidence interval for E(Y) at x = 5, we compute

$$\hat{Y} \pm t_{0.025,\,12}\;\hat{\sigma}\sqrt{h} = 9.3135 \pm 2.1788(0.4161) = 9.3135 \pm 0.9066.$$

To get the 95% prediction interval for Y at x = 5, we compute

$$\hat{Y} \pm t_{0.025,\,12}\;\hat{\sigma}\sqrt{1+h} = 9.3135 \pm 2.1788(1.0522) = 9.3135 \pm 2.2925.$$

The above 95% confidence interval for E(Y) at x = 5 is interpreted as: "With 95% confidence, the mean annual sales of all stores with the size of 5 thousand square feet is between $8.4069 million and $10.2201 million."

The above 95% prediction interval for Y at x = 5 is interpreted as: "With 95% confidence, the annual sales of a store with the size of 5 thousand square feet is between $7.0210 million and $11.6060 million."
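The interval calculations of Example 3 can be sketched in Python as follows. The data are the Example 1 values as reconstructed from the table (an assumption), the critical value 2.1788 is hard-coded from a t table, and the helper name `intervals_at` is hypothetical. Because the code carries full precision while the hand calculation rounds at each step, the endpoints may differ from those quoted in the example in the third or fourth decimal place.

```python
import math

# Example 1 data (assumed reconstruction of the table).
x = [1.7, 1.6, 2.8, 5.6, 1.3, 2.2, 1.3, 1.1, 3.2, 1.5, 5.2, 4.6, 5.8, 3.0]
y = [3.7, 3.9, 6.7, 9.5, 3.4, 5.6, 3.7, 2.7, 5.5, 2.9, 10.7, 7.6, 11.8, 4.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ssx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / ssx
b0 = y_bar - b1 * x_bar
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma_hat = math.sqrt(sse / (n - 2))

T_CRIT = 2.1788  # t_{0.025, 12}, from a t table


def intervals_at(x0):
    """Return (y_hat, h, confidence interval, prediction interval) at x0."""
    y_hat = b0 + b1 * x0
    h = 1 / n + (x0 - x_bar) ** 2 / ssx               # leverage
    half_ci = T_CRIT * sigma_hat * math.sqrt(h)       # for the mean E(Y)
    half_pi = T_CRIT * sigma_hat * math.sqrt(1 + h)   # for an individual Y
    ci = (y_hat - half_ci, y_hat + half_ci)
    pi = (y_hat - half_pi, y_hat + half_pi)
    return y_hat, h, ci, pi


y_hat, h, ci, pi = intervals_at(5.0)
print(f"y_hat = {y_hat:.4f}, leverage h = {h:.4f}")
print(f"95% CI for E(Y): ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"95% PI for Y:    ({pi[0]:.4f}, {pi[1]:.4f})")
```

The prediction interval is always wider than the confidence interval at the same x, because it adds the variability of an individual observation to the uncertainty in the estimated mean.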

MSL Homework: 357

C Homework: 3 6 (Use Excel/PHStat)

When more than one independent variable is considered, we have the multiple regression model. For example, if two independent variables, X and W, are considered, the model is

$$Y = \beta_0 + \beta_1 X + \beta_2 W + \varepsilon.$$

Statistical inference of the multiple regression model will be discussed in MgtOp 4: Statistical Methods for Management or ECONS 3: Introductory Econometrics.

Example: In finance, it is of interest to look at the relationship between a stock's average return in percent (Y) and the overall market return in percent (X). From randomly selected stocks we obtain the following data.

Obs   market return (X)   stock's return (Y)
37 5 7 3 86 4 6 5 5 34 9 6 39 7 7 8 8 3 4 9 6 3 4 6 3 97 7 73

1. Find the sample correlation coefficient between X and Y.
2. How would you decide if a simple linear regression model is appropriate for the relationship between X and Y?
3. If a simple linear regression model is indeed appropriate for the relationship, find the estimated regression line.
4. Find the predicted average return of a stock with the overall market return of 3%.
5. Is there strong evidence that the average return of a stock is linearly related to the overall market return? Justify your answer.
6. Find a 95% confidence interval for the slope parameter. Note that in finance the slope coefficient is called the stock's beta by investment analysts.
7. A beta greater than one indicates that the stock is relatively sensitive to changes in the market, while a beta less than one indicates that the stock is relatively insensitive. For the data analyzed, test if the estimated beta is significantly greater than one. Use $\alpha = 0.05$.
8. Find an estimate for the variance of the error in the simple linear regression model.
9. Find the 95% confidence interval for the mean of the average returns of stocks with the market return of 8% and interpret the interval.
10. Find the 95% prediction interval for the average return of a stock with the market return of 8% and interpret the interval.
11. Find a 90% confidence interval for the mean of the average returns of stocks with the market return of 8%.
12. Find a 90% prediction interval for the average return of a stock with the market return of 8%.