Example: Small-Sample Properties of IV and OLS Estimators

Considerable technical analysis is required to characterize the finite-sample distributions of IV estimators analytically. However, simple numerical examples provide a picture of the situation. Consider a regression $y = x\beta + \varepsilon$ where there is a single right-hand-side variable $x$ and a single instrument $w$, and assume $x$, $w$, and $\varepsilon$ have the simple joint distribution given in the table below, where $\lambda$ is the correlation of $x$ and $w$, $\rho$ is the correlation of $x$ and $\varepsilon$, and $\lambda + \rho < 1$. The interpretation of the second row of the table, for example, is that $(x,w,\varepsilon) = (1,1,-1)$ and $(x,w,\varepsilon) = (-1,-1,1)$ each occur with probability $(1-\rho+\lambda)/8$:

  $x$        $w$        $\varepsilon$   Prob
  $\pm 1$    $\pm 1$    $\pm 1$         $(1+\rho+\lambda)/8$
  $\pm 1$    $\pm 1$    $\mp 1$         $(1-\rho+\lambda)/8$
  $\pm 1$    $\mp 1$    $\pm 1$         $(1+\rho-\lambda)/8$
  $\pm 1$    $\mp 1$    $\mp 1$         $(1-\rho-\lambda)/8$

The random variables $(x,w,\varepsilon)$ have mean zero, variance one, and $E x\varepsilon = \rho$, $E xw = \lambda$, and $E w\varepsilon = 0$. Their products have the joint distribution

  $xw$    $x\varepsilon$   $w\varepsilon$   Prob
  $1$     $1$              $1$              $(1+\rho+\lambda)/4$
  $1$     $-1$             $-1$             $(1-\rho+\lambda)/4$
  $-1$    $1$              $-1$             $(1+\rho-\lambda)/4$
  $-1$    $-1$             $1$              $(1-\rho-\lambda)/4$

This implies $P(x\varepsilon = 1) = (1+\rho)/2$. Then, in a sample of size $n$, $n((b_{OLS} - \beta) + 1)/2$ has an exact distribution that is binomial with $n$ draws and probability $(1+\rho)/2$. Then $n^{1/2}(b_{OLS} - \beta)$ has mean $n^{1/2}\rho$ and variance $1-\rho^2$. Thus, $n \cdot MSE = n \cdot (\text{variance} + \text{Bias}^2) = 1 + (n-1)\rho^2$.

The asymptotic theory for the IV estimator establishes that $n^{1/2}(b_{IV} - \beta)$ is approximately normal with mean zero and $n \cdot MSE = 1/\lambda^2$, equal to the asymptotic variance $E w^2/(E xw)^2$. This suggests that the larger $n$, $\rho$, and $\lambda$, the more likely that IV will be better than OLS.

We compare $b_{OLS}$ and $b_{IV}$ for samples of various sizes drawn from the distribution above, for different values of $\rho$ and $\lambda$. The following tables summarize the results of 1000 replications of each sample. In these tables, Bias is the mean (in 1000 samples) of $n^{1/2}(b_{OLS} - \beta)$ or $n^{1/2}(b_{IV} - \beta)$, while MSE is the mean (in 1000 samples) of $n(b_{OLS} - \beta)^2$ or $n(b_{IV} - \beta)^2$, where these moments for $b_{IV}$ are calculated conditioned on the event that $b_{IV}$ exists.
The "IV exists" column gives the proportion of the replications, in percent, where $b_{IV}$ exists; this is always less than 100 percent for this data generation process when $n$ is even, but it converges toward 100 percent rapidly, so that for sample sizes above 40 the shortfall is negligible. The "IV better" column gives the proportion of replications, in percent, where $b_{IV}$ is closer than $b_{OLS}$ to the true $\beta$. Because of the thick tails of the distribution of $b_{IV}$, the sample sizes where "IV better" exceeds 50 percent are smaller than the sample sizes where the sample expectation of MSE for IV (conditioned on IV existing) is less than that for OLS. The final columns of each table give some percentiles of the CDFs of $n^{1/2}(b_{OLS} - \beta)$ and $n^{1/2}(b_{IV} - \beta)$. One expects that this expression for OLS will drift with sample size due to the effect of bias, whereas the corresponding expression for $b_{IV}$ will be approximately stationary. The results demonstrate the relatively thick tails of the expression for $b_{IV}$.
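The experiment described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used to produce the tables: the names `draw_sample` and `simulate`, the seed, and the value `beta = 1.0` are illustrative; $b_{OLS} = \sum xy / \sum x^2$ and $b_{IV} = \sum wy / \sum wx$ are the standard single-regressor forms; $b_{IV}$ is treated as nonexistent when $\sum wx = 0$; and the share of replications where IV is closer to $\beta$ is computed over all replications, since the notes do not say which denominator is used.

```python
import random

def draw_sample(n, rho, lam, rng):
    """Draw n observations (x, w, eps) from the eight-point distribution."""
    # Sign patterns (x*w, x*eps) and their relative weights from the product table.
    patterns = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    weights = [1 + rho + lam, 1 - rho + lam, 1 + rho - lam, 1 - rho - lam]
    sample = []
    for xw, xe in rng.choices(patterns, weights=weights, k=n):
        x = rng.choice((1, -1))             # x is +1 or -1 with equal probability
        sample.append((x, x * xw, x * xe))  # x*x = 1, so w = x*(x*w) and eps = x*(x*eps)
    return sample

def simulate(n, rho, lam, reps=1000, beta=1.0, seed=0):
    """Monte Carlo summary of sqrt(n)*(b - beta) for OLS and IV."""
    rng = random.Random(seed)
    ols_err, iv_err, iv_better = [], [], 0
    for _ in range(reps):
        data = draw_sample(n, rho, lam, rng)
        sxx = sum(x * x for x, _, _ in data)
        sxy = sum(x * (x * beta + e) for x, _, e in data)  # y = x*beta + eps
        swx = sum(w * x for x, w, _ in data)
        swy = sum(w * (x * beta + e) for x, w, e in data)
        b_ols = sxy / sxx
        ols_err.append(n ** 0.5 * (b_ols - beta))
        if swx != 0:  # assumption: b_IV does not exist when sum(w*x) == 0
            b_iv = swy / swx
            iv_err.append(n ** 0.5 * (b_iv - beta))
            iv_better += abs(b_iv - beta) < abs(b_ols - beta)

    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else float("nan")

    return {
        "bias_ols": mean(ols_err),
        "bias_iv": mean(iv_err),                   # conditional on b_IV existing
        "iv_exists_pct": 100 * len(iv_err) / reps,
        "mse_ols": mean(e * e for e in ols_err),
        "mse_iv": mean(e * e for e in iv_err),     # conditional on b_IV existing
        "iv_better_pct": 100 * iv_better / reps,   # assumption: share of all replications
    }

if __name__ == "__main__":
    print(simulate(100, rho=0.2, lam=0.5))
```

Because the draws are random, the numbers produced depend on the seed and will differ somewhat from the tabulated values; the IV entries in particular are sensitive to a few large deviations.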

Mild Contamination, Moderately Good Instrument: $\rho = 0.2$, $\lambda = 0.5$

  n  Bias OLS   Bias IV  IV exists (%)  MSE OLS  MSE IV  IV better (%)  OLS percentiles   IV percentiles
 10      0.65      0.10           93.9     1.32    6.81           26.4      0   36   96     17   59   86
 20      0.90     -0.10           98.9     1.79    8.50           34.1      0   24   94     14   58   88
 30      1.11     -0.17           99.8     2.16    8.24           42.0      0   17   90     13   56   90
 40      1.30     -0.11            100     2.65    6.42           45.5      0   13   86     12   55   89
 60      1.55     -0.12            100     3.35    5.22           54.9      0    7   82     12   52   92
 80      1.79     -0.13            100     4.15    7.45           61.3      0    5   77     13   54   90
100      1.98     -0.12            100     4.91    5.11           64.4      0    2   70     13   54   90
150      2.45     -0.08            100     6.92    4.33           76.2      0    1   53     10   52   91
200      2.84     -0.07            100     9.03    4.12           81.6      0    0   36     12   54   90
250      3.16     -0.08            100     11.0    4.01           85.9      0    0   24     11   53   91
300      3.47     -0.06            100     13.0    4.19           88.2      0    0   16     13   51   90

Existence of the IV estimator is a problem only for sample sizes under 40. IV is better a majority of the time for sample sizes above 40. Because IV has large deviations, its MSE is large even when one conditions on the existence of the IV estimator, so that in terms of this criterion, IV is better only for sample sizes over 100. The distribution of the OLS estimator is strongly shifted to the right, and increasingly so with sample size, due to the bias. The distribution of the IV estimator is roughly symmetric, with thick tails.
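The exact OLS moments and the asymptotic IV variance stated earlier in these notes give benchmarks against which the simulated entries for this design can be read. The sketch below evaluates them; the function names are hypothetical, and `asymptotic_crossover_n` is simply the value of $n$ solving $1 + (n-1)\rho^2 = 1/\lambda^2$, a rearrangement that is not spelled out in the notes.

```python
from math import sqrt

def ols_benchmarks(n, rho):
    """Exact mean of sqrt(n)*(b_OLS - beta) and exact n*MSE for the binary design."""
    bias = sqrt(n) * rho
    n_mse = (1 - rho ** 2) + n * rho ** 2   # variance + squared bias = 1 + (n-1)*rho^2
    return bias, n_mse

def iv_asymptotic_n_mse(lam):
    """Asymptotic n*MSE of b_IV, E[w^2] / (E[x*w])^2 = 1 / lam^2."""
    return 1 / lam ** 2

def asymptotic_crossover_n(rho, lam):
    """Sample size at which exact OLS n*MSE equals the asymptotic IV n*MSE."""
    return 1 + (1 / lam ** 2 - 1) / rho ** 2

if __name__ == "__main__":
    rho, lam = 0.2, 0.5
    for n in (10, 100, 300):
        bias, n_mse = ols_benchmarks(n, rho)
        print(n, round(bias, 2), round(n_mse, 2), iv_asymptotic_n_mse(lam))
    print(asymptotic_crossover_n(rho, lam))
```

For $\rho = 0.2$ the exact OLS $n \cdot MSE$ is 1.36 at $n = 10$ and 12.96 at $n = 300$, close to the simulated 1.32 and 13.0. The asymptotic IV value $1/\lambda^2 = 4$ is approached by the simulated IV MSE only at the larger sample sizes, and the asymptotic crossover of about $n = 76$ is below the sample size at which IV actually wins on MSE in the simulation, as one would expect when the finite-sample IV distribution has thick tails.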

Severe Contamination, Moderately Good Instrument: $\rho = 0.5$, $\lambda = 0.4$

  n  Bias OLS   Bias IV  IV exists (%)  MSE OLS  MSE IV  IV better (%)  OLS percentiles   IV percentiles
 10      1.57      0.18           91.0     3.19    9.68           43.1      0    8   77     24   57   85
 20      2.21     -0.49           97.4     5.60    17.4           59.8      0    1   61     21   56   88
 30      2.71     -0.87           98.6     8.08    23.8           66.8      0    0   34     22   56   88
 40      3.15     -1.01           99.6     10.6    29.2           73.2      0    0   19     23   56   86
 60      3.85     -0.81           99.8     15.6    22.4           81.8      0    0    6     21   56   87
 80      4.46     -0.63            100     20.6    11.3           86.1      0    0    2     21   55   89
100      4.99     -0.50            100     25.7    9.63           91.1      0    0    0     22   54   88
150      6.12     -0.39            100     38.1    8.04           94.9      0    0    0     20   55   87
200      7.07     -0.31            100     50.7    7.43           96.9      0    0    0     19   54   86
250      7.91     -0.28            100     63.3    7.37           98.2      0    0    0     18   53   85
300      8.68     -0.17            100     76.1    7.06           99.1      0    0    0     18   52   85

Existence of the IV estimator is an issue for sample sizes below 40. The IV estimator is better a majority of the time for sample sizes of 20 and higher. In terms of MSE, IV is better for sample sizes over 60.

Mild Contamination, Weak Instrument: $\rho = 0.2$, $\lambda = 0.2$

  n  Bias OLS   Bias IV  IV exists (%)  MSE OLS  MSE IV  IV better (%)  OLS percentiles   IV percentiles
 10      0.59      0.27           78.3     1.33    12.7           18.5      0   38   95     30   61   73
 20      0.88      0.42           86.9     1.71    31.1           18.5      0   25   95     28   55   73
 30      1.09     -0.07           91.7     2.11    52.9           21.8      0   19   91     32   56   71
 40      1.25     -0.04           94.3     2.54    80.9           20.2      0   14   86     33   57   70
 60      1.52     -0.08           96.9     3.31     101           23.9      0    8   83     32   56   68
 80      1.74     -0.33           98.0     4.01     105           26.7      0    5   79     32   54   68
100      1.95     -0.48           98.3     4.76     121           28.8      0    3   71     34   55   71
150      2.40     -0.77           99.9     6.72     123           36.4      0    1   54     32   54   71
200      2.82     -0.82            100     8.90      71           40.5      0    0   38     32   54   71
250      3.16     -0.77            100     11.0     112           45.1      0    0   25     33   54   70
300      3.46     -0.43            100     13.0    36.7           49.0      0    0   16     33   54   69

Existence of the IV estimator is a substantial problem for sample sizes below 80. IV is never better in a majority of cases, even at a sample size of 300.
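The existence problem with a weak instrument can be quantified exactly under one assumption that the notes leave implicit: $b_{IV} = \sum wy / \sum wx$ fails to exist precisely when $\sum wx = 0$. Each product $xw$ equals $+1$ with probability $(1+\lambda)/2$ and $-1$ otherwise, so the sum of $n$ such terms is zero only when $n$ is even and exactly half of them are $+1$. The function name below is hypothetical.

```python
from math import comb

def prob_iv_exists(n, lam):
    """P(sum of w*x != 0) when each w*x is +1 with probability (1 + lam)/2."""
    if n % 2 == 1:
        return 1.0                     # an odd number of +/-1 terms cannot sum to zero
    p = (1 + lam) / 2
    half = n // 2
    # Binomial probability of exactly n/2 products equal to +1.
    return 1.0 - comb(n, half) * (p * (1 - p)) ** half

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, round(100 * prob_iv_exists(n, lam=0.2), 1))
```

For $n = 10$ and $\lambda = 0.2$ this gives an existence probability of about 79.9 percent, in line with the 78.3 percent observed in 1000 replications. With $\lambda = 0.5$ the same calculation gives about 94.2 percent at $n = 10$, which is why existence is much less of a problem when the instrument is stronger.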

Severe Contamination, Weak Instrument: $\rho = 0.5$, $\lambda = 0.2$

  n  Bias OLS   Bias IV  IV exists (%)  MSE OLS  MSE IV  IV better (%)  OLS percentiles   IV percentiles
 10      1.57      1.16           81.6     3.21    13.4           29.9      0    7   77     25   50   67
 20      2.26      0.94           89.6     5.85    29.7           36.6      0    1   59     23   48   66
 30      2.79      0.79           93.1     8.50    48.0           41.6      0    0   30     26   47   64
 40      3.21      0.28           95.9     11.0    65.5           47.3      0    0   16     27   48   66
 60      3.91     -0.37           96.8     16.0     101           54.5      0    0    5     30   50   67
 80      4.49     -0.70           98.5     20.8     141           61.4      0    0    1     30   50   69
100      5.03     -0.82           99.2     26.0     101           69.1      0    0    0     30   50   71
150      6.13     -1.37           99.7     38.3     105           76.5      0    0    0     30   53   71
200      7.06     -1.29           99.9     50.6    80.0           81.5      0    0    0     31   52   70
250      7.90     -1.19            100     63.1    72.4           84.6      0    0    0     31   53   70
300      8.66     -0.93            100     75.7    47.5           88.6      0    0    0     31   53   71

Existence of the IV estimator is a substantial problem for sample sizes below 80. The IV estimator is better in a majority of cases for sample sizes above 40. In terms of MSE, IV is better for sample sizes above 250.

Econ. 240B, Fall 2002, Dan McFadden