Modeling Panel Data: Choosing the Correct Strategy. Roberto G. Gutierrez


Overview
- Panel data are ubiquitous, not only in economics but in all fields
- Panel data have intrinsic modeling advantages
- You model panel data in SAS with the PANEL procedure
- Different model alternatives are available, depending on assumptions and properties
- Key new features in SAS/ETS 14.1

Panel Data
- Panel data consist of a set of individuals measured over several points in time
- Known by other names: longitudinal data, cross-sectional time series, clustered data, multilevel (two-level) data, etc.
- Data collected in this manner offer key design advantages to modeling
- The greatest advantage is that the individuals act as their own control group

Panel Regression Model
Formally, consider the linear regression model
$$ y_{it} = \beta_0 + \beta_x X_{it} + \beta_z Z_i + \nu_i + \epsilon_{it} $$
for $i = 1, \ldots, N$ individuals on $t = 1, \ldots, T_i$ time periods.
- The $X$ variables vary over time
- The $Z$ variables are constant within individuals
- The $\nu_i$ are individual (or cross section) effects
- The $\epsilon_{it}$ are the observation-level errors
- Different estimation strategies for what you are willing to assume about $X$, $Z$, $\nu_i$, and $\epsilon_{it}$

Grocery Data
- Consumer-loyalty data from 330 households who shopped at a grocery chain in Raleigh, North Carolina
- Monthly expenditures for the year 2011; some monthly data missing
- Model meat expenditures on the following factors:
  - Received government assistance for that month?
  - Household size
  - Rural store location visited during the month?
  - Were at least 10% of total expenditures for alcohol?
  - The number of meals per week outside household, as provided on initial survey
- Specifically, assess the association with government assistance controlling for the other factors, and for latent household effects

Data Statement

data Grocery;
   input HouseID Month Meat Govt Hsize Rural Alcohol MealsOut;
   datalines;
1 1 55.841 1 5 0 1 3
1 3 49.372 1 5 0 1 3
1 4 59.43 1 5 0 1 3
1 5 52.25 1 5 0 1 3
1 6 41.623 1 5 0 0 3
1 7 59.357 1 5 0 1 3
1 9 58.512 1 5 0 0 3
...
330 9 55.264 1 4 0 0 2
330 10 55.096 1 4 0 0 2
330 12 49.676 1 4 0 0 2
;

Random-Effects Estimation
Model is
$$ \text{Meat}_{it} = \beta_0 + \beta_1 \text{Govt}_{it} + \beta_2 \text{Hsize}_i + \beta_3 \text{Rural}_{it} + \beta_4 \text{Alcohol}_{it} + \beta_5 \text{MealsOut}_i + \nu_i + \epsilon_{it} $$
- Random-effects estimation is the most common strategy
- Treats the households as a random sample and the $\nu_i$ as uncorrelated with $X$, $Z$, and $\epsilon_{it}$
- A Hausman test is provided as a referendum on that assumption
- Also known as generalized least squares (GLS)

Random-Effects Estimation

proc panel data = Grocery;
   id HouseID Month;
   model Meat = Govt Hsize Rural Alcohol MealsOut / ranone;
run;

Wansbeek and Kapteyn Variance Components (RanOne)
Dependent Variable: Meat (Meat purchases per store visit)

Model Description
   Estimation Method           RanOne
   Number of Cross Sections    330
   Time Series Length          12

Fit Statistics
   SSE         84930.9948    DFE         3567
   MSE         23.8102       Root MSE    4.8796
   R-Square    0.1232

Random-Effects Estimation

Wansbeek and Kapteyn Variance Components (RanOne)
Dependent Variable: Meat (Meat purchases per store visit)

Variance Component Estimates
   Variance Component for Cross Sections    190.123
   Variance Component for Error             24.99832

Hausman Test for Random Effects
   Coefficients    DF    m Value    Pr > m
   3               3     25.72      <.0001

Parameter Estimates
   Variable    DF    Estimate    Standard Error    t Value    Pr > |t|    Label
   Intercept   1     20.50606    2.3327             8.79      <.0001      Intercept
   Govt        1      5.050562   0.5989             8.43      <.0001      1 if used government assistance that month
   Hsize       1      5.145648   0.4774            10.78      <.0001      Household size
   Rural       1     -1.41068    0.3449            -4.09      <.0001      1 if rural location visited at least once
   Alcohol     1      2.982397   0.1960            15.22      <.0001      1 if at least 10% spent on alcohol
   MealsOut    1     -2.82761    0.3848            -7.35      <.0001      Meals per week outside of household (survey)
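The variance components and the Hausman statistic in this output can be read with a few textbook formulas. The block below is a sketch rather than a description of PROC PANEL's internal computations: it writes $\sigma^2_\nu$ for the cross-section variance component and $\sigma^2_\epsilon$ for the error variance component, and the symbols $\rho$, $\theta$, $\hat\beta_{FE}$, $\hat\beta_{RE}$, $\hat V_{FE}$, and $\hat V_{RE}$ are notation introduced here. The value of $\theta$ assumes a balanced panel with $T = 12$; the grocery panel has missing months, so it is only approximate.

```latex
% Share of the composite error variance due to household effects
\rho = \frac{\sigma^2_\nu}{\sigma^2_\nu + \sigma^2_\epsilon}
     = \frac{190.123}{190.123 + 24.99832} \approx 0.884

% GLS as OLS on quasi-demeaned data: y_{it} - \theta \bar{y}_i, and likewise for regressors
\theta = 1 - \sqrt{\frac{\sigma^2_\epsilon}{\sigma^2_\epsilon + T \sigma^2_\nu}}
       = 1 - \sqrt{\frac{24.99832}{24.99832 + 12 \times 190.123}} \approx 0.896

% Hausman statistic comparing the k coefficients estimable under both methods
m = (\hat\beta_{FE} - \hat\beta_{RE})^{\top}
    \left( \hat V_{FE} - \hat V_{RE} \right)^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE}) \sim \chi^2_k
```

With $\theta$ near 1, the random-effects fit removes most of each household's mean. In the Hausman test, $k = 3$ corresponds to the three time-varying regressors (Govt, Rural, Alcohol), which agrees with the 3 coefficients and 3 degrees of freedom reported above.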

Correlated Individual Effects
- The Hausman test puts the random-effects results into question
- The problem is that the individual effects are likely correlated with the explanatory variables
- This does not happen in designed experiments, but who has that these days?
- Does the regression coefficient on Govt represent
  A. The effect of a household going on government assistance; or
  B. A comparison of two different households, one on government assistance throughout and one not?
- What do you want it to represent?
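One way to make the A-versus-B question concrete is to split a time-varying regressor into its household mean and the deviation from that mean. The notation below ($\beta_W$, $\beta_B$, $\bar{X}_i$, $u_{it}$) is illustrative and does not come from the slides; it is a sketch for a single time-varying regressor.

```latex
\begin{align*}
X_{it} &= \bar{X}_i + (X_{it} - \bar{X}_i),
  \qquad \bar{X}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} X_{it} \\
y_{it} &= \beta_0 + \beta_W \,(X_{it} - \bar{X}_i) + \beta_B \,\bar{X}_i + u_{it}
\end{align*}
```

Here $\beta_W$ describes what happens when a given household changes its own $X$ (interpretation A), while $\beta_B$ compares households with different average $X$ (interpretation B). Assuming that the individual effects are uncorrelated with $X$ amounts to assuming $\beta_W = \beta_B$, which is why a single coefficient can be hard to interpret when that assumption fails.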

[Figure: Correlated Individual Effects]

Fixed-Effects Estimation
- Fixed-effects estimation does not assume that individual effects are uncorrelated
- It produces regression coefficients that are based solely on within-household comparisons
- Equivalent to inserting a dummy regressor for each household
- You lose some efficiency from not using any between-household data
- Regressors are required to vary within households
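The requirement that regressors vary within households follows from the within transformation. The derivation below is the standard textbook one, shown as a sketch; it writes $\nu_i$ for the household effect and $\epsilon_{it}$ for the observation-level error, and bars denote household means over the observed months.

```latex
\begin{align*}
y_{it} &= \beta_0 + \beta_x X_{it} + \beta_z Z_i + \nu_i + \epsilon_{it} \\
\bar{y}_i &= \beta_0 + \beta_x \bar{X}_i + \beta_z Z_i + \nu_i + \bar{\epsilon}_i \\
y_{it} - \bar{y}_i &= \beta_x \,(X_{it} - \bar{X}_i) + (\epsilon_{it} - \bar{\epsilon}_i)
\end{align*}
```

Subtracting the household mean removes $\nu_i$, so its correlation with $X$ no longer matters. It also removes $Z_i$, so coefficients on time-invariant regressors such as Hsize and MealsOut cannot be estimated from within-household variation.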

Fixed-Effects Estimation

proc panel data = Grocery;
   id HouseID Month;
   model Meat = Govt Hsize Rural Alcohol MealsOut / fixone;
run;

Fixed-Effects Estimation

Fixed One-Way Estimates
Dependent Variable: Meat (Meat purchases per store visit)

F Test for No Fixed Effects
   Num DF    Den DF    F Value    Pr > F
   329       3240      32.06      <.0001

Parameter Estimates
   Variable    DF    Estimate    Standard Error    t Value    Pr > |t|    Label
   Intercept   1     53.89442    1.6500            32.66      <.0001      Intercept
   Govt        1      3.591205   0.6650             5.40      <.0001      1 if used government assistance that month
   Hsize       0      0          .                  .         .           Household size
   Rural       1     -1.45444    0.3578            -4.07      <.0001      1 if rural location visited at least once
   Alcohol     1      2.992035   0.2013            14.86      <.0001      1 if at least 10% spent on alcohol
   MealsOut    0      0          .                  .         .           Meals per week outside of household (survey)
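The statement that fixed effects are equivalent to inserting a dummy regressor for each household can be checked outside PROC PANEL. The sketch below is not from the slides: it uses PROC GLM with the ABSORB statement, which absorbs the household effects without printing 330 dummy coefficients. The data set name GrocerySorted is hypothetical, and Hsize and MealsOut are left out because they do not vary within a household. Under these assumptions the slope estimates for Govt, Rural, and Alcohol should agree with the FixOne results; no intercept is comparable, since the household effects are absorbed.

```sas
/* ABSORB requires the data to be sorted by the absorbed variable. */
/* GrocerySorted is a hypothetical name used only for this sketch. */
proc sort data = Grocery out = GrocerySorted;
   by HouseID Month;
run;

/* Least squares with one absorbed effect per household. */
/* Time-invariant regressors (Hsize, MealsOut) are omitted. */
proc glm data = GrocerySorted;
   absorb HouseID;
   model Meat = Govt Rural Alcohol / solution;
run;
quit;
```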

Between-Groups Estimation
- Rarely useful, but provided for comparison

proc panel data = Grocery;
   id HouseID Month;
   model Meat = Govt Hsize Rural Alcohol MealsOut / btwng;
run;

Between-Groups Estimation

Between-Groups Estimates
Dependent Variable: Meat (Meat purchases per store visit)

Parameter Estimates
   Variable    DF    Estimate    Standard Error    t Value    Pr > |t|    Label
   Intercept   1     16.98442    1.7004              9.99     <.0001      Intercept
   Govt        1     13.40059    0.9886             13.56     <.0001      1 if used government assistance that month
   Hsize       1      5.092447   0.3032             16.80     <.0001      Household size
   Rural       1      0.005439   1.4038              0.00     0.9969      1 if rural location visited at least once
   Alcohol     1      1.082457   1.7681              0.61     0.5408      1 if at least 10% spent on alcohol
   MealsOut    1     -2.67669    0.2629            -10.18     <.0001      Meals per week outside of household (survey)
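For reference, the between-groups estimator can be written as a regression on household means, one observation per household. This is the standard textbook form, given as a sketch; $\nu_i$ denotes the household effect and $\epsilon_{it}$ the observation-level error.

```latex
\bar{y}_i = \beta_0 + \beta_x \bar{X}_i + \beta_z Z_i + (\nu_i + \bar{\epsilon}_i),
\qquad i = 1, \ldots, N
```

Because $\nu_i$ stays in the error term, these estimates compare different households and are biased for within-household effects whenever $\nu_i$ is correlated with $\bar{X}_i$ or $Z_i$. That is one reason the Govt coefficient here can differ so much from the fixed-effects estimate.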

Hausman-Taylor Estimation
- Random effects: All regressors uncorrelated with $\nu_i$
- Fixed effects: They might all be correlated
- Hausman-Taylor: Why not stipulate some regressors as correlated, and have the best of both worlds?
- Choose correlated variables based on substantive knowledge, or guess; there's a test for that
- Estimation is done using instrumental variables, determined internally
- This is a new feature of SAS/ETS 14.1

Hausman-Taylor Estimation

proc panel data = Grocery;
   id HouseID Month;
   instruments correlated = (Govt MealsOut);
   model Meat = Govt Hsize Rural Alcohol MealsOut / htaylor;
run;

Hausman-Taylor Estimation

Hausman and Taylor Model for Correlated Individual Effects (HTaylor)
Dependent Variable: Meat (Meat purchases per store visit)

Variance Component Estimates
   Variance Component for Cross Sections    97.29627
   Variance Component for Error             24.97519

Hausman Test against Fixed Effects
   Coefficients    DF    m Value    Pr > m
   3               1     0.76       0.3824

Parameter Estimates
   Variable    Type    DF    Estimate    Standard Error    t Value    Pr > |t|    Label
   Intercept           1     19.12589    2.4038             7.96      <.0001      Intercept
   Govt        C       1      3.583391   0.6649             5.39      <.0001      1 if used government assistance that month
   Hsize       TI      1      5.17389    0.3523            14.68      <.0001      Household size
   Rural               1     -1.43991    0.3573            -4.03      <.0001      1 if rural location visited at least once
   Alcohol             1      2.974996   0.2004            14.85      <.0001      1 if at least 10% spent on alcohol
   MealsOut    C TI    1     -1.92242    0.8090            -2.38      0.0175      Meals per week outside of household (survey)

C: correlated with the individual effects
TI: constant (time-invariant) within cross sections
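The internally determined instruments can be sketched with the usual Hausman-Taylor partition of the regressors. The notation ($X_1$, $X_2$, $Z_1$, $Z_2$, $k_1$, $g_2$) follows the textbook presentation and is not taken from the slides; how PROC PANEL constructs the instruments in detail is not shown in this deck, so read this as an outline of the idea.

```latex
% Partition: subscript 1 = assumed uncorrelated with \nu_i, subscript 2 = possibly correlated
y_{it} = \beta_0 + \beta_1 X_{1,it} + \beta_2 X_{2,it}
       + \gamma_1 Z_{1,i} + \gamma_2 Z_{2,i} + \nu_i + \epsilon_{it}

% Instrument set built from the model's own variables
\{\, X_{1,it} - \bar{X}_{1,i},\;\; X_{2,it} - \bar{X}_{2,i},\;\; \bar{X}_{1,i},\;\; Z_{1,i} \,\}

% Order condition: at least as many exogenous time-varying regressors (k_1)
% as correlated time-invariant regressors (g_2)
k_1 \ge g_2
```

In the grocery model, $X_1$ = (Rural, Alcohol), $X_2$ = (Govt), $Z_1$ = (Hsize), and $Z_2$ = (MealsOut), so $k_1 = 2$ and $g_2 = 1$. The condition holds with one overidentifying restriction, which is consistent with the single degree of freedom in the Hausman test against fixed effects.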

The COMPARE Statement
- The COMPARE statement is another new feature of PROC PANEL in SAS/ETS 14.1
- Makes it easy to compare various models and estimators side by side

proc panel data = Grocery;
   id HouseID Month;
   instruments correlated = (Govt MealsOut);
   model Meat = Govt Hsize Rural Alcohol MealsOut / ranone fixone btwng htaylor;
   compare;
run;

The COMPARE Statement

Model Comparison
Dependent Variable: Meat (Meat purchases per store visit)

Comparison of Model Parameter Estimates
   Variable                 Model 1      Model 1      Model 1      Model 1
                            FixOne       RanOne       HTaylor      BtwGrps
   Intercept   Estimate     53.894415    20.506060    19.125895    16.984418
               Std Err       1.649992     2.332669     2.403772     1.700415
   Govt        Estimate      3.591205     5.050562     3.583391    13.400587
               Std Err       0.665025     0.598942     0.664876     0.988573
   Hsize       Estimate      0            5.145648     5.173890     5.092447
               Std Err       .            0.477447     0.352344     0.303155
   Rural       Estimate     -1.454439    -1.410680    -1.439905     0.005439
               Std Err       0.357766     0.344892     0.357340     1.403805
   Alcohol     Estimate      2.992035     2.982397     2.974996     1.082457
               Std Err       0.201343     0.196014     0.200391     1.768137
   MealsOut    Estimate      0           -2.827608    -1.922421    -2.676694
               Std Err       .            0.384842     0.808967     0.262948

Other Capabilities
PROC PANEL can also do much more:
- Two-way models
- Dynamic-panel models
- Adjustments for serial correlation, heteroscedasticity, and clustering
- Unit root tests
- Model specification tests (e.g. Durbin-Watson)
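As an illustration of the first item, a two-way model adds month effects to the household effects. The sketch below is not from the slides: it assumes the FIXTWO and RANTWO options of the MODEL statement, and the labels TwoWayFE and TwoWayRE are hypothetical names chosen for this example. No output is shown because these models were not run in the presentation.

```sas
proc panel data = Grocery;
   id HouseID Month;
   /* Two-way fixed effects: household and month effects.      */
   /* Time-invariant regressors (Hsize, MealsOut) are omitted. */
   TwoWayFE: model Meat = Govt Rural Alcohol / fixtwo;
   /* Two-way random effects: all regressors can be retained.  */
   TwoWayRE: model Meat = Govt Hsize Rural Alcohol MealsOut / rantwo;
run;
```

The month effects would pick up seasonality in meat spending that is common to all households, which the one-way models leave in the error term.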

Summary
- Panel data offer modeling advantages
- Use PROC PANEL in SAS/ETS for panel data regression
- Many estimators available depending on assumptions
- Correlated individual effects can be problematic, but there are solutions
- New features in SAS/ETS 14.1