Errors in Survey Reporting and Imputation and their Effects on Estimates of Food Stamp Program Participation

ITSEW, June 3, 2013
Bruce D. Meyer, University of Chicago and NBER
Robert Goerge, Chapin Hall
Nikolas Mittag, University of Chicago
Supported by Census and ERS of USDA.

Declining Quality of Survey Data
- Both unit and item nonresponse have been rising in most surveys.
- The error in responses conditional on obtaining one (measurement error) has risen.
- Both have been shown to bias common analyses and violate the assumptions of corrections (CME and MAR).
- We don't really know why: declining public spirit, people are over-surveyed?
- These patterns have implications for much that is done by the statistical agencies, other government agencies and outside empirical researchers, and for public policy.

Past Work: Food Stamps and Underreporting
- Food Stamps/SNAP expanding rapidly.
- Aggregate data indicate high rates of underreporting.
- Most previous studies of the impact of the FSP on poverty, inequality, etc. have not addressed underreporting. Jolliffe et al. (2005) is a partial exception.
- Some work incorporates underreporting in takeup analyses: Bollinger and David (1997, 2001).

Matched Microdata Analyses
- We match administrative microdata for food stamps in IL and MD to ACS, CPS and SIPP.
- The approach eliminates some worries about aggregate comparisons:
  - unit nonresponse bias that weighting doesn't solve
  - universe differences
  - overreporting may offset some underreporting
- Allows us to examine how microdata analyses using program receipt might be biased.
- Can examine how errors vary by observables and the reasons for errors (haven't focused on this yet).

Administrative Food Stamp Data
- IL and MD food stamp data.
- Contain monthly indicators of receipt.
- Data matched using a Protected Identification Key or PIK (transformation of SSN).
- Food stamp data are supposed to have verified SSNs for all those in the assistance unit. The SSNs are converted to PIKs for 96.4 percent of all records in IL, 97.8 percent in MD.

ACS Data
- 2001 SS01 (ACS).
- Census Bureau uses name, address, DOB to create PIKs. Successful for at least one member of 92.7 percent of households in IL, 94.9 percent of households in MD.
- PIKs not missing at random. We multiply the survey weights by the inverse of the probability of having a PIK.

CPS Data
- CPS ASEC
  - 2002-5 in IL
  - 2002-4 in MD
- PIK rates lower in CPS than ACS: 68 percent for IL, 81 percent for MD.
- PIKs not missing at random. Survey weights multiplied by inverse of probability of having a PIK.

SIPP Data
- SIPP 2001 Panel
  - Late 2001-2003 in IL
  - Late 2001-2003 in MD
- SIPP 2004 Panel
  - 2004 and part of 2005 in IL
- PIK rates low in 2001 panel, then rise in 2004 panel when survey moves to passive consent.
- PIKs not missing at random. Survey weights multiplied by inverse of probability of having a PIK.
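
The PIK adjustment used for all three surveys can be sketched as a simple inverse-probability reweighting. The slides do not say how the probability of having a PIK is estimated, so the sketch below assumes it is already available for each household (for example from a model on observables); the field names `weight`, `pik_prob` and `has_pik` and the toy households are hypothetical.

```python
def pik_adjusted_weights(records):
    """Scale each linked household's survey weight by 1 / P(household has a PIK).

    Assumption: `pik_prob` is an estimated linkage probability supplied by the caller.
    Unlinked households get weight 0 because they drop out of the matched sample.
    """
    adjusted = []
    for rec in records:
        p = rec["pik_prob"]
        if not 0.0 < p <= 1.0:
            raise ValueError("PIK probability must be in (0, 1]")
        adjusted.append(rec["weight"] / p if rec["has_pik"] else 0.0)
    return adjusted


# Hypothetical toy households, not data from the study.
households = [
    {"weight": 100.0, "pik_prob": 0.9, "has_pik": True},
    {"weight": 120.0, "pik_prob": 0.6, "has_pik": True},
    {"weight": 80.0, "pik_prob": 0.6, "has_pik": False},
]
weights = pik_adjusted_weights(households)
```

Households that are hard to link (low `pik_prob`) but were linked anyway count for more, which is how the reweighting tries to make the linked sample resemble the full sample.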

Implications of transfer misreporting
- Misreporting has important effects.
- If transfers are under-reported as aggregate data suggest:
  - the income distribution appears worse,
  - the effect of transfers in improving the distribution is understated,
  - program takeup is biased downward, and
  - analyses of other program effects are biased.
- Here, we will see that estimates of the determinants of program receipt are biased.

Reporting Errors
- ACS
  - False negative rates: 32% in IL, 37% in MD.
  - False positive rates: 0.8% in IL, 0.5% in MD.
  - Net understatement: 23% in IL, 29% in MD.
- CPS
  - False negative rates: 48% in IL, 53% in MD.
  - False positive rates: 1.0% in IL, 0.4% in MD.
  - Net understatement: 39% in IL, 46% in MD.
  - IL error rates higher in last year, MD much higher in last year.

Reporting Errors
- SIPP
  - False negative rate: 23% in combined IL and MD data.
  - False positive rate: 1.6% in combined data.
  - Small net overstatement of food stamp receipt?
- For each of the surveys, the samples include imputed observations. Informative on biases in substantive analyses that usually use imputed data, but maybe not the best sample to determine reasons for mis-reporting.
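
As a worked check on these rates, the sketch below recomputes the Illinois ACS figures from the weighted population estimates in Table 1. The function name `error_rates` is hypothetical, and the definition of net understatement used here (one minus survey-reported recipients over administrative recipients) is an assumption chosen because it reproduces the reported 23 percent; the slides do not spell the formula out.

```python
def error_rates(cells):
    """cells[admin][survey] holds weighted household counts; 0 = no receipt, 1 = receipt."""
    admin_no = cells[0][0] + cells[0][1]
    admin_yes = cells[1][0] + cells[1][1]
    survey_yes = cells[0][1] + cells[1][1]
    return {
        # Administrative recipients who report no receipt in the survey.
        "false_negative": cells[1][0] / admin_yes,
        # Administrative non-recipients who report receipt in the survey.
        "false_positive": cells[0][1] / admin_no,
        # Assumed definition: shortfall of survey-reported recipients relative to admin recipients.
        "net_understatement": 1.0 - survey_yes / admin_yes,
    }


# Weighted population estimates from Table 1, Illinois, 2001 ACS.
illinois_acs = [[4_193_387, 34_883], [118_834, 253_289]]
rates = error_rates(illinois_acs)
```

With these inputs the false negative rate is about 0.32, the false positive rate about 0.008, and the net understatement about 0.23, in line with the figures on the slide.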

Results: error determinants in ACS
- Probits conditioning on administrative receipt status.
- False negatives more common for
  - older households
  - non-whites
  - higher income households
  - those with fewer FSP months received
  - those without reported PA receipt
  - the urban
  - those not imputed
  - in IL, if male and more educated
  - in MD, if unemployed
- Many other explanatory variables examined: language, CATI, CAPI, etc.
- False positives also vary with characteristics.

Results: participation determinants
- Two approaches:
  - Just survey data (standard approach).
  - Survey and administrative data combined. Use administrative dependent variable.
- We estimate probit models using the two approaches.
- We compare the coefficients and the average derivatives estimated from the two approaches.

Does it matter?
- Test statistics always reject that the survey data and the combined data give you the same answer.
- A more important question is whether the results are substantively different.

ACS substantive differences
- If you follow the standard approach and use only survey data, you would sharply understate participation by
  - single parents and non-whites in both states,
  - older households, native speakers, and those with small families in IL, and
  - those with low incomes in MD.
- In the ACS you would also get the patterns of multiple program participation wrong, but the errors are multidimensional and differ across states.

Half Empty or Half Full: Half Full
- One might wonder if the ACS and CPS provide useful information on food stamps given false negative rates of one-third or one-half. Even with lower error rates in the SIPP you might worry.
- The information that one learns about receipt, though, is very similar using the administrative data. Almost all signs are unchanged, and statistical significance is mostly the same.
- This result is likely to be analysis specific.
- Other analyses are affected more severely, e.g. we find substantive differences in analyses of distributional consequences using data from NY.

Imputation Summary
- Imputation rates are high and rising. They are typically over 20 percent, but are often quite a bit higher for certain years and surveys.
- Dollar and month imputation are similar.
- Recipiency (yes/no) often imputed, generally responsible for about 10 percent of dollars.
- Imputation higher in SIPP (less true for FSP months) than CPS, so narrows the data quality difference between them somewhat.

Nonresponse: Conditionally Random?
- Test missing conditionally at random using the SIPP.
- First, estimate take-up model for respondents and nonrespondents separately using administrative receipt.
- χ² test rejects coefficient equality (p-value = 0.00001).
- Predict probability of receipt for nonrespondents using take-up model of respondents.
- Regress actual receipt on the predicted probability for nonrespondents.
- If MAR holds (for the administrative measure), this should yield a 45-degree line.
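
The last two steps of this check can be illustrated with a small simulation. This is only a sketch on made-up data: the predicted probabilities are drawn at random rather than taken from a fitted take-up model, and the `scale` parameter is a hypothetical device for breaking MAR. Under MAR the regression of actual receipt on predicted probability should give an intercept near 0 and a slope near 1.

```python
import random


def ols_line(x, y):
    """Return (intercept, slope) from a one-regressor least squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(scale, n=50_000, seed=0):
    """Simulated nonrespondents whose true receipt probability is scale * predicted.

    scale = 1.0 mimics MAR (the respondent model predicts nonrespondents correctly);
    any other value is a hypothetical violation.
    """
    rng = random.Random(seed)
    predicted = [rng.uniform(0.05, 0.95) for _ in range(n)]
    actual = [1.0 if rng.random() < scale * p else 0.0 for p in predicted]
    return ols_line(predicted, actual)


intercept_mar, slope_mar = simulate(scale=1.0)   # close to the 45-degree line
intercept_bad, slope_bad = simulate(scale=0.5)   # slope well below 1
```

A fitted line that departs from the 45-degree line, as in the second case, is evidence against MAR for the administrative measure.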

Nonresponse: Conditionally Random?

Should one use the imputed observations?
- Many researchers drop imputed observations. Should they?
- We have a measure of truth, so we should be able to answer this question.
- Are we closer to the combined participation estimates when the survey-only estimates use or do not use the imputed observations?

Results on imputations
- Construct chi-square statistic for the difference between the combined (admin dependent variable) and survey-only estimates.
- For the ACS, better off including the imputed observations.
- For the CPS, not very different whether you include or exclude them.
- For the SIPP, much better off including the imputed observations.

Conclusions
- Error rates high for food stamp reports in surveys.
- Errors matter for estimates of the determinants of program participation, but maybe not as much as one might have thought.
- Survey and administrative data can be usefully combined.
- Results supportive of aggregate analyses of errors being meaningful.
- Mixed support for assumptions of some correction methods that do not rely on matched data.
- Matched data can be used to examine imputations. Underlying assumptions violated. Imputed values improve estimates in some cases.
- Because survey data are important for so many policy questions, the general issue of declining survey quality has many implications that we have just begun to work on.

Imperfect Linking and Biases
- Partly PIKed households (14% in ACS, approx. 20% in CPS); state movers follow the same argument.
  Let the 2 x 2 matrix of row probabilities be:

                 Survey = 0   Survey = 1
    Admin = 0    p_00         p_01
    Admin = 1    p_10         p_11

  Row probabilities sum to 1; 0 = don't receive, 1 = receive.
  Let p_1 be the probability of reporting receipt for people affected (moved into the first row) by this issue. Let p be the matrix for those unaffected.
  Then, if p_11 > p_1 > p_01, false negatives are biased down and false positives are biased up.
- Outright PIKing errors (when information is wrong) have a different bias. Could lead to overstatement of false negatives.
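
The direction of the bias from partly PIKed households can be made explicit with a short derivation. The mixing shares \mu and \lambda are notation introduced here for illustration, not symbols from the slides, and the sketch assumes that affected households are true recipients who end up classified as administrative non-recipients.

```latex
% Hypothetical notation (not from the slides):
%   \mu     = share of true recipient households that are affected (moved to the Admin = 0 row)
%   \lambda = share of the measured Admin = 0 row made up of affected households
\begin{align*}
\text{true false negative rate}     &= 1 - \bigl[(1-\mu)\,p_{11} + \mu\,p_{1}\bigr] \\
\text{measured false negative rate} &= 1 - p_{11} \\
\text{true false positive rate}     &= p_{01} \\
\text{measured false positive rate} &= (1-\lambda)\,p_{01} + \lambda\,p_{1}
\end{align*}
% If p_{11} > p_{1}, then (1-\mu) p_{11} + \mu p_{1} < p_{11}, so the measured
% false negative rate lies below the true one (biased down).
% If p_{1} > p_{01}, then (1-\lambda) p_{01} + \lambda p_{1} > p_{01}, so the measured
% false positive rate lies above the true one (biased up).
```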

Table 1 Mis-reporting of Food Stamp Receipt, 2001 ACS, Full Sample
Illinois
                              ACS report      ACS report
Administrative receipt    No Food Stamps     Food Stamps           Total
No Food Stamps
  Sample count                    19,630              88          19,718
  Population estimate          4,193,387          34,883       4,228,270
  Overall %                        91.15            0.76           91.91
  Row %                            99.18            0.83          100.00
  Column %                         97.24           12.10           91.91
Food Stamps
  Sample count                       321             728           1,049
  Population estimate            118,834         253,289         372,123
  Overall %                         2.58            5.51            8.09
  Row %                            31.93           68.07          100.00
  Column %                          2.76           87.90            8.09
Total
  Sample count                    19,951             816          20,767
  Population estimate          4,312,222         288,172       4,600,393
  Overall %                        93.74            6.26          100.00
  Row %                            93.74            6.26          100.00
  Column %                        100.00          100.00          100.00
Notes: The entries in each cell from top to bottom are sample count, population estimate, overall %, row %, column %. Estimates are weighted by household weight adjusted for PIK probability.

Table 1 Mis-reporting of Food Stamp Receipt, 2001 ACS, Full Sample
Maryland
                              ACS report      ACS report
Administrative receipt    No Food Stamps     Food Stamps           Total
No Food Stamps
  Sample count                     9,042              33           9,075
  Population estimate          1,880,871           9,615       1,890,485
  Overall %                        93.39            0.48           93.86
  Row %                            99.49            0.51          100.00
  Column %                         97.66           10.92           93.86
Food Stamps
  Sample count                       163             296             459
  Population estimate             45,121          78,454         123,574
  Overall %                         2.24            3.90            6.14
  Row %                            36.51           63.49          100.00
  Column %                          2.34           89.08            6.14
Total
  Sample count                     9,205             329           9,534
  Population estimate          1,925,991          88,069       2,014,060
  Overall %                        95.63            4.37          100.00
  Row %                            95.63            4.37          100.00
  Column %                        100.00          100.00          100.00
Notes: The entries in each cell from top to bottom are sample count, population estimate, overall %, row %, column %. Estimates are weighted by household weight adjusted for PIK probability.

Table 3 Mis-reporting of Food Stamp Receipt, CPS, Full Sample
Illinois 2002-2005
                              CPS report      CPS report
Administrative receipt    No Food Stamps     Food Stamps           Total
No Food Stamps
  Sample count                     6,836              78           6,914
  Population estimate         17,267,477         170,642      17,438,119
  Overall %                        89.32            0.88           90.21
  Row %                            99.02            0.98          100.00
  Column %                         94.98           14.84           90.21
Food Stamps
  Sample count                       452             459             911
  Population estimate            912,736         980,703       1,918,714
  Overall %                         4.72            5.07            9.80
  Row %                            48.21           51.79          100.00
  Column %                          5.02           85.18            9.80
Total
  Sample count                     7,288             537           7,825
  Population estimate         18,180,213       1,151,345      19,331,558
  Overall %                        94.04            5.96          100.00
  Row %                            94.04            5.96          100.00
  Column %                        100.00          100.00          100.00
Notes: The entries in each cell from top to bottom are sample count, population estimate, overall %, row %, column %. Estimates are weighted by household weight adjusted for PIK probability.

Table 3: Misreporting of Food Stamp Receipt, SIPP, Full Sample
                             SIPP report     SIPP report
Administrative receipt    No Food Stamps     Food Stamps           Total
No Food Stamps
  Population estimate         54,731,963         912,735      55,644,698
  Overall share                    0.925           0.015           0.941
  Row share                        0.984           0.016           1.000
  Column share                     0.986           0.251           0.941
  Sample count                     9,973             189          10,162
Food Stamps
  Population estimate            803,748       2,718,842       3,522,590
  Overall share                    0.014           0.046           0.060
  Row share                        0.228           0.772           1.000
  Column share                     0.014           0.749           0.060
  Sample count                       165             628             793
Total
  Population estimate         55,535,712       3,631,577      59,167,288
  Overall share                    0.939           0.061           1.000
  Row share                        0.939           0.061           1.000
  Column share                     1.000           1.000           1.000
  Sample count                    10,138             817          10,955
Notes: The entries in each cell from top to bottom are population estimate, overall share, row share, column share, and sample count. Estimates are weighted by household weight adjusted for PIK probability.

Results: error determinants in CPS
- False negatives more common for
  - older households in IL (the reverse in MD)
  - higher income households (IL)
  - those with fewer FSP months received
  - those without reported PA or housing benefit receipt
  - those with true TANF receipt
  - those imputed
  - those surveyed in most recent years (strong in MD)
- Smaller samples in CPS mean less precision.
- Many other determinants examined.
- False positives also vary with characteristics.

Results: errors in the SIPP
- False negatives more common for
  - higher income households
  - non-white households
  - those with fewer FSP months received
  - those with longer time since received
  - certain family types
  - those who do not report TANF receipt or housing assistance receipt (not conditional on admin receipt)
- Smaller samples in the SIPP mean less precision.
- Many other determinants examined.
- False positives also vary with characteristics.

Table 7 Food Stamp Receipt in Survey Data and Combined Data, 2001 Illinois ACS, Probit Average Derivatives, Households with Income less than Twice the Poverty Line
                                      Survey data,     Survey data,     Combined data  Equality test  Equality test
                                      with imputed     without imputed                 p-value,       p-value,
Variable                                                                               with imputed   without imputed
Single, no children                   0.0670           0.0694           0.1164         0.0901         0.1051
Single, with children                 0.1076           0.0991           0.1429         0.0941         0.0424
                                      (0.0247)         (0.0252)         (0.0272)
Number of members under 18            0.0188           0.0130           -0.0066        0.0420         0.1415
                                      (0.0099)         (0.0100)         (0.0145)
Number of members 18 or older         0.0027           0.0026           -0.0201        0.0562         0.0529
                                      (0.0111)         (0.0106)         (0.0138)
Number of members PIKed               0.0145           0.0148           0.0692         0.0000         0.0000
                                      (0.0076)         (0.0078)         (0.0131)
Age 50-59                             -0.0981          -0.0943          -0.0405        0.0245         0.0440
                                      (0.0261)         (0.0256)         (0.0294)
Age 60-69                             -0.1144          -0.1005          -0.0806        0.2454         0.5427
                                      (0.0278)         (0.0272)         (0.0320)
Age >= 70                             -0.1641          -0.1407          -0.1619        0.9656         0.3037
                                      (0.0313)         (0.0307)         (0.0329)
White                                 -0.0380          -0.0418          -0.0801        0.0053         0.0153
                                      (0.0178)         (0.0178)         (0.0191)
Poverty index                         -0.0007          -0.0007          -0.0007        0.5801         0.8840
                                      (0.0001)         (0.0001)         (0.0001)
Disabled                              0.0906           0.0817           0.0774         0.4844         0.9183
Reported public assistance receipt    0.3189           0.2970           0.2386         0.0197         0.0969
                                      (0.0240)         (0.0240)         (0.0315)
Reported housing assistance receipt   0.1461           0.1322           0.1811         0.0457         0.0068
                                      (0.0184)         (0.0180)         (0.0217)
Observations                          4,591            4,379            4,146
Joint significance test p-value                                                        0.0000         0.0000

CPS substantive differences
- If you follow the standard approach and use only survey data, you would sharply understate participation by single parents, non-whites, and those with low incomes in IL, and those with young children in MD.
- Many other CPS differences are substantial, but not significant or only weakly so.
- In the CPS, you would get the time trend badly wrong, i.e. you would miss that participation is increasing over time.

SIPP substantive differences
- If you follow the standard approach and use only survey data, you would understate food stamp participation by households with few adults, those not 30-39, nonwhites, those not employed, the disabled, and those not reporting TANF receipt.
- Strongly reject the model of receipt determinants that uses only survey data.

Hot Deck Imputation Methods
- Match observations with missing data to a donor observation.
- ACS: HHs (not in group quarters) put in 20 cells defined by full interactions of
  - family type
  - presence of children
  - poverty status
  - race of reference person
- Done by state and lowest level of geography available.
- CPS: 648 cells, but at national level.
- SIPP: haven't investigated methods yet.
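
The basic idea of cell-based hot deck imputation can be sketched in a few lines. This is a generic illustration, not the Census Bureau's procedure: the donor is drawn at random within the cell (production hot decks typically use other donor-selection rules and geographic sorting), and the function name `hot_deck`, the field names, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def hot_deck(records, cell_keys, target, seed=0):
    """Fill missing `target` values with a value from a random donor in the same cell.

    Assumption: a cell is the tuple of values of `cell_keys`; records with no donor
    in their cell are left missing.
    """
    rng = random.Random(seed)
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in cell_keys)].append(rec[target])

    completed = []
    for rec in records:
        new = dict(rec, imputed=False)
        if new[target] is None:
            pool = donors.get(tuple(rec[k] for k in cell_keys))
            if pool:
                new[target] = rng.choice(pool)
                new["imputed"] = True
        completed.append(new)
    return completed


# Hypothetical toy records; cells use the four ACS dimensions named on the slide.
records = [
    {"family_type": "single", "children": True, "poor": True, "race": "white", "food_stamps": 1},
    {"family_type": "single", "children": True, "poor": True, "race": "white", "food_stamps": None},
    {"family_type": "married", "children": False, "poor": False, "race": "black", "food_stamps": 0},
    {"family_type": "married", "children": False, "poor": False, "race": "black", "food_stamps": None},
]
completed = hot_deck(records, ["family_type", "children", "poor", "race"], "food_stamps")
```

Keeping an `imputed` flag on each record makes it easy to run analyses with and without the imputed observations, which is the comparison the imputation slides report.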