Improving Timeliness and Quality of SILC Data through Sampling Design, Weighting and Variance Estimation


Thomas Glaser, Nadja Lamei, Richard Heuberger
Statistics Austria, Directorate Social Statistics
Workshop on best practice for EU-SILC, London, 17 September 2015
www.statistik.at

Prerequisites: EU-SILC in Austria
- EU-SILC: "Community Statistics on Income and Living Conditions"
- Carried out in Austria since 2003; cross-sectional and longitudinal survey since 2004
- Rotating panel design with 4 subsamples and a 4-year panel duration
- First-wave sample 2014: one-stage stratified probability sample from the population register (ZMR)
- Minimum effective sample size (net value): 4,500 households; 5,909 households surveyed in the 2014 cross-section
- Sample survey of private households with voluntary participation
- Unit response rate, 2014 operation: 76% (first wave: 64%, follow-up waves: 84%)

Prerequisites: Indicators
- Europe 2020 indicators on poverty and social exclusion; main indicator: at-risk-of-poverty or social exclusion (AROPE)
- Measurement of the Europe 2020 targets from EU-SILC 2008 onwards
- Main income-based indicator: at-risk-of-poverty rate (AROP)
  - Net household income of the year preceding the survey year
  - Equivalised net household income (EPINC)
  - Measurement period: the year preceding the survey year
  - Households with EPINC below 60% of the national median of EPINC are AROP
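The 60%-of-median rule above can be illustrated with a small sketch. The function names, the toy incomes and the toy weights are hypothetical and not taken from the presentation; the weighted median is computed in one simple way among several possible conventions.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]


def arop_rate(epinc, weights, share=0.6):
    """Weighted share of units with equivalised income below share * median."""
    threshold = share * weighted_median(epinc, weights)
    poor = sum(w for y, w in zip(epinc, weights) if y < threshold)
    return poor / sum(weights), threshold


# Hypothetical toy data: equivalised incomes (EPINC) and survey weights.
epinc = [9000, 13000, 20000, 24000, 30000, 45000]
weights = [100, 150, 200, 200, 150, 100]
rate, threshold = arop_rate(epinc, weights)
print(f"threshold = {threshold:.0f}, AROP rate = {rate:.1%}")
```

In this toy example only the unit with an income of 9000 falls below the threshold, so the rate is its weight share of the total.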

Prerequisites: Income from administrative registers
- Fully implemented register data use started with EU-SILC 2012
- About 85% of total household income comes from administrative register information (remaining income components from the questionnaire)
- Back-calculation of EU-SILC 2008-2011 to ensure an unbroken time series from 2008 onwards (also in the UDB)
- Linkage of register and survey data
  - Encrypted personal identifier (bpk) provided by the Federal Ministry of the Interior
  - Provision of bpk for the gross sample and for persons living in the households
- Eurostat grant agreement on "Improving Methodology on Sampling, Weighting, Imputation and Variance Estimation in the Austrian EU-SILC with regard to Administrative Data" (final results in 2016)

1. Efficiency gains by sampling

Selection of first wave sample
- Type of sampling: one-stage stratified probability sample
- Sampling units: dwellings registered in the central residence register (ZMR)
- Sample size (gross): 3,229 households (EU-SILC 2014)
- Stratification criteria:
  - Interviewer units (geographical units below NUTS2 level)
  - Disproportional allocation per NUTS2 region according to expected response rates (based on the average response of the two preceding years)
- Large discrepancies between population and net sample in the relative distribution of NUTS2 regions would lead to a higher dispersion of the weight adjustment factors
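One way to read the disproportional allocation is that the gross sample is inflated in regions with low expected response so that the expected net sample ends up proportional to the population. The sketch below follows that reading; the function name, the two regions and their response rates are hypothetical, and the presentation does not state the exact allocation formula.

```python
def gross_allocation(pop_share, expected_response, gross_total):
    """Allocate the gross sample proportionally to pop_share / expected_response."""
    raw = {region: pop_share[region] / expected_response[region] for region in pop_share}
    scale = gross_total / sum(raw.values())
    return {region: value * scale for region, value in raw.items()}


# Hypothetical regions with equal population shares but different response rates.
pop_share = {"A": 0.5, "B": 0.5}
expected_response = {"A": 0.8, "B": 0.5}
allocation = gross_allocation(pop_share, expected_response, gross_total=1300)

# Expected net sample per region: gross size times expected response rate.
expected_net = {region: allocation[region] * expected_response[region] for region in allocation}
print(allocation, expected_net)
```

With this allocation the expected net sample is the same size in both regions, matching their equal population shares, which keeps the later weight adjustment factors close together.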

Efficiency gains by sampling
[Figure: Relative distribution of private households by province (NUTS2) for the first wave of EU-SILC 2014, in %. Series: population; net sample with disproportional allocation; hypothetical net sample with proportional allocation. Source: EU-SILC 2014 (unpublished results)]

Efficiency gains by sampling
- Income register data allow an economic evaluation of (almost) every address in the sampling frame
- Addresses can therefore be selected from strata built on percentiles of register-based household income (HINC_REG)
- Optimal allocation (Neyman allocation, cf. Cochran 1977) based on the register income distribution aims at a smaller standard error of AROP
- Since AROP is a component of AROPE, this may also yield a smaller standard error for the main indicator AROPE
- Ongoing work will be finalised in 2016
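As a reminder of what Neyman allocation does, the sketch below allocates a fixed total sample size to strata proportionally to stratum size times stratum standard deviation (the textbook form, assuming equal unit costs). The stratum sizes and standard deviations are hypothetical and do not come from the Austrian income registers.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n_total):
    """Return n_h = n_total * N_h * S_h / sum(N_h * S_h) for each stratum h."""
    products = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n_total * product / total for product in products]


# Hypothetical income strata: (number of addresses, standard deviation of the study variable).
stratum_sizes = [1000, 2000, 1000]
stratum_sds = [10.0, 5.0, 20.0]
allocation = neyman_allocation(stratum_sizes, stratum_sds, n_total=400)
print(allocation)
```

Strata that are large or heterogeneous in the study variable receive more sample; in practice the results would be rounded and bounded by the stratum sizes.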

2. Improving weighting with income register data

Cross-sectional weighting procedure
Three steps of the SILC weighting procedure:
household weight = design weight * nonresponse weight * adjustment weight
1) Design weight: inverse selection probability (S strata)
   $d_s = N_s / n_s$
2) Unit nonresponse weight: the inverse of the estimated response probability $\hat{r}_h$ (based on stepwise logistic regression) is multiplied by the design weight
   $b_h = d_s \cdot 1/\hat{r}_h, \quad s \in \{1, \dots, S\}, \quad h \in \{1, \dots, n^{(r)}\}$
3) Adjustment of weights to external sources: calibration to known marginal distributions
=> Calibrated household weights (first wave): the weights $b_h$ multiplied by the weight adjustment factors
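The first two steps can be sketched numerically as follows. The stratum counts and the response probabilities are hypothetical; in the procedure described above the response probabilities come from a stepwise logistic regression, which is not reproduced here.

```python
def design_weight(stratum_population, stratum_sample):
    """d_s = N_s / n_s: inverse selection probability within a stratum."""
    return stratum_population / stratum_sample


def nonresponse_adjusted_weight(d_s, response_probability):
    """b_h = d_s * 1 / r_h for a responding household h in stratum s."""
    return d_s / response_probability


# Hypothetical stratum: 10,000 dwellings in the frame, 50 selected.
d_s = design_weight(10_000, 50)

# Hypothetical estimated response probabilities for two responding households.
estimated_response = [0.8, 0.5]
base_weights = [nonresponse_adjusted_weight(d_s, r) for r in estimated_response]
print(d_s, base_weights)
```

Households with a low estimated response probability receive a larger weight, because each of them stands in for more nonrespondents with similar characteristics.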

Calibration using marginal distributions from administrative income registers
- Auxiliary information is available for the sampling frame and thus also for the gross sample of the first wave
- Same source for the variables in the sample and in the marginal distributions
- New marginal distributions provided by variables from the wage tax register:
  - Number of persons receiving income from employment (at least 15 years old)
  - Number of persons receiving income from old-age benefits

Cross-sectional weighting procedure: Calibration to external marginal distributions

Household level

Province (NUTS2)
Burgenland          119,482
Carinthia           245,103
Lower Austria       695,689
Upper Austria       606,248
Salzburg            230,563
Styria              524,291
Tyrol               308,263
Vorarlberg          157,532
Vienna              874,619

Household size
1 person          1,391,569
2 persons         1,120,747
3 persons           567,120
4+ persons          682,355

Tenure status
Owner             1,870,325
Not owner         1,891,465

Personal level

Age & sex
Age         Men          Women
0-13          578,094      548,868
14-34       1,116,783    1,090,455
35-64       1,770,060    1,799,043
65+           649,693      850,378

Citizenship (persons aged 16+)
Austria           6,240,604
Not Austria         860,356

Persons aged 15+
Recipients of unemployment benefits      597,568
Employees (at least 60 days)           3,813,936
Retirees                               1,989,331

Source: Statistics Austria, EU-SILC 2014, Micro-census 2014, social security and wage tax register 2013
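Calibration to several marginal distributions can be illustrated with raking (iterative proportional fitting). This is only a sketch under assumptions: the presentation does not say which calibration method is used, and the four households, their base weights and the margins below are hypothetical.

```python
def rake(base_weights, units, margins, iterations=50):
    """Scale weights repeatedly so that weighted counts match each margin in turn."""
    weights = list(base_weights)
    for _ in range(iterations):
        for variable, targets in margins.items():
            # Current weighted count per level of this calibration variable.
            current = {level: 0.0 for level in targets}
            for weight, unit in zip(weights, units):
                current[unit[variable]] += weight
            # Rescale every unit by target / current for its level.
            weights = [
                weight * targets[unit[variable]] / current[unit[variable]]
                for weight, unit in zip(weights, units)
            ]
    return weights


# Hypothetical net sample of four households with nonresponse-adjusted base weights.
units = [
    {"tenure": "owner", "size": "1"},
    {"tenure": "owner", "size": "2+"},
    {"tenure": "not owner", "size": "1"},
    {"tenure": "not owner", "size": "2+"},
]
base_weights = [100.0, 120.0, 80.0, 100.0]
margins = {
    "tenure": {"owner": 300.0, "not owner": 200.0},
    "size": {"1": 250.0, "2+": 250.0},
}

calibrated = rake(base_weights, units, margins)
adjustment_factors = [c / b for c, b in zip(calibrated, base_weights)]
print(calibrated, adjustment_factors)
```

The ratios of calibrated to base weights play the role of the weight adjustment factors; the more the net sample deviates from the margins, the more these factors spread out.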

Unit nonresponse analysis
- Research question: is there a bias in the main income-based indicator (AROP) caused by selective unit nonresponse?
- Household income based only on income registers (HINC_REG) is used as study variable Y for the nonresponse analysis
  - Highly correlated with overall household income (HINC)
- Unit nonresponse rate in the first wave of EU-SILC 2014: 36%
- High rate of persons with potential data in income registers (99%) in EU-SILC 2014

Unit nonresponse analysis: definitions
- Nonresponse
  - All observations are missing for a unit of the selected sample (unit nonresponse)
  - Nonresponse of a unit is a random variable: the occurrence of unit nonresponse has a certain probability (cf. Groves et al. 2004)
- Bias
  - Systematic deviation of the expected value of an estimator $\hat{Y}$ from the true value $Y$ in the population (cf. Särndal 2003)
- Unit nonresponse (UNR) bias
  - Bias caused by unit nonresponse: systematic deviation of the expected value of an estimator $\hat{Y}$ based on the respondent set r from the value $Y$ in the entire gross sample s (cf. Groves 2006, Särndal & Lundström 2005)

Unit nonresponse bias analysis
Comparison with design-weighted estimate

No.   Weight                                      Dataset        mean(HINC_REG)   Estimate of absolute bias   Estimate of relative bias
(1)   d_s                                         gross sample   47566               0                         0
(2)   d_s                                         net sample     49161            1595                         3.35%
(3)   d_s * 1/r_h (version 1)                     net sample     48122             557                         1.17%
(4)   d_s * 1/r_h (version 2)                     net sample     48302             736                         1.55%
(5)   d_s * 1/c_h * 1/r_h(adj) (version 1)        net sample     47912             346                         0.73%
(6)   d_s * 1/c_h * 1/r_h(adj) (version 2)        net sample     48213             648                         1.36%
(7)   Calibration, base weight: (2)               net sample     47316            -250                        -0.53%
(8)   Calibration, base weight: (3)               net sample     46905            -660                        -1.39%
(9)   Calibration, base weight: (5)               net sample     46837            -729                        -1.53%
(10)  Calibration, base weight: (4)               net sample     47140            -425                        -0.89%
(11)  Calibration, base weight: (6)               net sample     46966            -599                        -1.26%
      Calibration, base weight: (4)
      (only persons present in household)         net sample     46923            -642                        -1.35%

r_h: estimated response rate; c_h: estimated contact rate; r_h(adj): estimated adjusted response rate
Source: Statistics Austria, EU-SILC 2014 (unpublished results)

Unit nonresponse bias analysis
Comparison with sampling frame

No.   Weight                                      Dataset          mean(HINC_REG)   Estimate of absolute bias   Estimate of relative bias
      none                                        sampling frame   46948               0                         0
(1)   d_s                                         gross sample     47566             617                         1.31%
(2)   d_s                                         net sample       49161            2213                         4.71%
(3)   d_s * 1/r_h (version 1)                     net sample       48122            1174                         2.50%
(4)   d_s * 1/r_h (version 2)                     net sample       48302            1354                         2.88%
(5)   d_s * 1/c_h * 1/r_h(adj) (version 1)        net sample       47912             963                         2.05%
(6)   d_s * 1/c_h * 1/r_h(adj) (version 2)        net sample       48213            1265                         2.69%
(7)   Calibration, base weight: (2)               net sample       47316             367                         0.78%
(8)   Calibration, base weight: (3)               net sample       46905             -43                        -0.09%
(9)   Calibration, base weight: (5)               net sample       46837            -111                        -0.24%
(10)  Calibration, base weight: (4)               net sample       47140             192                         0.41%
(11)  Calibration, base weight: (6)               net sample       46966              18                         0.04%
      Calibration, base weight: (4)
      (only persons present in household)         net sample       46923             -25                        -0.05%

Source: Statistics Austria, EU-SILC 2014 (unpublished results)
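The bias columns in the two comparisons follow from simple arithmetic: the absolute bias is the estimate minus the reference value, and the relative bias divides that difference by the reference value. The sketch below recomputes row (2), the design-weighted net sample, against both reference values; the function name is hypothetical, the figures are those reported above.

```python
def bias(estimate, reference):
    """Return (absolute bias, relative bias) of an estimate against a reference value."""
    absolute = estimate - reference
    return absolute, absolute / reference


# Row (2): design-weighted mean of HINC_REG in the net sample.
net_sample_mean = 49161

# Reference 1: design-weighted gross sample; reference 2: sampling frame.
abs_gross, rel_gross = bias(net_sample_mean, 47566)
abs_frame, rel_frame = bias(net_sample_mean, 46948)

print(abs_gross, f"{rel_gross:.2%}")
print(abs_frame, f"{rel_frame:.2%}")
```

Some other rows of the tables differ by one unit from this arithmetic, presumably because the published means are rounded.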

Conclusions of unit nonresponse analysis
- Assuming a missing completely at random (MCAR) nonresponse mechanism could lead to substantial bias
- Modelling unit nonresponse in two steps seems to reduce bias slightly
- Calibration seems to reduce bias even if applied directly to the design weights
- Calibrated weights using register income only for persons who were actually in the net sample show results for HINC_REG very similar to the sampling frame

3. Measuring improvement of precision caused by calibration weights

Precision of EU-SILC indicators
- Variance estimation of AROP and AROPE does not include the calibration effect, yielding a conservative estimate
- New precision requirements therefore demand a more precise estimation of the standard error (SE) for estimates $\hat{p}$ of indicators:
  $SE(\hat{p}) < \sqrt{\dfrac{\hat{p}(1-\hat{p})}{a\sqrt{N}+b}}$
- Austrian case for AROPE of EU-SILC 2014 ($\hat{p} = 19.2\%$):
  $SE(\hat{p}) = 0.7011\% > \sqrt{\dfrac{\hat{p}(1-\hat{p})}{a\sqrt{N}+b}} = 0.5975\%$
- Taking into account calibration variables correlated with AROP may reduce the standard error (cf. Berger/Skinner 2003, Deville 1999, Deville/Särndal 1992)
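The threshold in the Austrian case can be recomputed as a plausibility check. The slide gives neither a, b nor N; the values a = 900, b = 2600 and N = 3.762 (private households in millions, the sum of the household-level calibration margins) are assumptions chosen here because they reproduce the reported 0.5975%. The function name is hypothetical.

```python
import math


def se_threshold(p, n_households_millions, a, b):
    """Upper bound sqrt(p * (1 - p) / (a * sqrt(N) + b)) for the standard error."""
    return math.sqrt(p * (1.0 - p) / (a * math.sqrt(n_households_millions) + b))


# Assumed parameters (not stated in the presentation): a, b and N in millions.
threshold = se_threshold(p=0.192, n_households_millions=3.762, a=900, b=2600)

# Reported standard error of AROPE without the calibration effect: 0.7011%.
reported_se = 0.007011
meets_requirement = reported_se < threshold
print(f"threshold = {threshold:.4%}, requirement met: {meets_requirement}")
```

Under these assumptions the conservative standard error of 0.7011% lies above the threshold, which is the motivation for accounting for calibration in the variance estimation.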

Precision of EU-SILC indicators: Evaluation
- Instead of AROP, the residuals $\varepsilon_n$ of the regression of AROP on the K calibration variables $x_k$ are used (cf. Eurostat 2013, p. 13):
  $\widehat{AROP}_n = \sum_{k=1}^{K} \hat{\beta}_k x_{nk}$
  $\varepsilon_n = AROP_n - \widehat{AROP}_n$
- Results show a reduction of the estimated standard error

Standard error of AROP on person level in %

                                                               dichotomous   linearized   bootstrapping
                                                               indicator     variable     (1000 resamples)
Indicator without calibration effect                           0.6262        0.4809       0.6195
Residuals including calibration effect                         0.6005        0.4662       0.6061
Change of standard error (in %) caused by calibration effect   -4.1          -3.1         -2.2

Source: EU-SILC 2014 (unpublished results)
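The residual idea can be sketched for the simplest case in which the calibration variables are dummies for disjoint groups, so that the weighted regression fit of each person is just the weighted group mean. Everything below is a toy illustration under assumptions: the data and groups are hypothetical, the design is treated as a single-stratum with-replacement sample, and the population size is treated as known; it is not the estimator used for the figures above.

```python
import math


def se_weighted_total(z, w):
    """With-replacement standard error of sum(w * z) for a single stratum (assumed design)."""
    n = len(z)
    wz = [wi * zi for wi, zi in zip(w, z)]
    mean = sum(wz) / n
    return math.sqrt(n / (n - 1) * sum((v - mean) ** 2 for v in wz))


def group_residuals(y, w, groups):
    """Residuals of a weighted regression of y on group dummies (fit = weighted group mean)."""
    numerator, denominator = {}, {}
    for yi, wi, g in zip(y, w, groups):
        numerator[g] = numerator.get(g, 0.0) + wi * yi
        denominator[g] = denominator.get(g, 0.0) + wi
    return [yi - numerator[g] / denominator[g] for yi, g in zip(y, groups)]


# Hypothetical toy data: AROP indicator, weights and one calibration grouping.
arop = [1, 1, 1, 0, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
weights = [10.0] * 8
population = sum(weights)  # treated as known for this sketch

se_indicator = se_weighted_total(arop, weights) / population
se_residuals = se_weighted_total(group_residuals(arop, weights, groups), weights) / population
print(f"indicator: {se_indicator:.4f}, residuals: {se_residuals:.4f}")
```

Because the grouping explains part of the variation in the indicator, the residuals vary less than the indicator itself and the estimated standard error drops, which mirrors the direction of the changes in the table above.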

Precision of EU-SILC indicators: Evaluation
- Using the residuals $AROPE_n - \widehat{AROPE}_n$ instead of AROPE shows a reduction of the estimated standard error by 14.7%

Standard error of AROPE on person level in %

                                                               dichotomous indicator
Indicator without calibration effect                           0.7011
Residuals including calibration effect                         0.5981
Change of standard error (in %) caused by calibration effect   -14.7

Source: EU-SILC 2014 (unpublished results)

- Incorporating calibration in variance estimation yields a slightly smaller estimated standard error and could make it possible to meet the new precision requirements

Concluding remarks
- The availability of income register data opens up various opportunities for improving the quality of indicators through sampling, weighting and variance estimation
- For sampling, optimal allocation may reduce the standard error
- For weighting, marginal distributions from income registers may reduce bias
- For variance estimation, using the residuals from the regression of the indicators on the calibration variables (including variables from income registers) instead of the point estimators makes it possible to measure this improvement in efficiency correctly in order to meet the precision requirements

Bibliography
- Berger, Y. G.; Skinner, C. J. (2003): Variance Estimation for a low income proportion. Applied Statistics 52, Part 4, 457-468.
- Cochran, W. G. (1977): Sampling Techniques. New York. Wiley.
- Deville, J. C. (1999): Variance Estimation for complex statistics and estimators: linearization and residual techniques. Survey Methodology 25, 193-203.
- Deville, J. C.; Särndal, C.-E. (1992): Calibration estimators in Survey Sampling.
- Eurostat (2013): Standard error estimation for the EU-SILC indicators of poverty and social exclusion. Eurostat Statistical Working Papers. Luxembourg.
- Groves, Robert M.; Fowler Jr., Floyd J.; Couper, Mick P.; Lepkowski, James M.; Singer, Eleanor; Tourangeau, Roger (2004): Survey Methodology. Hoboken. Wiley.
- Groves, Robert M. (2006): Nonresponse Rates and Nonresponse Bias in Household Surveys. Public Opinion Quarterly 70 (5, Special Issue), 646-675. DOI: 10.1093/poq/nfl033.
- Särndal, C.-E.; Lundström, S. (2005): Estimation in Surveys with Nonresponse. West-Sussex. Wiley.

Thank you for your attention!
Any questions or comments?
Contact: Guglgasse 13, 1110 Wien
Tel: +43 (1) 71128-7039
thomas.glaser@statistik.gv.at
http://www.statistik.at