Use of Administrative Data in the Italian quarterly OROS survey


Fabio Massimo Rapiti
Short-Term Statistics on Employment and Labour Incomes, Central Directorate for Short-Term Business Statistics, Istat
OECD STESEG Meeting, Paris, 27-28 June 2005

Outline of presentation
1. Italian background and main characteristics of the OROS survey;
2. why administrative data;
3. the INPS data: content and timing;
4. the treatment of data (retrieving variables; check; editing);
5. estimation methodology;
6. some recent changes;
7. final remarks.

Use of administrative data at Istat
Compared to other NSOs, Istat is a latecomer as a user of administrative data (no tradition, no trust). Recently Istat has taken some steps towards the use of administrative sources. Two examples:
- the Business Register ASIA (Archivio Statistico delle Imprese Attive, statistical archive of active businesses), built from six administrative data sources;
- annual business accounting data (Financial statements register) captured by the Chambers of Commerce or other intermediaries.

Main characteristics of the OROS Survey
- The OROS short-term survey was designed to fill a crucial gap in Italian statistics and to meet EU Regulations (STS; LCI, the Labour Cost Index);
- OROS stands for Occupazione (Employment), Retribuzioni (Wages), Oneri Sociali (Other labour cost);
- the aim is to produce quarterly information on the evolution (and levels) of gross wages, other labour cost and employment;
- the OROS survey uses administrative data (INPS, the National Social Security Institute) for Small and Medium Enterprises (SME);
- the SME estimate from administrative data is combined with the data from the Istat Large Enterprises (LE) monthly census survey (more than 500 employees).
Every quarter two new estimates are released: the preliminary estimate, based on a non-random sample of INPS data, with a delay of about 75 days from the reference quarter, and a revised estimate, called final, based on the total population of INPS data, with a delay of 15 months from the reference quarter.

Development of the OROS project
Year: Activity
1999: start of the project
2000-01: design and development of survey method and procedures
2002: first preliminary release (100-90 days delay) of three OROS indicators at national level: wage, other labour costs, total labour cost per FTE unit
2003: regular release of OROS indicators (90-80 days delay)
2004: EU STS Regulation delivery to Eurostat
2005: EU LCI Regulation (75-70 days delay); in autumn, release of the OROS index of number of jobs
Only after using the data for 3 years did we learn how to cope with the more peculiar and subtle shortcomings of the administrative data.

In the past, short-term statistics for employment, wages, labour costs and hours worked were based on a monthly business survey covering firms with more than 500 employees (accounting for 23% of total wage employment).
EU short-term statistics requirements (late 90s)
STS Regulation:
- variables: employment; gross wages; hours worked;
- coverage of all enterprises with employees;
- C to I + K (two digits Nace rev.1) for employment, C to F for wages and hours worked;
- 90 days (70 or less in future).
LCI (Labour cost index):
- variables: hourly gross wages; hourly other labour cost; hourly total labour cost;
- coverage of all enterprises with employees;
- C to K (sections Nace rev.1);
- 70 days.

Why administrative data? (1)
Given the Italian structure of firms:
- more than 1.1 million firms with employees, mainly small or very small;
- almost 40% of total wage employment in firms with fewer than 20 employees;
- very high firm turnover (births and deaths).
A traditional sample survey would have been too large and too costly. Only by using administrative data could Istat:
- meet the requirements of the EU short-term regulations (coverage, quality, timeliness);
- without enormously increasing the statistical burden on firms.

Why administrative data? (3)
The availability in electronic form of a large mass of INPS data has prompted Istat to shift its strategy from the typical "one collection for one single survey" approach to a focus on the data source (the wage and contribution system), which can be used for many statistical objectives.

OROS database, used for:
- short-term economic statistics (STS, LCI);
- other economic and social statistics (Non Pension Cash Benefits, etc.);
- input for annual economic statistics (SBS), National Accounts, etc. (also for editing and imputation);
- Satellite Register on employment (ASO).

The INPS (National Social Security Institute): two archives
All Italian non-agricultural firms in the private sector with at least one employee (roughly 10 million employees and 1.3 million employers per year) have to pay social security contributions to INPS.
- INPS register: firm identification code (as administrative entity), fiscal code, name, address, legal form, dates of registration and cancellation, INPS industry code, etc.
- Employers' monthly declaration (DM10 form): identification number of the firm, total monthly employment and the associated wage bill, paid days, overtime hours, social contributions, etc.

DM10 forms in electronic mode
[Figure: time series of the number of DM10 forms transmitted in electronic mode (the non-random sample), monthly from January 2000 to May 2003; vertical axis from 150,000 to 1,150,000.]
- DM10 forms were delivered to INPS in different modes (electronic and non-electronic);
- more and more firms used the electronic mode (from 2001, the Internet);
- DM10 forms are usually available at the local INPS offices 30 days after the reference month;
- INPS collects all the electronic forms in a special file and transmits them to Istat about 45 days after the end of the reference quarter.

DM10 forms in electronic mode (2): the (non-random) sample
The sample is, obviously, non-random, but:
- it is extremely large (about 1 million units);
- it covers all firm sizes, economic activities and geographical areas;
- it represents new births;
- once firms enter the sample they normally do not exit (they do not change delivery mode).
Istat uses it to produce a preliminary estimate of the current quarter t.

Total population of DM10: the universe
Because of delays in the delivery and registration of the non-electronic DM10s, INPS transmits to Istat the complete information on the whole firm population for month t (1.3 million units per month) only after 13-14 months.
Istat uses it:
- as auxiliary information (referring to t-4) to improve the preliminary estimate of the current quarter t;
- to produce a final (census) estimate of quarter t-5.

Procedures Flow
Every quarter there are two parallel processes: the preliminary estimate for quarter t and the final estimate for quarter t-5.
Steps shown in the flow chart:
- raw data;
- retrieval of statistical variables and preliminary checks;
- cross-sectional and longitudinal micro-editing on monthly data;
- preliminary and census estimates for the SMEs;
- imputation of unit non-response (only for the census estimate);
- month-to-quarter aggregation, economic activity coding (link to the BR);
- macro-data check, identification of the series probably affected by outliers (Tramo-Seats);
- selective editing on the micro-data;
- integration of the SME and LE estimates;
- final results.

Retrieving the statistical variables
Translating the administrative data into the required statistical variables involves complex computation. There are many items in the DM10 form and more than 400 codes have to be identified (related to type of employment and associated wage bills, contributions, credit terms, etc.). We had to build a metadata database of codes and keep it up to date to account for new codes and the suppression of old ones. Unfortunately, Italian social security rules change continuously.
Survey variable: Estimation
Gross wages: DM10 gross wage (obtained by aggregating, within the single form, the wages related to different types of employment)
Number of jobs: DM10 jobs (obtained by aggregating, within the single form, the different types of employment)
Other labour cost: DM10 total debit of the unit - (DM10 wage * worker social security rate) + estimation of other labour costs (INAIL, TFR)

Retrieving the statistical variables (2)
Retrieving Other Labour Cost (OLC) has not been an easy task; there are at least two kinds of problems:
1) from the DM10 we can get only the total (employer + employee) social contributions due to INPS: we have to identify the employee contributions, already included in the gross wages, according to the legal rates;
2) only a part (though the largest one) of the OLC is recorded in the DM10: we need to impute the remaining labour costs (e.g. employers' injury insurance premiums, INAIL; severance payment, TFR).
We had to build a metadata database of the legal rates and keep it up to date.
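The two adjustments just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name and all rates are hypothetical, and imputing INAIL and TFR as a fixed share of gross wages is a simplification, not the OROS procedure.

```python
from dataclasses import dataclass


@dataclass
class DM10Record:
    """Hypothetical, simplified view of one monthly DM10 declaration."""
    gross_wages: float          # wage bill, gross of employee contributions
    total_contributions: float  # employer + employee contributions due to INPS


def other_labour_cost(rec, employee_rate, inail_rate, tfr_rate):
    """Estimate other labour cost (OLC) for one declaration.

    All rates are illustrative parameters, not actual Italian legal rates.
    """
    # The employee share is already part of gross wages, so remove it.
    employee_share = rec.gross_wages * employee_rate
    employer_contributions = rec.total_contributions - employee_share
    # Components not recorded in the DM10 are imputed (assumption: as a
    # fixed proportion of gross wages).
    imputed = rec.gross_wages * (inail_rate + tfr_rate)
    return employer_contributions + imputed


rec = DM10Record(gross_wages=10_000.0, total_contributions=4_000.0)
olc = other_labour_cost(rec, employee_rate=0.09, inail_rate=0.02, tfr_rate=0.07)
```

In practice the rates would be looked up in the metadata database of legal rates by period and type of employment rather than passed as constants.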

Check and editing strategy
Micro-editing on monthly data:
- interactive for the top 100 firms;
- very selective and interactive for the firms with larger wage bills (we assign a score);
- cross-sectional coherence and longitudinal (month-to-month) checks.
Macro series check (after the estimates have been produced):
- comparison with other similar quarterly indicators (QNA);
- automatic detection of outliers in the time series (Tramo-Seats).
Selective editing on the anomalous sectors:
- identification of the units that have most influenced the change of the series;
- correction, if needed.
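The last step, finding the units that most influenced the change of a series, can be illustrated with a minimal Python sketch. The function and the data are hypothetical; the actual OROS scoring rule is not given in the slides, so a unit's contribution is assumed here to be its own change divided by the previous-period total.

```python
def contributions_to_change(prev, curr):
    """Rank units by their contribution to the change in the aggregate.

    prev, curr: dicts mapping a unit id to its value (e.g. wage bill)
    in two consecutive periods. Units missing in one period count as 0.
    Returns a list of (unit_id, contribution) sorted by absolute size.
    """
    total_prev = sum(prev.values())
    if total_prev == 0:
        raise ValueError("previous-period total must be non-zero")
    units = set(prev) | set(curr)
    contrib = {
        u: (curr.get(u, 0.0) - prev.get(u, 0.0)) / total_prev
        for u in units
    }
    return sorted(contrib.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Illustrative data: unit "B" drives almost all of the aggregate change.
prev = {"A": 100.0, "B": 50.0, "C": 50.0}
curr = {"A": 102.0, "B": 90.0, "C": 49.0}
ranking = contributions_to_change(prev, curr)
```

The contributions sum to the relative change of the aggregate, so the top of the ranking shows which few records to review interactively.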

Imputation
Imputation of unit non-response (only for the census estimate):
- separating actual non-responses from true absences of the DM10s due to temporary inactivity of the enterprise (seasonal activity) or to the death of the unit (to avoid over-imputation);
- using mainly the pattern of presence of the DM10s to identify the non-response of the units.

Target estimation methodology
Information available from INPS:
- the INPS register (available at the end of each quarter);
- the DM10 sample (available quarterly after 45 days);
- the DM10 universe (available monthly after 13-14 months).
Short-term surveys face the problem of estimating levels and changes for the current population without a current Business Register. In our case the BR (ASIA) has a delay of up to two years.
Given the huge size and timeliness of the sample data and the availability of the INPS register, we developed a methodology to estimate the level and trend of the variables in the current population, rather than in a fixed population in the past.
At this stage the BR (ASIA) is used only to get the right Nace rev.1 economic classification.

Target estimation methodology (2)
- In our estimates we assume that a relation exists between the target variables at t and the same variables at t-4 (the auxiliary variables);
- the estimates are obtained by applying a weight to each unit in the sample;
- the weights are calculated to satisfy the condition that the auxiliary variables of the sample units, multiplied by these expansion factors, equal the known totals of the auxiliary variables (calibration);
- these totals are obtained by summing, over the current population, the auxiliary variables available in the universe of t-4.

More on the preliminary estimates
The calibration is performed at model group level. The non-randomness of the sample is addressed through:
- the partition of the population into model groups (50 economic activities, 4 firm size classes, 4 geographical areas, 2 firm age classes);
- the calibration of the weights.
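As an illustration of calibration within model groups, the Python sketch below uses the simplest possible case: one auxiliary variable (the unit's value at t-4) and one common weight per group, which reduces to a ratio estimator. This is an assumption made for clarity; the function name and the data are hypothetical, the actual OROS calibration may use several auxiliary variables and unit-level weights, and the treatment of new births without a t-4 value is not covered.

```python
from collections import defaultdict


def calibrated_estimate(sample, population_totals):
    """Ratio-type calibration within model groups.

    sample: list of (group, x_lag, y_now) tuples, where x_lag is the
        unit's value at t-4 and y_now its value at t.
    population_totals: dict mapping group -> known total of x_lag
        over the population.
    Returns (weights by group, estimated total of y_now).
    """
    sample_x = defaultdict(float)
    sample_y = defaultdict(float)
    for group, x_lag, y_now in sample:
        sample_x[group] += x_lag
        sample_y[group] += y_now

    weights = {}
    estimate = 0.0
    for group, total_x in population_totals.items():
        if sample_x[group] <= 0:
            raise ValueError(f"no usable sample in model group {group!r}")
        # Weight chosen so the weighted sample x_lag equals the known total.
        weights[group] = total_x / sample_x[group]
        estimate += weights[group] * sample_y[group]
    return weights, estimate


# Illustrative data: two model groups.
sample = [("g1", 100.0, 110.0), ("g1", 50.0, 55.0), ("g2", 200.0, 190.0)]
population_totals = {"g1": 300.0, "g2": 250.0}
weights, estimate = calibrated_estimate(sample, population_totals)
```

With these numbers the weights are 2.0 for g1 and 1.25 for g2, and the weighted sample reproduces the t-4 totals exactly, which is the calibration condition stated above.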

Recent change in the size of the sample
Since spring 2004, for administrative reasons, all firms are required to use the fast delivery mode (the Internet); INPS no longer accepts paper declarations.
[Figure: number of units in the POPULATION and in the SAMPLE, quarterly from Q1 2000 to Q1 2005; vertical axis from 200,000 to 1,400,000. Annotation: drop in the data for INPS administrative and technical reasons.]

Implication of this change (sample = population)
- The target variables for the population can simply be calculated by summing up the data (no need for a calibration methodology);
- there is potentially almost no limit to the detail (economic activity, size, etc.) for tabulating and releasing the aggregate results;
- we still need a calibration (or other) methodology to expand to the population in case of a new drop in the data. A project to improve the calibration methodology is ongoing.

Summary of the INPS administrative data weaknesses and the methodology used to manage them
Method for OROS, by weakness: Preliminary estimate (70 days) / Final estimate (365+140 days)
- Timeliness. Preliminary: calibration, model groups. Final: (no method listed).
- Missing data. Preliminary: calibration, model groups. Final: imputation.
- Under-coverage (just a few large units). Preliminary: integration with the Large Enterprises survey (LES). Final: integration with LES.
- Definitions differ. Preliminary: not deemed necessary. Final: not deemed necessary.
- Incomplete set of variables. Preliminary: estimation procedure using external information. Final: estimation procedure using external information.
- Measurement errors. Preliminary: selective micro-editing at monthly level; selective micro-editing at quarterly level. Final: selective micro-editing at quarterly level.
- Wrong or missing economic activity code. Preliminary: link (through fiscal code) to the BR to get the right Nace rev.1 economic classification. Final: link (through fiscal code) to the BR to get the right Nace rev.1 economic classification.

Final remarks (1)
Advantages:
- very low collection cost;
- complete firm size coverage, timeliness, precision of the estimates;
- many different kinds of data (which can be used in different ways);
- no statistical burden on firms.
Disadvantages:
- huge volume of data to handle (millions of records every month);
- very complex production process on a very short schedule;
- complete dependence on INPS;
- (relative) risk of inconsistency and discontinuity of the information over time.
Commitment to co-operate; Istat-INPS framework agreement; high-level co-ordination committee.

Final remarks (2)
- In OROS not only big but also small administrative changes may jeopardise the quarterly release;
- a strong link with the administrative data suppliers is needed (persons ready to answer any kind of question about the data);
- in short-term statistics we have to achieve and maintain timeliness and punctuality: we cannot waste time, and we have to anticipate any unexpected problem, such as delays in delivery or quality problems (technological; administrative);
- be prepared for alternative solutions (rescue net).

Thank you
fabio.rapiti@istat.it
oros-info@istat.it