The Dynamic Cross-sectional Microsimulation Model MOSART

Third General Conference of the International Microsimulation Association
Stockholm, June 8-10, 2011

Dennis Fredriksen, Pål Knudsen and Nils Martin Stølen
Statistics Norway

ABSTRACT: MOSART is an acronym for Model for microsimulation of Education, Labour supply and Social security. The model uses either the entire population or a representative sample of it in a base year and simulates the further life course for each person. In addition to research projects at Statistics Norway, the Ministry of Finance and the Ministry of Labour are the main users of the model. MOSART has been used extensively in the recent process of reforming the Norwegian public pension system. This paper provides a brief overview of the model, with emphasis on technical aspects and the base population.

Address: Research Department, Unit for Public Economics, Statistics Norway, P.O. Box 8131 Dep., N-0033 Oslo, Norway
Email: dennis.fredriksen@ssb.no, pal.knudsen@ssb.no, nils.martin.stolen@ssb.no

OBJECTIVE OF THE MODEL

MOSART is a dynamic microsimulation model with a cross-section of the Norwegian population and a comprehensive set of characteristics. The model starts with either the entire population or a representative sample of the population in a base year (currently 2005) and simulates the further life course for each individual in this initial population. Transition probabilities depending on individual characteristics are estimated from observed transitions in a recent period. Events included in the simulation are migration, deaths, births, household formation, educational activities, retirement, labour force participation, income and wealth. Public pension benefits are calculated from the simulated labour market earnings and other characteristics included in the simulation, according to an accurate description of the public pension system (the National Insurance Scheme, Folketrygden). The pensions covered by the model include old age pensions, disability pensions, survivors' pensions and early retirement benefits. Changes in the pension system may be analysed by calculating several pension systems in parallel while keeping the stochastic events constant.

TARGET AUDIENCE

MOSART is operated at Statistics Norway for three reasons: technical obstacles, restrictions from the Data Authorities regarding the merged administrative registers, and the detailed knowledge of the model needed to understand the full meaning of changing a parameter. In addition to analyses requested by internal research projects, the main users are the Ministry of Finance and the Ministry of Labour. Users in these ministries are either former model developers themselves or economists with a realistic sense of how economic models work. They are therefore critical and capable users of the simulation results, and this has proved beneficial to the development and validation of the model.
For this reason we can also hand over the results with little preparation, often as simple tables supported by some verbal explanation. Other public institutions, private organisations and the media use results from the MOSART model occasionally. In these cases the results are handed over with a higher degree of preparation.

BASE POPULATION

The base population has recently been updated and now includes the entire Norwegian population. The base year is currently 2005. To be able to compute benefits for surviving spouses and inheritance, deceased and emigrated persons are included. The total number of people in the base population is 7.16 million. For convenience we have generated random samples of 0.1, 1 and 10 per cent of this population. These samples, especially the two smallest, are mostly used for debugging and testing. All samples are stratified by gender, age, birth histories and household status. The samples include both spouses from all married couples and from cohabiting couples with children.

The data are collected from various administrative registers in the Directorate of Taxes, the National Insurance Administration and Statistics Norway. The underlying demographic assumptions of the model are based on public population projections from Statistics Norway. The information is represented as annual data going back as far as possible. Table 1 itemises the data sources, the variables gathered from each and the earliest possible start of each time series.

Table 1: Data sources for the base population.

Source                              Variable                                          Start
----------------------------------  ------------------------------------------------  -----
Directorate of Taxes                Gender, year of birth, spouse, mother and         1964
                                    father, marital status, country of birth,
                                    year of migration (if any), home address
National Insurance Administration   Degree of disability                              1991
National Insurance Administration   Pension status, time for disability               1967
Directorate of Taxes                Labour income, wealth                             1967
Statistics Norway                   Educational activities, completed education       1974

In addition to being the starting point of the simulation, the initial population is also used to estimate the transition probabilities. These probabilities may be adjusted so that the expected number of simulated events equals some external constraint, for example the historical number of events in the same year. The underlying assumptions are generally kept up to date by using adjustment factors from the last year with historical data at an aggregate level.
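The adjustment of transition probabilities to an external constraint can be illustrated with a minimal sketch. The paper does not state the functional form of the adjustment, so a single multiplicative factor is assumed here; the function names and the numbers are hypothetical, and the sketch is in Python for brevity rather than in the model's own C#.

```python
def adjustment_factor(probabilities, target_events):
    """Factor that makes the expected number of events equal the target."""
    expected = sum(probabilities)  # expected events = sum of individual probabilities
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return target_events / expected


def calibrate(probabilities, target_events):
    """Scale every probability by one common factor (assumed form), capped at 1.

    If the cap binds for some person, the expected count falls short of the
    target and a further round of adjustment would be needed.
    """
    factor = adjustment_factor(probabilities, target_events)
    return [min(1.0, p * factor) for p in probabilities]


# Hypothetical estimated probabilities for five persons (expected events: 0.20).
estimated = [0.02, 0.05, 0.10, 0.01, 0.02]
# Hypothetical external constraint: 0.25 events expected in this small group.
calibrated = calibrate(estimated, target_events=0.25)
```

Under this assumption the relative differences between persons, which come from the estimated effects of gender, age, education and so on, are left unchanged; only the overall level moves.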
Such adjustment factors are applied for aggregate observations of migration, period life expectancy at birth by gender, number of births, number of pupils and students by gender and age group, number of early retirees, retirement age, number of persons in the labour force and man-years by gender, total labour market earnings by gender, the basic pension unit, and rules for calculating pension entitlements and benefits. At present the model is calibrated to annual data from 2009. When calibrating to new annual data we assume that the effects of the explanatory variables (gender, age, education etc.) on the transition probabilities are the same as estimated from the initial population, and that the adjustment factors capture the interesting part of the variation over time. The model is extensively documented in Fredriksen (1998).

METHOD AND PLATFORM

Being programmed in C#, the model is truly multi-platform, as compilers for C# exist for virtually every operating system. This makes it possible to run the model on any available hardware. We run the model on both Linux and Microsoft Windows. On Linux we use the compiler provided by the Mono project, an open source project providing software to develop and run .NET applications. Our experience with this compiler has been excellent. On Microsoft Windows we use the free compiler and development tool Visual C# Express. This tool includes access to the MSDN library, which is very beneficial when programming large applications. Both compilers support version 4.0 of the .NET Framework.

As the base population is relatively large, a powerful computer is required. Once the transfer from disk to memory is completed, the population occupies approximately 30 GB of RAM in the base year, growing to 60 GB [1] in year 2200. We are currently using a Linux-based server (conveniently named Amadeus) with 16 processors and 256 GB RAM. This enables us to exploit the benefits of multithreading, discussed later.

[1] This depends on the assumptions, especially regarding population growth and the number of pension systems.

As illustrated in Figure 1, the three main stages of the application running the model are:

1. Read data files and transition probabilities. Set up tables and data structures.
2. Perform calculations based on transition and event probabilities.
3. Print results for the year simulated. Advance to next year and re-sort lists.
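The three stages can be sketched as a simple yearly loop, with stage 1 performed once and stages 2 and 3 repeated for each simulated year. This is only an illustration in Python, not the model's C# code: the function names, the record layout and the sort key are hypothetical, and the paper does not say what the lists are sorted by.

```python
def run_simulation(population, base_year, end_year, simulate_year, report):
    """Repeat stages 2 and 3 for each simulated year; stage 1 is done by the caller."""
    for year in range(base_year + 1, end_year + 1):
        simulate_year(population, year)                 # stage 2: apply events
        report(year, population)                        # stage 3: print results
        population.sort(key=lambda p: p["birth_year"])  # stage 3: re-sort lists
    return population


# Hypothetical stand-in for stage 2: one birth per year.
def add_newborn(population, year):
    population.append({"id": len(population) + 1, "birth_year": year})


# Hypothetical stand-in for stage 3: record the population size per year.
yearly_counts = {}

def count_persons(year, population):
    yearly_counts[year] = len(population)


# Stage 1 (done once): set up the data structures.
people = [{"id": 1, "birth_year": 1970}, {"id": 2, "birth_year": 1950}]
people = run_simulation(people, 2005, 2008, add_newborn, count_persons)
```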

Figure 1: Logical data flow.

During a simulation, Step 1 above is performed only once, while Steps 2 and 3 are repeated for every simulation year. The input data and the transition probabilities are provided as space-delimited ASCII files, which makes it straightforward for the user to verify their contents. In addition to the input files there are a few parameter files where the user can set global variables for the simulation, such as the end year of the simulation, mortality and fertility rates, and pension rules. These files are also space-delimited ASCII files.

The output from a simulation consists of extensive self-documentation (enabling the user to find errors in the results afterwards); a set of standard tables produced by the simulation programme, with aggregated figures covering the most frequently asked questions; and an option to produce a model population, consisting of an ASCII file with one record per selected person per selected year with selected variables. To produce special tables from this file one has to use a suitable table production programme such as SAS.

RECENT TECHNICAL ADVANCEMENTS

New computers have multiple cores, i.e. the ability to perform several calculations simultaneously. MOSART was originally programmed in a traditional style where only one event or calculation was handled at a time. With this approach, multiple cores did not reduce runtime.

The new base population included 10-100 times as many persons as the former [2], and this made runtimes matter. For this reason we shifted MOSART towards multithreading. Each step of the simulation is now split into a fixed set of 'jobs', e.g. the simulation of disability by groups of gender and birth year. It is mandatory that each such 'job' has no interactions whatsoever with any of the other 'jobs' [3]. This requires a tidy programming style. A special problem is that each 'job' must have its own random seed, independent of the chosen number of cores. Otherwise multithreading would influence the simulation result and make it impossible to reproduce a simulation by using an identical random seed. If the splitting into 'jobs' can be done at a higher level, the effect on the source code is moderate. This also implies that simulation steps which include multiple repeated interactions within the entire population (e.g. household formation) gain nothing from multithreading.

The simulation is carried out by specifying the number of threads (i.e. the number of cores in the computer, if this simulation is the only task at the moment). At each simulation step, each thread (core) picks up the next 'job' in line and repeats this until no more 'jobs' are available at the present step. With several more 'jobs' than threads, this engages all threads fairly efficiently. With a large population, the runtime for most multithreaded simulation steps is reduced by a factor close to the number of threads. Tax calculations, for example, respond efficiently to 12 threads: the step is easily split into separate jobs (there are no interactions between tax units, i.e. households), allocates little new memory and consists of many trivial calculations. Some simulation steps, however, do not respond to multithreading at all, or respond only to 2-3 threads and thereby gain very little from a large number of threads.
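The combination of independent 'jobs' and one random seed per 'job' can be sketched as follows. This is a minimal illustration in Python rather than the model's C# code; the job layout, the seed formula and all names are assumptions. The point of the sketch is reproducibility, not speed: each job's random stream depends only on the base seed and the job number, so the result is the same for any number of threads. (Pure-Python threads give little speed-up for CPU-bound work, so no timing claim is made here.)

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_job(job_id, probabilities, base_seed):
    """Simulate one independent job with its own random generator."""
    # The seed depends on the job, never on the thread that happens to run it.
    rng = random.Random(base_seed * 1_000_003 + job_id)
    events = sum(1 for p in probabilities if rng.random() < p)
    return job_id, events


def run_step(jobs, base_seed, n_threads):
    """Let a pool of threads pick up the jobs of one simulation step."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [
            pool.submit(simulate_job, job_id, probabilities, base_seed)
            for job_id, probabilities in jobs.items()
        ]
        return dict(future.result() for future in futures)


# Hypothetical step: 8 jobs of 1000 persons, each job with its own event probability.
jobs = {job_id: [0.1 * (job_id + 1)] * 1000 for job_id in range(8)}
single = run_step(jobs, base_seed=2005, n_threads=1)
multi = run_step(jobs, base_seed=2005, n_threads=4)
```

Here `single` and `multi` hold identical event counts per job. Had the jobs shared one generator, the counts would depend on the order in which the threads happened to draw from it.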
Simulation steps involved in household formation are one clear category of steps that respond poorly to multithreading: they are cumbersome to multithread because of often subtle interactions between individuals, and multithreading them has little or no effect on runtime. Another category is simulation steps with heavy allocation of memory, especially those which trigger memory management. A major problem is semi-permanent arrays, lists and objects. We are still working on these aspects, searching for an understanding of what constitutes efficient programming in a multicore environment.

[2] Prior to multithreading MOSART, the standard simulation included 1 per cent of the Norwegian population, although we used 12 per cent for special purposes.
[3] Interaction may be handled through special synchronization primitives, e.g. locks, but the general effect is very often that the reductions in runtime are lost.

Another approach is tasks, which handle all the administration of generating threads and assigning 'jobs' (each iteration is a 'job'). The effect on the source code is minimal, so tasks are easy to implement. We are currently experimenting with this, either as an alternative to traditional multithreading or as a supplement. Our major problem so far is keeping the random generator unaffected by the number of threads. One example is that while tasks are efficient at adding up individual variables, the sum is unfortunately affected by the number of threads. This happens when the sum and each item in the sum have the same precision level, because the order of addition affects the rounding. The effect is not large, but still sufficient to affect the random generator at some stage. We solved this by rounding all individual variables before adding, and then rounding the sum itself afterwards.

Multithreading has reduced the runtime by a factor of 4-5 when 6 threads are employed. Increasing the number of threads further has shown far less effect. Some parts of the simulation are not multithreaded, and their relative importance increases with the number of threads. Another problem, as mentioned, is memory management.

We have also experimented with memory-mapped files. Because the base population is so large, it is inconvenient to load it into memory every time a simulation run is to be performed. By keeping it permanently stored in memory we avoid reading from disk, which is very slow compared to a memory-to-memory transfer. This approach significantly reduces the time used to initiate a simulation. A memory-mapped file can easily be shared among different simulations, and it rarely changes.
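The idea of a memory-mapped population file can be sketched as follows. This is an illustration in Python, not the model's C# implementation: the fixed-width binary record layout and all names are hypothetical, and the sketch uses a file-backed mapping, where the operating system decides which pages stay cached in memory.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width record: person id (int32) and year of birth (int16).
RECORD = struct.Struct("<ih")


def write_population(path, persons):
    """Write the population once as fixed-width binary records."""
    with open(path, "wb") as f:
        for person_id, birth_year in persons:
            f.write(RECORD.pack(person_id, birth_year))


def open_population(path):
    """Map the whole file read-only; the mapping stays valid after the file closes."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_person(mapping, index):
    """Decode one record directly from the mapping, without reading the whole file."""
    return RECORD.unpack_from(mapping, index * RECORD.size)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "population.bin")
    write_population(path, [(1, 1950), (2, 1975), (3, 2005)])
    population = open_population(path)
    second = read_person(population, 1)  # random access to the second record
    population.close()                   # release the mapping before cleanup
```

Several simulation processes can map the same file read-only, which matches the observation that the base population is shared between simulations and rarely changes.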

POLICY ENVIRONMENT

In the last couple of years the MOSART model has been used intensively to analyse the effects of reforms of the Norwegian National Insurance Scheme. As in many other countries, the pension system in Norway is rather complicated, including non-linearities in the accumulation of pension entitlements. A microsimulation model including demographic characteristics, labour supply and an accurate description of the pension system therefore seems to be the most appropriate tool to obtain precise estimates of the direct effects on individual benefits, government expenditures and the future pension burden. Some of the experiences from using the model to analyse these effects can be found in Fredriksen and Stølen (2007), where it is shown that results from the MOSART model have had a direct impact on the design of the new pension system.

REFERENCES

Fredriksen D (1998) Projections of Population, Education, Labour Supply and Public Pension Benefits. Social and Economic Studies 101, Oslo: Statistics Norway.

Fredriksen D and Stølen N M (2007) Effects of Demographic Developments, Labour Supply and Pension Reforms on the Future Pension Burden in Norway, in Harding A and Gupta A (Eds.), Modelling our future: Population ageing, social security and taxation, Oxford: Elsevier, 81-106.