MODELLING INCOME DISTRIBUTION IN SLOVAKIA

Alena Tartaľová

Abstract

The paper presents an estimation of the income distribution with an application to Slovak household incomes. The two functions most often used are the Pareto and the lognormal. The Pareto function fits the data fairly well towards the higher levels, but the fit is poor towards the low income levels. The lognormal fits the lower income levels better, but its fit towards the upper end is far from satisfactory. We describe two less known models of incomes, the Dagum and Singh-Maddala distributions, and use them to fit data on Slovak household incomes. These distributions fit the actual data remarkably well compared with the Pareto and the lognormal. We use three concepts of income definition: we compare total disposable income, total equivalised income and income per capita, and show that different definitions of household income lead to different estimates of the income distribution and of inequality indices.

Key words: income distribution, inequality indices, total disposable income, total equivalised income, income per capita

JEL Code: C46; D31; D63

Introduction

Analysis of income distributions is a useful tool for decisions in various fields of social policy, and it is crucial in the estimation of household consumption. Different concepts of income are used in such analyses, and we discuss the possible definitions here. First, it matters what the income unit is. The income unit can be the person, the nuclear family or the household. We analyze data from the survey of income and living conditions of households called EU-SILC, in which the household is defined as a group of people living together at the same address with common housekeeping.

Analyses of household incomes often do not take household size into account. A very simple way to do so is to compute income per capita, but according to Coulter (1992) this approach has several disadvantages.
The second approach is based on weighting the household's income by an equivalence scale to obtain the equivalised income. The scales used in EU-SILC data are the OECD scales (or Oxford scales), but the scales and their calculation are a subject of discussion, and there is no general agreement about which equivalence scale to use. In this paper we compare models of household income using total disposable income, equivalised income and income per capita.

Modelling an income distribution means finding a suitable probability model. From the fitted probability distribution we can estimate basic characteristics and find the quantiles for the lowest and the highest incomes. In Section 1 we describe different functional forms of income distribution (see Kleiber and Kotz, 2003), and in Section 2 we apply these models to Slovak income data. All calculations were carried out in the freeware R, available on the internet (http://cran.r-project.org/).

1 Models of Income Distribution

The study of income distribution has a long history. The probability modelling of income distribution started with the Italian economist Vilfredo Pareto and his 1897 work Cours d'économie politique. He described a principle which states that, for many events, roughly 80% of the effects come from 20% of the causes. The original observation concerned the wealth of the population: Pareto noticed that 80% of Italy's land was owned by 20% of the population. He carried out several surveys in a variety of other countries and found a similar distribution. This is nowadays known as the Pareto law.

Since Pareto's work, a large number of models have been introduced to describe the distribution of incomes. Income distributions are usually positively skewed, with a long right tail and high density at the lowest percentiles. Kernel density estimates are used to identify a suitable model of the income distribution (see Tartaľová, 2010). The most frequently used models in practice are the Pareto and lognormal distributions.
Less known are the Dagum and Singh-Maddala distributions, but we will show that they also have convenient properties for fitting income distributions.

Pareto distribution

The importance of the Pareto distribution in the study of income distributions is due to its good fit to empirical data. However, the Pareto distribution usually fits the largest incomes better than the smallest ones, and it is not useful as a model for the whole data set. Various forms of the Pareto distribution can be found, including European and American versions, so one should know which version is used in order to interpret the parameters. We use definition (1), which is the probability density function implemented in the statistical program R. A random variable X follows a Pareto distribution if its probability density function is

$$ f_X(x) = \frac{k\,\alpha^{k}}{x^{k+1}}, \quad x \ge \alpha, \tag{1} $$

where α is the location parameter and k is the shape parameter.

Lognormal distribution

The lognormal distribution is convenient for modelling, not least because its parameters have a clear economic interpretation. Parameter µ is the logarithm of the geometric mean income, and σ² is the variance of the logarithm of income and one of the simple inequality measures: the larger σ², the greater the inequality. The two-parameter lognormal distribution fits the middle income range well but gives a poor fit at the tails. A random variable X follows a lognormal distribution if its probability density function is

$$ f_X(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0. \tag{2} $$

The appropriateness of this distribution from various points of view is discussed, for example, in Kleiber and Kotz (2003).

Dagum distribution

Camilo Dagum (in the 1970s) was not satisfied with the classical statistical distributions used to summarize income data, such as the Pareto or lognormal distribution. He developed a distribution, now named after him, by adding another parameter to the log-logistic distribution (for p = 1 it reduces to the log-logistic distribution; the three-parameter family is also known as the Burr III distribution). A random variable X follows a Dagum distribution if its probability density function is

$$ f_X(x) = \frac{\alpha p\, x^{\alpha p - 1}}{\beta^{\alpha p}\left[1 + (x/\beta)^{\alpha}\right]^{p+1}}, \quad x > 0, \tag{3} $$

where β is the scale parameter and α and p are shape parameters.
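As an illustration of how such a fitted model yields quantiles, the following Python sketch evaluates the Dagum density (3) together with its distribution function and quantile function. The paper's own computations were done in R, so Python is a substitution made here. The closed forms F(x) = [1 + (x/β)^(−α)]^(−p) and its inverse are standard results for the Dagum distribution that the paper does not state. The function names are hypothetical, and the parameter values are the total-income estimates from Tab. 1, used only as an example.

```python
import math


def dagum_pdf(x, alpha, beta, p):
    """Dagum density (3); beta is the scale, alpha and p are shape parameters."""
    if x <= 0:
        return 0.0
    z = (x / beta) ** alpha
    return alpha * p * x ** (alpha * p - 1) / (beta ** (alpha * p) * (1 + z) ** (p + 1))


def dagum_cdf(x, alpha, beta, p):
    """Closed-form distribution function (a standard result, not given in the paper)."""
    if x <= 0:
        return 0.0
    return (1 + (x / beta) ** (-alpha)) ** (-p)


def dagum_quantile(u, alpha, beta, p):
    """Inverse of dagum_cdf for 0 < u < 1."""
    return beta * (u ** (-1 / p) - 1) ** (-1 / alpha)


# Illustrative parameters: Dagum estimates for total income from Tab. 1.
ALPHA, BETA, P = 3.25, 12983.08, 0.63

for level in (0.10, 0.50, 0.90):
    q = dagum_quantile(level, ALPHA, BETA, P)
    print(f"{level:.0%} quantile: {q:.0f} EUR")
```

Inverting the distribution function in closed form avoids numerical root finding, which is one practical reason the Dagum family is convenient for income quantiles.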

Singh-Maddala distribution

Singh and Maddala (1976) proposed a justification of the older Burr XII distribution by considering the log survival function as a richer function of x than in the Pareto case. Its probability density function is

$$ f_X(x) = \frac{\alpha p\, x^{\alpha - 1}}{\beta^{\alpha}\left[1 + (x/\beta)^{\alpha}\right]^{p+1}}, \quad x > 0. \tag{4} $$

The Dagum and Singh-Maddala distributions are closely related (see Kleiber, 1996):

$$ X \sim D(\alpha, \beta, p) \iff \frac{1}{X} \sim SM\left(\alpha, \frac{1}{\beta}, p\right). \tag{5} $$

This relationship makes it possible to translate several results pertaining to the Singh-Maddala distribution into corresponding results for the Dagum distribution.

Several indices are used for analyzing and visualizing income inequality. In this article we discuss the Gini coefficient and the Atkinson and Theil indices. The Lorenz curve is used to visualize income inequality.

Gini coefficient

The Gini coefficient is one of the most commonly used indicators of income inequality. It is usually defined mathematically on the basis of the Lorenz curve, which plots the proportion of the total income of the population (y axis) that is cumulatively earned by the bottom x% of the population (see diagram):

$$ I_G = 1 - 2\int_0^1 L(x)\,dx, \tag{6} $$

where L(x) is the Lorenz curve. An estimator of the population Gini coefficient is

$$ I_G = \frac{2\sum_{i=1}^{n} i\,x_i}{n\sum_{i=1}^{n} x_i} - \frac{n+1}{n}, \tag{7} $$

where x_1 ≤ … ≤ x_n are the observed incomes sorted in ascending order. For a known income distribution with cumulative distribution function F, the Gini coefficient can be calculated as

$$ I_G = 1 - \frac{1}{\mu}\int_0^{\infty} \left[1 - F(x)\right]^2 dx = \frac{1}{\mu}\int_0^{\infty} F(x)\left[1 - F(x)\right] dx, \tag{8} $$

where µ = E(X).

Atkinson index

The Atkinson index is one of the few inequality measures that explicitly incorporate normative judgments about social welfare (Atkinson, 1970). The index is derived by calculating the so-called equity-sensitive average income (y_e), which is defined as that level of per capita income which, if enjoyed by everybody, would make total welfare exactly equal to the total welfare generated by the actual income distribution. The Atkinson index is given by

$$ I_A(\varepsilon) = 1 - \left[\int_0^{\infty} \left(\frac{x}{\mu}\right)^{1-\varepsilon} dF(x)\right]^{\frac{1}{1-\varepsilon}}, \quad \varepsilon > 0,\ \varepsilon \ne 1, \tag{9} $$

where µ = E(X) and ε is the parameter that controls inequality aversion.

Theil's index

Theil's index is computed as an expectation, using the estimated parameters. This measure of inequality, proposed by Theil (1967), derives from the notion of entropy in information theory. The index has a potential range from zero to infinity, with lower values (greater entropy) indicating a more equal distribution of income:

$$ I_T = \frac{1}{\mu}\, E\left[X \log\frac{X}{\mu}\right], \tag{10} $$

where µ = E(X). An estimator of the population Theil's index is

$$ I_T = \frac{1}{n}\sum_{i=1}^{n} \frac{x_i}{\bar{x}} \log\frac{x_i}{\bar{x}}, \tag{11} $$

where x̄ is the sample mean. One way to choose between the large number of inequality indices available is to evaluate them in terms of their properties.

2 Application to Slovak Data

Sample surveys of household income in Slovakia are carried out by the Statistical Office. Since the accession to the European Union, it has annually conducted a survey of income and living conditions of households called EU-SILC. This data set contains several variables that can be used for the analysis. In many published articles (Sipková, 2004; Sipková and Sipko, 2010; Želinský, 2010) total disposable income or equivalised income is taken as the income variable. The definitions of the income concepts analyzed here are:

Total disposable household income (variable HY020) is calculated as the sum of the components of gross personal income of all household members plus gross income components at household level (e.g. social transfers).

The equivalised disposable income (variable HX100) is the total income of a household, after tax and other deductions, that is available for spending or saving, divided by the number of household members converted into equalised adults. Household members are equalised, or made equivalent, by weighting each according to their age, using the so-called modified OECD equivalence scale. This scale attributes a weight to all members of the household: 1.0 to the first adult; 0.5 to the second and each subsequent person aged 14 and over; 0.3 to each child aged under 14. The equivalent size is the sum of the weights of all the members of a given household.

Total disposable income per capita (variable HY020 / variable HX070) is total disposable household income divided by the number of household members.

Figure 1. Histogram and characteristics of total disposable income

Total income
Count                 5256
Average               12127.9
Standard deviation    7869.4
Coeff. of variation   64.89%
Minimum               42.3222
Maximum               78431.7
Range                 78389.4
Skewness              1.7914
Kurtosis              6.24563

Characteristics of the samples of Slovak household incomes in the year 2009 are presented in Figures 1-3. The units are euros. There are 5256 observations. The differences between the three concepts of income are apparent from the basic characteristics. Average total household income obviously increases with household size, whereas average per capita household income generally decreases. The results show that total income has the highest variability (coefficient of variation 64.89%) and equivalised income has the lowest variability (coefficient of variation 50.89%). The histograms of incomes reveal a right-skewed distribution with extreme values in the right tail. We can therefore suppose that the distributions considered in Section 1 are suitable for the empirical data.

Figure 2. Histogram and characteristics of total equivalised income

Equivalised income
Count                 5256
Average               6090.63
Standard deviation    3099.43
Coeff. of variation   50.89%
Minimum               42.3222
Maximum               62517.9
Range                 62475.6
Skewness              3.21233
Kurtosis              30.3214

Figure 3. Histogram and characteristics of income per capita

Income per capita
Count                 5256
Average               4282.42
Standard deviation    2357.14
Coeff. of variation   55.04%
Minimum               42.3222
Maximum               62517.9
Range                 62475.6
Skewness              5.51232
Kurtosis              89.7393
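The three income concepts defined in this section differ only in the divisor applied to total household income. The Python sketch below shows the arithmetic for a single household. It is an illustration, not the EU-SILC procedure itself: the function names, the example household and its income are hypothetical, and the rule that every household contains at least one person aged 14 or over is an assumption made here to keep the weighting unambiguous.

```python
def equivalent_size(ages):
    """Modified OECD scale: 1.0 for the first adult, 0.5 for each further
    person aged 14 and over, 0.3 for each child aged under 14."""
    n_older = sum(1 for age in ages if age >= 14)
    n_children = len(ages) - n_older
    if n_older == 0:
        # Assumption of this sketch: a household has at least one member aged 14+.
        raise ValueError("household needs at least one member aged 14 or over")
    return 1.0 + 0.5 * (n_older - 1) + 0.3 * n_children


def income_concepts(total_income, ages):
    """Return total, equivalised and per capita income for one household."""
    return {
        "total": total_income,
        "equivalised": total_income / equivalent_size(ages),
        "per_capita": total_income / len(ages),
    }


# Hypothetical household: two adults, one teenager, one child; 12000 EUR per year.
example = income_concepts(12000.0, [41, 39, 15, 8])
for name, value in example.items():
    print(f"{name:12s} {value:10.2f}")
```

For a one-person household the equivalent size and the number of members are both 1, so all three concepts coincide. For larger households the equivalised income lies between per capita and total income.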

To identify the best possible model for the distribution of incomes, we start with the two most commonly used models, the Pareto and lognormal distributions. We have also studied the less known Dagum and Singh-Maddala distributions. The parameters of the models were estimated by maximum likelihood in the program R (see Table 1). We performed goodness-of-fit tests; the results show that among the examined models we can accept the Dagum and Singh-Maddala distributions. The plots comparing the estimated distributions show that the Dagum and Singh-Maddala distributions fit the data very well over the whole range (see Figure 4; for lack of space only the plot for total income is shown).

Tab. 1: Results of parameter estimation for Slovak household incomes in 2009

Model           Parameter   Total income   Equivalised income   Income per capita
Pareto          α           0.18           0.21                 0.22
                k           42.32          42.32                42.32
Lognormal       µ           9.20           8.61                 8.25
                σ           0.66           0.48                 0.49
Dagum           α           3.25           4.26                 4.52
                β           12983.08       5829.24              4347.18
                p           0.63           0.85                 0.72
Singh-Maddala   α           2.19           3.93                 3.71
                β           16735.10       5624.65              4181.13
                p           2.25           1.06                 1.22

Figure 4: Dagum (red line) and Singh-Maddala (black line) distributions fitted to Slovak household total incomes
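To make the maximum likelihood step concrete, the following Python sketch covers only the lognormal model, for which the estimates have a closed form: µ is estimated by the mean of the log incomes and σ by their standard deviation with divisor n. This is a substitution for illustration. The paper's estimates were obtained in R, and the Dagum and Singh-Maddala fits require numerical optimisation that is not shown. The data are simulated, not the EU-SILC sample; the simulation parameters are the lognormal estimates for total income from Tab. 1, and the function names are hypothetical.

```python
import math
import random


def lognormal_mle(incomes):
    """Closed-form maximum likelihood estimates (mu, sigma) of a lognormal model."""
    logs = [math.log(x) for x in incomes]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_loglik(incomes, mu, sigma):
    """Log-likelihood of the lognormal density (2) for a sample of incomes."""
    total = 0.0
    for x in incomes:
        lx = math.log(x)
        total += (-lx - math.log(sigma) - 0.5 * math.log(2 * math.pi)
                  - (lx - mu) ** 2 / (2 * sigma ** 2))
    return total


# Simulated incomes (assumption: not real data), same sample size as the paper.
random.seed(1)
sample = [random.lognormvariate(9.20, 0.66) for _ in range(5256)]

mu_hat, sigma_hat = lognormal_mle(sample)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
print(f"log-likelihood at the estimates: {lognormal_loglik(sample, mu_hat, sigma_hat):.1f}")
```

Comparing the log-likelihood at the estimates with its value at nearby parameter values is a quick check that the closed form really is the maximiser.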

Another picture of the income distribution can be obtained by computing inequality indices. We chose the Gini, Atkinson and Theil indices and compared the results for the three different definitions of income. The differences between the Gini indices are quite large; the largest value is found for income per capita. According to the Atkinson and Theil indices, the largest inequality is found for total income.

Tab. 2: Inequality measures of Slovak household incomes in 2009

Inequality measure   Total income   Equivalised income   Income per capita
Gini index           0.250          0.247                0.337
Atkinson index       0.092          0.052                0.055
Theil's index        0.186          0.107                0.115

Conclusion

This paper contains an analysis of the incomes of Slovak households in the year 2009. The analysis is based on a sample of 5256 observations from the survey of income and living conditions of households called EU-SILC. We point out that the concept of income definition leads to different results. Three different definitions of income can be derived from EU-SILC data. In this paper we have concentrated on total income, equivalised income scaled with the OECD scale, and income per capita scaled with the number of people in the household.

We fitted the income data with two commonly used models, the Pareto and lognormal distributions. We also introduced the less known Dagum and Singh-Maddala distributions and showed that they also provide a suitable model with a good fit over the whole range. We compared the fitted models for the three data series and obtained different estimates of the income distribution and of the inequality measures. The study shows that estimation using per capita income and total income resulted in a higher estimate of poverty incidence in the country than estimation using equivalised income. There is no general agreement on which definition to use, but we would like to stimulate discussion about it. Another topic for further research and discussion is the choice of scales used to compute equivalised income.
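The point that the income definition changes the measured inequality can be illustrated with the sample estimators (7) and (11) from Section 1. The Python sketch below applies them to a small set of hypothetical households; the incomes, equivalent sizes and household sizes are invented for illustration and are not EU-SILC data. The function names are also hypothetical, and the Atkinson index is left out for brevity.

```python
import math


def gini(values):
    """Sample Gini estimator (7): incomes sorted ascending, weighted by rank."""
    x = sorted(values)
    n = len(x)
    rank_weighted = sum(i * xi for i, xi in enumerate(x, start=1))
    return 2 * rank_weighted / (n * sum(x)) - (n + 1) / n


def theil(values):
    """Sample Theil estimator (11); requires strictly positive incomes."""
    n = len(values)
    mean = sum(values) / n
    return sum((v / mean) * math.log(v / mean) for v in values) / n


# Hypothetical households: (total income, equivalent size, number of members).
households = [
    (6000.0, 1.0, 1),
    (9000.0, 1.5, 2),
    (12000.0, 2.1, 4),
    (15000.0, 2.3, 4),
    (30000.0, 1.5, 2),
]

concepts = {
    "total": [t for t, _, _ in households],
    "equivalised": [t / s for t, s, _ in households],
    "per_capita": [t / m for t, _, m in households],
}

results = {name: (gini(v), theil(v)) for name, v in concepts.items()}
for name, (g, t) in results.items():
    print(f"{name:12s} Gini = {g:.3f}  Theil = {t:.3f}")
```

Because each concept divides the same total incomes by a different household-level divisor, the ranking of households and the spread of incomes both change, and so do the indices.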
Acknowledgment

This work was supported by the Slovak Scientific Grant Agency as part of the research project VEGA 1/0127/11 Spatial Distribution of Poverty in the European Union.

References

Chotikapanich, D. (2008). Modelling Income Distribution and Lorenz Curves. Springer Science and Business Media LLC.

Coulter, F., Cowell, F. and Jenkins, S. (1992). Differences in needs and assessment of income distributions. Bulletin of Economic Research 44, 77-124.

Kleiber, Ch. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Science. John Wiley & Sons, Inc., Hoboken, New Jersey.

Kleiber, Ch. (1996). Dagum vs. Singh-Maddala income distributions. Economics Letters 53, 265-268.

Pacáková, V., Sipková, Ľ. and Sodomová, E. (2004). Statistics modelling of household's incomes in the Slovak Republic. Journal of Economics 53, 427-439.

Sipková, Ľ. and Sipko, J. (2010). Wage levels in the regions of the Slovak Republic. SOCIALNY KAPITAL, LUDSKY KAPITAL A CHUDOBA V REGIONOCH SLOVENSKA: SCIENTIFIC CONFERENCE PROCEEDINGS, 51-66.

ŠÚ SR (2010). EU SILC 2008, UDB version 26/07/2010 [microdata database]. Bratislava: Štatistický úrad SR.

Tartaľová, A. (2010). Nonparametric estimation method of probability density function. Forum Statisticum Slovacum 5, 250-255.

Victoria-Feser, M.P. and Alaiz, M.P. (1996). Modelling Income Distribution in Spain: A Robust Parametric Approach. DARP Discussion Paper 20, London School of Economics.

Victoria-Feser, M.P. (2000). Robust methods for the analysis of income distribution, inequality and poverty. International Statistical Review 68 (3), 277-293.

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0.

Želinský, T. (2010). Regions of Slovakia from the View of Poverty. SOCIALNY KAPITAL, LUDSKY KAPITAL A CHUDOBA V REGIONOCH SLOVENSKA: SCIENTIFIC CONFERENCE PROCEEDINGS, 37-50.

Yee, T. (2012). VGAM: Vector Generalized Linear and Additive Models. R package version 0.8-7. URL http://cran.r-project.org/package=vgam

Contact

Alena Tartaľová, Mgr., PhD.
Department of Applied Mathematics and Business Informatics
Faculty of Economics, TU Kosice
Nemcovej 32, 040 01 Kosice, Slovakia
alena.tartalova@tuke.sk