Eric Bonsang, Arthur van Soest. Satisfaction with job and income among older individuals across European countries RM/10/059


Satisfaction with job and income among older individuals across European countries [1]

Eric Bonsang [2] and Arthur van Soest [3]

This version: October 2010

Abstract

Using data on individuals of age 50 and older from 11 European countries, we analyze two economic aspects of subjective well-being of older Europeans: satisfaction with household income, and job satisfaction. Both have been shown to contribute substantially to overall well-being (satisfaction with life or happiness). We use anchoring vignettes to correct for potential differences in response scales across countries. The results highlight a large variation in self-reported income satisfaction, which is partly explained by differences in response scales. When differences in response scales are eliminated, the cross-country differences are quite well in line with differences in an objective measure of purchasing power of household income. There are common features in the response scale differences in job satisfaction and income satisfaction. French respondents tend to be critical in both assessments, while Danish and Dutch respondents are always on the optimistic end of the spectrum. Moreover, correcting for response scale differences decreases the cross-country association between satisfaction with income and job satisfaction among workers.

Key words: anchoring vignettes, response scale differences, ageing
JEL codes: I30, J30

[1] We are grateful to two anonymous referees, Teresa Bago d'Uva, Didier Fouarge, Hendrik Juerges, Raymond Montizaan, and participants of the final COMPARE conference in Brussels for useful comments. This paper was written as part of the project COMPARE, funded by the European Commission through its 6th framework (project number CIT5-CT-2005-028857). Data collection and infrastructure for making data available to researchers was mainly funded by the European Commission through several SHARE related projects in the 5th and 6th framework programmes (CIT5-CT-2005-028857, QLK6-CT-2001-00360; RII-CT-2006-062193). Additional funding was provided by the US National Institute on Aging (grant numbers U01 AG09740-13S2; P01 AG005842; P01 AG08291; P30 AG12815; Y1-AG-4553-01; OGHA 04-064; R21 AG025169) and various national sources (see http://www.share-project.org for a full list of funding institutions).
[2] ROA, Maastricht University.
[3] Netspar, Tilburg University.

1. Introduction

Labour market and living conditions of older individuals have become key policy issues in all European countries. Poverty is more prevalent among the elderly than among other age groups, particularly in several Southern European countries (Tsakoglou, 1996). Lack of economic resources makes elderly people vulnerable to poor quality of life (Grundy, 2006). Downward income mobility is larger among older age groups, particularly among certain groups such as widows and those with an unemployment history, suggesting policies to strengthen the social safety-net and to protect against unemployment and its consequences for economic welfare (Zaidi et al., 2005). Population ageing has led to more pressure on pension and old age benefit systems, and policies aimed at increasing the labour force participation of older individuals are required in order to preserve the sustainability of pension systems and old age social security. In order to design such policies, it is important to assess the determinants of retirement. Among the different factors underlying the retirement decision, job satisfaction plays an important role (Kosloski et al., 2001). This makes it particularly relevant to study job satisfaction among older workers.

In this paper, using data on individuals of age 50 and older from 11 European countries, we analyze two economic aspects of subjective well-being of older Europeans: satisfaction with household income, and job satisfaction. Both have been shown to contribute substantially to overall well-being (satisfaction with life or happiness). For example, Ferrer-i-Carbonell and Van Praag (2002) and Van Praag et al. (2003) analyze how satisfaction with life of adult Germans is determined by satisfaction with domains of life (satisfaction with job, finances, housing, health, leisure, and the environment) and find that, together with health satisfaction, job satisfaction and satisfaction with the financial situation are the most important determinants. Similarly large effects of financial and job satisfaction on satisfaction with life are found for the UK by Van Praag and Ferrer-i-Carbonell (2008, p.91), though they find even larger effects of satisfaction with leisure-use and social life.

Satisfaction with household income has often been studied in the context of household equivalence scales; see, e.g., Van Praag and Van der Sar (1988), Van Praag and Warnaar (1997), Charlier (2002), or Van Praag and Ferrer-i-Carbonell (2008, Chapter 2). The economic literature on satisfaction with life emphasizes the role of income (cf., e.g., Clark et al., 2008), but often analyzes the role of income for life satisfaction directly, without considering satisfaction with income (see, for example, Schyns, 2002). A notable exception is the work of Van Praag and co-authors (e.g., Van Praag et al., 2003) who introduced a two-stage model where satisfaction with life is a function of satisfaction with several domains, including satisfaction with income or the financial situation, and where domain specific satisfaction variables are determined by socio-economic characteristics including income. Van Praag and Ferrer-i-Carbonell also compare income satisfaction in several countries. Kapteyn et al. (2008) compare income satisfaction in the US and the Netherlands. We are not aware of studies that focus specifically on income satisfaction of older populations.

Job satisfaction has traditionally been studied in sociology and psychology, but has more recently also been shown to provide useful information about economic life that should not be ignored (Hamermesh, 1977; Freeman, 1978; Borjas, 1979; Clark and Oswald, 1996). For example, it appears to have predictive value for observable phenomena such as quit rates (Freeman, 1978; Clark et al., 1998) or absenteeism (Clegg, 1983). The determinants of job satisfaction have been studied extensively for populations of all adult workers; see, for example, Clark (1997), Clark et al. (1998), and Hamermesh (2001). Sousa-Poza and Sousa-Poza (2000) and Kristensen and Johansson (2008) compare job satisfaction and satisfaction with various job characteristics across countries. We do not know of studies that focus specifically on international comparisons of job satisfaction among older workers.

An important issue underlying the cross-country comparison of self-reported well-being or satisfaction with different domains of life is that individuals from different countries or socio-demographic backgrounds may use different response scales, referred to as differential item functioning (DIF) in the psychology literature (Holland and Wainer, 1993). Indeed, if individuals use the same scale, differences in self-reported satisfaction reflect true differences across countries or groups of individuals. However, if response scales differ systematically, adjustments are required to compare true satisfaction across individuals. Van Praag et al. (2003) use panel data models with (quasi-)fixed effects, capturing persistent differences in response scales. This allows them to identify how changes in satisfaction respond to changes in characteristics but does not help to identify cross-country differences in satisfaction levels that keep response scales constant. Specifically for the latter purpose, King et al. (2004) have proposed to use anchoring vignettes: respondents are asked to evaluate hypothetical situations described in the survey question. This additional information helps to identify interpersonal differences in response scales, even with cross-section data.

Anchoring vignettes have been used to analyze cross-country differences in various subjective measures of well-being, such as political efficacy (King et al., 2004), health (Salomon et al., 2004; Bago d'Uva et al., 2008a,b), life satisfaction (Angelini et al., 2009; Kapteyn et al., 2010), or work disability (Kapteyn et al., 2007). Kapteyn et al. (2008) use anchoring vignettes to compare income satisfaction between the Netherlands and the US. They find that the distribution of self-reported income satisfaction differs substantially across countries, but correcting for response scale differences makes the distributions much more similar. Kristensen and Johansson (2008) analyse job satisfaction across seven European countries using anchoring vignettes and find evidence of cultural differences in reporting job satisfaction. They show that correcting for such differences alters the country ranking.

The aim of this paper is to compare income and job satisfaction of older individuals (50+) across European countries, correcting for differences in reporting styles of the respondents by using anchoring vignettes. The results of Bago d'Uva et al. (2008b) and Kapteyn et al. (2007) suggest that differences in reporting styles across countries and socioeconomic groups are important for older age groups, though it is not clear whether they are systematically larger or smaller than for younger age groups.

The remainder of this paper is organized as follows. Section 2 presents the econometric model and motivates the use of anchoring vignettes. Section 3 presents the data and descriptive statistics. Estimation results are presented in Section 4. Section 5 presents some simulations of counterfactual distributions, showing how income and job satisfaction compare across countries when response scales are kept constant. Section 6 concludes.

2. The model

The methodology of anchoring vignettes to measure subjective ordinal responses taking into account differences in the reporting styles across individuals was first introduced by King et al. (2004). We follow their parametric model, the so-called conditional hopit (chopit) model. Define a latent self-satisfaction variable ($s_i^*$) as:

$$ s_i^* = X_i \beta + \varepsilon_i, \tag{1} $$

where $X_i$ is a vector of explanatory variables such as country dummies, gender, years of education, and household income, and $\beta$ is a vector of parameters to be estimated. The error term $\varepsilon_i$ is assumed to be standard normally distributed and independent of $X_i$. Reported satisfaction ($s_i$) is an ordered categorical variable based upon the underlying latent variable $s_i^*$:

$$ s_i = j \quad \text{if} \quad \tau_i^{j-1} < s_i^* \le \tau_i^{j}. \tag{2} $$

If the thresholds between categories are the same for all respondents ($\tau_i^j = \tau^j$ for all $i$, $j$) then this gives the ordered probit model, a standard model for ordered response dependent variables. The main distinguishing feature compared to this standard case is that all thresholds are allowed to vary with observed respondent characteristics:

$$ \tau_i^1 = X_i \gamma^1, \qquad \tau_i^j = \tau_i^{j-1} + \exp(X_i \gamma^j), \quad j = 2,3,4, \tag{3} $$

where the $\gamma^j$, $j = 1,2,3,4$, are vectors of parameters to be estimated. Without additional information, $\gamma^1$ and $\beta$ are not separately identified. Imposing $\gamma^1 = 0$ leads to a generalized ordered probit model in which the distances between cut-off points are allowed to vary with the characteristics $X_i$; the exponential function is taken to guarantee that the distances are always positive. We are particularly interested, however, in allowing for non-zero $\gamma^1$, since this means that a change in the characteristics leads to a parallel shift in all cut-off points, with the intuition that some respondents use more positive evaluations than other respondents.

To identify $\gamma^1$, additional information is used in the form of vignette evaluations $V_i^k$ ($k = 1,\dots,K$), where $K$ is the number of different vignettes evaluated by the respondents. The vignette equivalence assumption implies that there exists a common true (objective) actual level of satisfaction $\theta^k$ underlying the situation described by a given vignette $k$; the vector of all these is denoted by $\theta = (\theta^1,\dots,\theta^K)$. The vignette evaluations are modelled as follows:

$$ V_i^{*k} = \theta^k + \nu_i^k, \qquad V_i^k = j \quad \text{if} \quad \tau_i^{j-1} < V_i^{*k} \le \tau_i^{j}, \tag{4} $$

where $V_i^k$ is the evaluation of vignette $k$ by respondent $i$, and the $\nu_i^k$ are errors, assumed to be normally distributed with mean 0 and variance $\sigma_v^2$, independent of each other, $\varepsilon_i$, and $X_i$. [4]

The model consisting of equations (1)-(4) is estimated by maximum likelihood, combining the information in the self-assessments with the information in the vignette evaluations. The likelihood consists of a self-assessment part and a vignette part:

$$ L(\beta, \theta, \gamma \mid s, V) = L_s(\beta, \gamma \mid s)\, L_v(\theta, \gamma \mid V), \tag{5} $$

where $L_s(\beta, \gamma \mid s)$ is the likelihood component for the self-assessment:

$$ L_s(\beta, \gamma \mid s) = \prod_{i=1}^{N} \prod_{j=1}^{5} \left[ \Phi(\tau_i^{j} - X_i \beta) - \Phi(\tau_i^{j-1} - X_i \beta) \right]^{I(s_i = j)}, \tag{6} $$

and $L_v(\theta, \gamma \mid V)$ is the likelihood component for the vignette part:

$$ L_v(\theta, \gamma \mid V) = \prod_{i=1}^{N} \prod_{k=1}^{K} \prod_{j=1}^{5} \left[ \Phi(\tau_i^{j} - \theta^k, \sigma_v^2) - \Phi(\tau_i^{j-1} - \theta^k, \sigma_v^2) \right]^{I(V_i^k = j)}. \tag{7} $$

Here $\Phi(\cdot)$ is the standard normal distribution function, $\Phi(\cdot, \sigma_v^2)$ is the distribution function of a normal variable with mean 0 and variance $\sigma_v^2$, and the products over the five response categories use the convention $\tau_i^0 = -\infty$ and $\tau_i^5 = +\infty$.

The parameters $\gamma = (\gamma^1,\dots,\gamma^4)$ drive both components of the likelihood, which is why the additional information in the vignette evaluations helps for identification. The main identifying assumptions in this model are twofold. The first is response consistency: a given respondent uses the same scales $\tau_i^j$ for self-reports and vignettes. King et al. (2004) and Van Soest et al. (2007) have provided evidence supporting this hypothesis for vignettes on vision and drinking behaviour, by comparing vignette corrected self-reports and more objective measures. The second assumption is called vignette equivalence: there should be no systematic differences in the interpretation of a given vignette between respondents with different characteristics $X_i$ (so that $V_i^{*k}$ does not vary with $X_i$).

[4] The assumption that the $\nu_i^k$ are mutually independent may be too strong. Moreover, unobserved heterogeneity in the thresholds may also lead to correlated vignette evaluations. Sensitivity checks of Kapteyn et al. (2007) suggest that allowing for a richer covariance structure of the errors is a statistically significant improvement but has no effect on the substantive results.

3. Data and Descriptive Statistics

The empirical analysis is based on data from the COMPARE sample, which is part of the second wave (2006-2007) of the Survey of Health, Ageing and Retirement in Europe (SHARE). SHARE includes rich information about health, employment, financial situation, family contacts, and social activities of a representative sample of the 50+ populations in a number of European countries (Börsch-Supan et al., 2005, 2008). The COMPARE sample consists of random subsamples of the complete SHARE samples in 11 countries. Respondents in these subsamples did the complete face to face SHARE interview and then completed a drop-off questionnaire with self-assessed satisfaction with various domains of life and with vignette evaluations for the same domains; see Van Soest (2008). SHARE respondents in the other subsamples got a completely different drop-off questionnaire. Response rates to the main survey and the drop-off were similar for the COMPARE sample and the remaining SHARE sample. The COMPARE sample includes about 7000 individuals aged 50+ from eleven European countries: Belgium, Czech Republic, Denmark, France, Germany, Greece, Italy, the Netherlands, Poland, Spain, and Sweden.
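To make the likelihood of Section 2 concrete, the following is a minimal sketch of equations (1)-(7) for five response categories, written with the Python standard library only. It is an illustration under stated assumptions, not the estimation code used for the paper: the function names, the list-based data layout, and the use of `math.erf` for the normal distribution function are all choices made here, and no optimizer is included.

```python
import math


def norm_cdf(x, sd=1.0):
    """Distribution function of a normal variable with mean 0 and std. dev. sd."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def dot(x, coef):
    """Inner product of a covariate vector and a coefficient vector."""
    return sum(a * b for a, b in zip(x, coef))


def thresholds(x, gammas):
    """Equation (3): tau^1 = x'gamma^1, tau^j = tau^(j-1) + exp(x'gamma^j).

    Returns [tau^0, ..., tau^5] with tau^0 = -inf and tau^5 = +inf.
    """
    taus = [dot(x, gammas[0])]
    for gamma_j in gammas[1:]:
        taus.append(taus[-1] + math.exp(dot(x, gamma_j)))
    return [-math.inf] + taus + [math.inf]


def category_prob(j, location, taus, sd=1.0):
    """P(response = j), j = 1..5, for a latent variable centred at `location`."""
    return norm_cdf(taus[j] - location, sd) - norm_cdf(taus[j - 1] - location, sd)


def log_likelihood(beta, gammas, theta, sigma_v, data):
    """Log of equation (5): self-assessment part (6) plus vignette part (7).

    `data` is a list of (x, s, vignettes) tuples (hypothetical layout):
    x is the covariate list, s the self-report in 1..5, and vignettes
    the list of ratings V^1..V^K, each in 1..5.
    """
    total = 0.0
    for x, s, vignettes in data:
        taus = thresholds(x, gammas)  # same scale for both parts (response consistency)
        total += math.log(category_prob(s, dot(x, beta), taus))
        for k, rating in enumerate(vignettes):
            total += math.log(category_prob(rating, theta[k], taus, sigma_v))
    return total
```

Because the same thresholds enter both parts, adding a vector to both $\beta$ and $\gamma^1$ leaves the self-assessment probabilities unchanged but changes the vignette probabilities, which is the identification argument made in the text. In practice the log-likelihood would be handed to a numerical optimizer, and very small probabilities would need guarding before taking logs.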
Income satisfaction and anchoring vignettes

Objective measures of economic poverty across countries are typically based upon household income or household consumption expenditures corrected for purchasing power differences and differences in household composition. Such measures, however, are likely to provide only a partial measure of poverty, since whether people can make ends meet may also depend on other factors such as access to cheap housing, availability of help from family, friends, or neighbours, or the availability of free public goods and services such as health care. A more general assessment of living standard is the answer to the income satisfaction question:

How satisfied are you with the total income of your household?
Very dissatisfied / Dissatisfied / Neither satisfied, nor dissatisfied / Satisfied / Very satisfied

The distribution of income satisfaction among individuals aged 50+ across countries is presented in Table 1. The ranking of the countries varies with the chosen cut-off point. For example, the percentage of individuals who are satisfied or very satisfied with their income is higher in Spain than in France, but the percentage of individuals who are very dissatisfied or dissatisfied is slightly lower in France than in Spain. To compare the complete income satisfaction distributions and investigate whether an unambiguous ranking across subsets of countries can be obtained, Figure 1 is presented. It is based upon the numbers in Table 1 and compares the cumulative distribution of reported satisfaction with income across countries by stacking percentages of each outcome. For example, the left hand bars indicate that in Poland, 14% are very dissatisfied, 45% are very dissatisfied or dissatisfied, 77% are very dissatisfied, dissatisfied, or neither satisfied nor dissatisfied, etc. The countries are ranked on the basis of the latter percentages: Poland has the largest percentage at most neither satisfied nor dissatisfied, and, correspondingly, the lowest percentage satisfied or very satisfied, so that Polish respondents report the worst income satisfaction if we set the cut-off between neither satisfied nor dissatisfied and satisfied. The graph shows, however, that Poland does worse than every other country whichever cut-off we use. For example, the percentage very dissatisfied or dissatisfied is higher in Poland (45%) than in any other country. In other words, reported income satisfaction is unambiguously worse in Poland than in all other countries.
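The cut-off comparison described above can be sketched as a simple dominance check on cumulative percentages. This is an illustrative sketch only: the function name is hypothetical, the Polish figures (14, 45, 77) are the three cumulative percentages quoted in the text, and the two other "countries" are made-up numbers chosen to show a case where the ranking depends on the cut-off.

```python
def worse_at_every_cutoff(cum_a, cum_b):
    """True if country A has at least as large a cumulative share as country B
    at every cut-off (counting from 'very dissatisfied' upwards), and a
    strictly larger share at one cut-off or more."""
    if len(cum_a) != len(cum_b):
        raise ValueError("both countries need the same cut-offs")
    pairs = list(zip(cum_a, cum_b))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


# Cumulative % at the first three cut-offs: very dissatisfied; up to
# dissatisfied; up to neither satisfied nor dissatisfied.
poland = [14, 45, 77]      # figures quoted in the text
country_x = [5, 20, 50]    # hypothetical numbers, not from Table 1
country_y = [3, 25, 48]    # hypothetical numbers, not from Table 1

print(worse_at_every_cutoff(poland, country_x))     # unambiguous ranking
print(worse_at_every_cutoff(country_x, country_y))  # depends on the cut-off
print(worse_at_every_cutoff(country_y, country_x))  # depends on the cut-off
```

With these inputs the first comparison holds at every cut-off, while the two hypothetical countries cross, so neither is unambiguously worse than the other.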
Such an unambiguous ranking of pairs of countries is not always possible. For example, if the cut-off is put between satisfied and neither satisfied nor dissatisfied, Spain does better than France or the Czech Republic, but this reverses if the cut-off is between dissatisfied and neither satisfied nor dissatisfied. The figure also shows that Denmark, the Netherlands, and Sweden unambiguously rank first, second and third, respectively, followed by Germany and Belgium.

Figure 2 compares income satisfaction and equivalent monthly household income by country, using the modified OECD equivalence scale (1 + 0.5*(adult - 1) + 0.3*child, where adult is the number of adults (15 years and older) in the household and child is the number of children (at most 14 years old)). [5] Like Table 1, this figure is based upon reported income satisfaction, and therefore does not take into account the fact that individuals from different countries may use different response scales. The horizontal axis gives the country-specific mean of equivalent monthly net household income corrected for PPP differences, while the vertical axis gives the percentage of individuals who are satisfied or very satisfied with their income. The figure suggests a strong positive (and linear) relationship between income and income satisfaction, except that France does not seem to fit this relationship. While France has quite high household income, it performs poorly in terms of income satisfaction.

[5] The equivalence scales are used in Figures 2 and 4 only, and we therefore chose to use a simple equivalence scale common to all countries. Of course there are many alternative equivalence scales, including country specific ones, as in, for example, Van Praag and Van der Sar (1988).

While the subjective income satisfaction measure has the advantage of encompassing many aspects of economic well-being, it has the drawback that it may suffer from differential item functioning (DIF): individuals in different countries may use different response scales and give different answers although they are economically equally well off. Vignettes describing hypothetical people in given economic circumstances are used in order to correct for these response scale differences. In the COMPARE sample, the vignette questions about income satisfaction are the following:

Vignette 1: Jim is married and has two children; the total after tax household income of his family is €1,500 per month. How satisfied do you think Jim is with the total income of his household?
Very dissatisfied / Dissatisfied / Neither satisfied nor dissatisfied / Satisfied / Very satisfied

Vignette 2: Anne is married and has two children; the total after tax household income of her family is €3,000 per month. How satisfied do you think Anne is with the total income of her household?
Very dissatisfied / Dissatisfied / Neither satisfied nor dissatisfied / Satisfied / Very satisfied

The amounts used for net household income in the above vignettes, i.e. €1,500 and €3,000, are the amounts used in the vignette questions in France, Belgium and the Netherlands, in which the purchasing power of one euro was almost identical. In other countries, PPP adjusted amounts were used in local currencies. [6] The underlying assumption here, which is necessary for vignette equivalence, is that the living standard that income satisfaction is trying to measure is not affected by the distribution of income in the country of residence. This distribution may affect the answers to the income satisfaction question, but only because it changes the social norms and therefore the response scales, not because it makes someone genuinely better or worse off. [7] The chosen amounts (€1,500 and €3,000) place vignettes 1 and 2 between the 20th and 25th and between the 70th and 75th percentiles of the actual equivalized income distribution pooled over all countries. Because of the large cross-country differences in real incomes, the country specific positions vary from the lowest to the highest decile.

[6] The amounts in vignette 1 were 24,000CK in the Czech Republic, 14,200DK in Denmark, €1,550 in Germany, €1,200 in Greece, €1,450 in Italy, 3,300PZ in Poland, €1,300 in Spain and 15,400SK in Sweden. The amounts in vignette 2 were always twice as high. As pointed out by a referee, the different degrees of rounding might have effects on the responses, but we do not think this is a major issue.
[7] Kapteyn et al. (2008) make the opposite assumption that the living standard is purely relative, and therefore use vignettes with multiples of country specific median incomes. Which assumption is better seems to depend on the interpretation of the living standard concept one is trying to measure.

Tables 2 and 3 display the distribution of responses to the two vignette questions by country. As expected, the income satisfaction assigned to Vignette 1 is always much lower than for Vignette 2. For both vignettes, there are substantial differences across countries, pointing at systematic differences in response styles across European countries. For example, the low-income vignette in Table 2 is rated as satisfactory or very satisfactory by about 61% of the older individuals in Poland, by only 12% in France or 11% in Sweden, and by no one at all in Greece. The high-income vignette in Table 3 is rated as very satisfied by 52% of older individuals in Poland, compared to only 17% in France and 14% in Greece.

Job satisfaction and anchoring vignettes

Job satisfaction is measured in the COMPARE survey by a single satisfaction question asked to all respondents (ages 50 and over):

How satisfied are you with your daily activities (for example, your job, if you work)?
Very dissatisfied / Dissatisfied / Neither satisfied, nor dissatisfied / Satisfied / Very satisfied

For this paper, we only consider the responses of 50-64 year old respondents who do paid work; satisfaction with other daily activities is beyond the scope of the current study. Table 4 presents the frequency distributions in each country. On average, older workers are satisfied with their job: 80% of the workers in the total sample report either satisfied or very satisfied. The differences across countries are substantial, however. Figure 3, constructed in the same way as Figure 1, presents the cumulative distribution of job satisfaction by country. Once again, the distribution of Denmark dominates the distribution of all other countries, followed by Sweden and the Netherlands. At the other end of the country ranking, we find Greece and France, where the proportion of individuals who are satisfied or very satisfied with their job is lowest, and the Czech Republic. Interestingly, the ranking of Poland depends crucially on the cut-off point: looking at the proportion of satisfied or very satisfied individuals, Poland does quite well and ranks fourth, but Poland is also the country with the lowest proportion of very satisfied workers.

This cross-country ranking in job satisfaction is largely consistent with the international comparisons including younger workers of Sousa-Poza and Sousa-Poza (2000), based on data on Work Orientations from the 1997 International Social Survey Program (ISSP), and Kristensen and Johansson (2008), based on data collected in seven European countries in 2004. In line with our study, they find that respondents in Northern countries, especially the Danes, are the most satisfied with their job while the French and Greeks rate their job satisfaction quite low.

To correct for potential differences in response scales in the job satisfaction assessments, each respondent younger than 65 years in the COMPARE sample also got two job satisfaction vignettes, describing hypothetical workers with given job characteristics. [8] They are asked to rate the job satisfaction of these hypothetical workers on the same scale used to measure their own job satisfaction. The following two vignette questions are asked:

Vignette 1: Mike works full-time, five days per week; in principle, he can organize his work in his own way but is still often under a lot of pressure to meet deadlines. He works for a big company and feels that his job is quite secure. How satisfied do you think Mike is with his job?
Very dissatisfied / Dissatisfied / Neither satisfied, nor dissatisfied / Satisfied / Very satisfied

Vignette 2: Sally works four days per week and does not experience her job as stressful; she has little say over what she is doing, this is decided by her boss. She feels it is a very secure job. How satisfied do you think Sally is with her job?
Very dissatisfied / Dissatisfied / Neither satisfied, nor dissatisfied / Satisfied / Very satisfied

[8] Respondents of age 65 or older got vignettes on other daily activities.

These vignettes only describe a subset of all possible job characteristics (hours of work, whether the job is stressful, control over activities, job security) but not, for example, the wage. Ideally, vignettes should be complete, but there is a trade-off between being as complete as possible and the drawbacks of long stories that many respondents will not read seriously. Whether the current vignettes are sufficient remains a topic of future research.

Tables 5 and 6 present the frequency distributions of the job satisfaction vignette assessments by country. The job in Vignette 2 is seen as less satisfactory than the job in Vignette 1. Differences across countries are again substantial. Danish respondents are quite positive about the first vignette in particular (with 78% evaluating it as satisfied or very satisfied), while Spanish respondents are very critical of this vignette (52% satisfied or very satisfied). On the other hand, the Swedes are particularly critical about the job in Vignette 2.

Explanatory variables

In addition to country dummies, the regressors in the econometric model include socio-demographics such as gender, age, marital status, years of education, dummies for employment status, and the logarithm of net household income last month, adjusted for PPP differences across countries. 9 We also include two health indicators: the numbers of self-reported symptoms and chronic diseases. See Appendix, Table A1, for details on variable definitions and sample statistics. The latter reveal large differences across countries in many of the explanatory variables, including those reflecting health or occupational status. The job satisfaction model also includes variables describing job conditions, such as workload, recognition, job security, monthly net labour income and usual hours worked per week.
Job conditions are measured by asking whether respondents strongly agree, agree, disagree, or strongly disagree with the statements: "My job is physically demanding"; "I am under constant time pressure due to a heavy workload"; "I have very little freedom to decide how I do my work"; "I have an opportunity to develop new skills"; "I receive adequate support in difficult situations"; "I receive the recognition I deserve for my work"; "My job promotion prospects/prospects for job advancement are poor"; "My job security is poor". For each statement, a dummy is created which is equal to one either when the respondent disagrees or strongly disagrees or when the respondent agrees or strongly agrees, depending on the statement. See Appendix, Table A2, for details and sample statistics, again showing large differences across countries.

9 Missing household incomes were imputed using, among other variables, an alternative measure of household income as one of the predictors. See Appendix for details.
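The dummy coding of the job-condition statements can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `condition_dummy`, the response labels, and the choice of which statement is coded on the "disagree" side are assumptions made for the example; Table A2 in the Appendix gives the actual definitions.

```python
AGREE = {"strongly agree", "agree"}
DISAGREE = {"disagree", "strongly disagree"}


def condition_dummy(response, one_if="agree"):
    """Return 1 if the answer falls on the chosen side of the scale, else 0."""
    if one_if not in ("agree", "disagree"):
        raise ValueError("one_if must be 'agree' or 'disagree'")
    answer = response.strip().lower()
    if answer not in AGREE | DISAGREE:
        raise ValueError(f"unexpected response: {response!r}")
    side = AGREE if one_if == "agree" else DISAGREE
    return int(answer in side)


# Hypothetical respondent; the coding direction per statement is assumed here.
# "I am under constant time pressure due to a heavy workload": agreeing -> 1.
heavy_workload = condition_dummy("Agree")
# "My job security is poor": rejecting the statement -> 1 (job is secure).
job_security = condition_dummy("Strongly disagree", one_if="disagree")
```

Coding negatively worded statements on the "disagree" side lets a coefficient be read directly as the effect of, say, job security rather than of its absence.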

4. Estimation Results

Income satisfaction

Table 7 presents the parameter estimates of the main equation for the model with identical thresholds for everyone (the baseline model, column (i); these estimates are virtually identical to those of a simple ordered probit model) and the estimates of the (conditional) hopit model (the remaining columns), taking account of differences in response scales (DIF). The results for the baseline model are in accordance with most findings in the literature. As expected, household income has a strong positive effect on income satisfaction, while household size has a substantial negative effect. In terms of equivalence scales, the estimates imply that an increase in family size from one to two household members would require an increase in household income of almost 29% to keep income satisfaction constant. 10 In other words, the estimated equivalence scale is 1.29. This result is comparable to the results of Van Praag and Van der Sar (1988, Table 3), whose results imply equivalence scales between 1.15 and 1.35 for eight out of nine countries (for Ireland, they find a much lower number). The estimate of Ferrer-i-Carbonell and Van Praag (2008, Table 3.1.4) for the UK is 1.31, also very similar to what we find. 11

Conditional on income (and other covariates), higher educated individuals are more satisfied with their income. This is consistent with results of Kapteyn et al. (2008), who point out it may be due to the fact that higher educated people have higher permanent income, or to the fact that our measure of income is imperfect so that education is a proxy for the deviation between self-reported income and actual income. The estimated effect of an additional year of education is about the same as the effect of a 2% rise in household income. Women tend to report higher income satisfaction than men. Age has a positive effect, 12 while poor health (number of symptoms and number of chronic diseases) reduces income satisfaction.
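The link between the coefficients and the equivalence scale can be illustrated with a short sketch. It assumes that household size and household income both enter the satisfaction equation in logarithms, so that the scale for an N-person household is N raised to the power of minus the ratio of the two coefficients. The function names are hypothetical, and the coefficient ratio is backed out from the reported scale of 1.29 rather than taken from Table 7.

```python
import math


def equivalence_scale(beta_hhsize, beta_hhincome, n):
    """Income ratio y_N / y_1 that keeps satisfaction constant for n members."""
    return n ** (-beta_hhsize / beta_hhincome)


def implied_ratio(scale, n=2):
    """Value of -beta_hhsize / beta_hhincome implied by a given scale."""
    return math.log(scale) / math.log(n)


# Ratio implied by the reported two-person scale of 1.29 (baseline model).
baseline_ratio = implied_ratio(1.29)

# Only the ratio matters, so beta_hhincome is normalised to 1 for the check.
check = equivalence_scale(-baseline_ratio, 1.0, 2)

# Under the same (assumed) functional form, the scale for three members.
three_person = equivalence_scale(-baseline_ratio, 1.0, 3)
```

Because only the ratio of the two coefficients enters, the scale is unaffected by any common rescaling of the coefficients, which is convenient when comparing models whose coefficients differ in size.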
Keeping other variables constant, we find no significant differences in income satisfaction between workers, retirees, or individuals receiving disability benefits, but unemployed individuals experience a significantly lower income satisfaction than workers, while inactive persons are more satisfied than workers. Country dummies indicate that, conditional on income and other covariates, French respondents report the lowest income satisfaction level while Danish respondents report the highest level. Interestingly, keeping the other covariates constant, Polish respondents report about the same level of income satisfaction as German respondents. The fact that Polish respondents report low income satisfaction (Table 1) is therefore mainly explained by the characteristics of the Polish respondents, particularly their low income and large family size.

10 The formula to derive the equivalence scale is the following: $y_N / y_1 = N^{-\beta_{hhsize} / \beta_{hhincome}}$, where $y_N$ is the income that a household with $N$ individuals should have in order to have the same income satisfaction as a single-person household with an income of $y_1$.
11 For Germany, Ferrer-i-Carbonell and Van Praag (2008) present results for East and West and for workers and nonworkers that imply equivalence scales ranging from 1.07 to 1.46.
12 Adding age squared (in all equations) hardly improved the fit and did not change any of the substantive results. We therefore present the specification with a linear age term only, which is easier to interpret.

Allowing for DIF substantially modifies the estimates of the satisfaction equation (see the hopit columns of Table 7). The likelihood-ratio test strongly rejects the constrained model of no DIF against the more general model allowing for DIF (LR = 2256; 84 degrees of freedom). The coefficient on household income is much higher once we control for DIF, suggesting that individuals with higher income are more demanding: they evaluate a given income as less satisfactory than low income individuals with the same other characteristics. The effect of family size also increases, and this approximately compensates the increased income effect so that the equivalence scale does not change much compared to the baseline model: a two person household needs 32% more than a one person household according to the model with DIF, compared to 29% in the baseline model. The effects of education and gender are also much higher than in the baseline model. On the other hand, the effects of other socio-economic variables (age, employment status, health) do not change much or even decrease.

Many of the socio-economic characteristics significantly affect the thresholds, particularly the first threshold (see Table 7). The differences between effects on income satisfaction in the two models can be explained by the effects of the same background variables on the thresholds.
For example, income has a positive effect on the first threshold, implying that higher income respondents will more often assess a given income as very unsatisfactory. This is in line with the notion that higher income makes people more demanding; see, for example, Van Praag and Van der Sar (1988), who find that the (stated) income required to achieve a given utility level increases with actual income. Our model specification implies that a shift in the first threshold also leads to a parallel shift in all other thresholds, and our estimates of the income coefficients in $\gamma_1$, $\gamma_2$, $\gamma_3$ and $\gamma_4$ imply that higher income respondents are more critical at all cut-off points, not only the first.

Thresholds also significantly depend on the country dummies. Italians, for example, use higher thresholds (i.e., tend to give more negative assessments) than Germans throughout the scale. As was already clear from Tables 2 and 3, Greek respondents tend to give quite negative vignette evaluations, translating into an unusually high first threshold. As a consequence, the coefficients on the country dummies in the income satisfaction equation turn out to be quite different in the hopit and the baseline model. Polish respondents tend to evaluate the vignettes quite positively, and when this is corrected for, they are worse off than respondents in any other country with the same income and other characteristics. The ranks of the Czech Republic, the Netherlands, and Germany also worsen substantially when correcting for DIF: respondents in all these countries use relatively optimistic evaluation scales and are worse off when this is corrected for. The opposite is found for Greek respondents: for given income and other characteristics, they are in 10th place in the model without DIF, but correcting for their very negative evaluations moves them to 2nd place. Correcting for DIF also improves the position of Italy and Spain (e.g., significantly better than Germany).

Job satisfaction

Table 8 presents the results for the ordered probit model (columns (i) and (iii)) and the hopit model (columns (ii) and (iv)) for job satisfaction among 50-64 year-old workers. The first two columns show the results without taking into account job conditions other than hours worked and earnings, while the last two columns add a richer set of job characteristics. 13 As for income satisfaction, a likelihood-ratio test strongly rejects the constrained model without DIF against the more general model allowing for DIF for both specifications: LR = 256.2 (df = 68) for the model without the set of job characteristics (in either equation (1) or equation (3)) and LR = 302.0 (df = 100) for the specification including them (in equations (1) and (3)).

The ordered probit model suggests that, keeping individual and job characteristics constant, women report being more satisfied with their job than men. This is in accordance with many other studies on job satisfaction (Clark, 1997; Kaiser, 2007).
Once DIF is corrected for, however, the difference between women and men is not significant anymore, suggesting that women report being more satisfied with their job because they have different response scales. A reason for this may be that they have lower work expectations than men and are therefore less demanding (Phelan, 1994). Age has a significant positive effect on job satisfaction in both models. Note also that the age effect may reflect a selection process if less satisfied workers retire earlier than more satisfied workers. Years of education has no significant effect on job satisfaction whichever model is considered. Health symptoms have a significant negative effect on job satisfaction in both models. Their effect is lower when the larger set of job characteristics is included in the model, since health problems are associated with unattractive job characteristics. Higher earnings have a positive effect on job satisfaction, but this effect is insignificant when more job conditions are included, suggesting that attractive job characteristics (that are correlated with high wages) are more important than the wage itself.

13 Estimates of the parameters determining the thresholds are not presented to save space. They are available upon request from the authors.

The existing literature on job satisfaction and working hours provides mixed results. While Clark and Oswald (1996) find a negative relationship between working hours and job satisfaction, Drakopoulos and Theodossiou (1997) find no significant effect. All our models suggest that, keeping monthly earnings constant, there is no significant relation between job satisfaction and working hours of older workers in Europe.

The final two columns show that most job characteristics significantly affect job satisfaction with the expected sign. The magnitudes of some of the coefficients change when DIF is controlled for, but signs and significance levels do not change much. A heavy workload has a negative effect while the opportunity to develop new skills, receiving adequate support in difficult situations, recognition for the job, job advancement opportunities, and job security all have a positive influence on job satisfaction. The largest impact on overall job satisfaction comes from recognition for the job and from receiving support in difficult situations. Opportunities for developing new skills and future job advancement are also important. This may seem surprising given the fact that the sample consists of older workers who are approaching retirement age. Whether the job is physically demanding and (in the hopit model) freedom at work have no significant effect. These results support the hypothesis that non-pecuniary job characteristics are important for job satisfaction, confirming findings for broader age groups (Clark, 2005; Skalli et al., 2008).
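The likelihood-ratio statistics reported in this section (LR = 2256 with 84 degrees of freedom for income satisfaction; LR = 256.2 with 68 and LR = 302.0 with 100 degrees of freedom for job satisfaction) can be compared with the chi-squared distribution along the following lines. This is only a sketch using the Python standard library: the function name is hypothetical, and the closed form used applies to even degrees of freedom, which happens to cover all three tests. The tail probability is computed in logs because p-values this small underflow ordinary floating point.

```python
import math


def log_chi2_sf_even_df(x, df):
    """Natural log of P(chi2_df > x) for even df.

    Uses the closed form sf = exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!,
    evaluated in logs so that tiny p-values do not underflow to zero.
    """
    if x <= 0 or df <= 0 or df % 2 != 0:
        raise ValueError("this sketch needs x > 0 and a positive even df")
    half = x / 2.0
    terms = [k * math.log(half) - math.lgamma(k + 1) for k in range(df // 2)]
    peak = max(terms)
    log_sum = peak + math.log(sum(math.exp(t - peak) for t in terms))
    return log_sum - half


# (LR statistic, degrees of freedom) as reported in the text.
reported = {
    "income": (2256.0, 84),
    "job, earnings and hours only": (256.2, 68),
    "job, rich set of characteristics": (302.0, 100),
}
log_p = {name: log_chi2_sf_even_df(lr, df) for name, (lr, df) in reported.items()}
rejects_at_1_percent = {name: value < math.log(0.01) for name, value in log_p.items()}
```

Each statistic lies far in the upper tail of its reference distribution, consistent with the statement that the model without DIF is strongly rejected.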
The coefficients of the country dummies reflect ceteris paribus differences between respective countries and Germany, keeping constant individual characteristics and job characteristics (earnings and hours only in columns (i) and (ii), or the larger set of job characteristics in columns (iii) and (iv)). Some of them are strongly significant, and which ones these are varies across the four model specifications. Correcting for differences in response scales mainly affects the position of Denmark, Sweden, and France. Compared to Germans, Danish and French workers tend to use the more positive and more negative responses, respectively (cf. Table 6); once this is taken into account in the models with DIF, their job satisfaction levels are not significantly different from those of German workers with the same characteristics. Swedish workers evaluate a given job more negatively than German workers (cf. Table 6) and when this is corrected for in the models with DIF, their job satisfaction levels are actually higher than those of similar Germans. 14

In the final model in the last column of Table 8, the only countries which are significantly different from Germany are Greece and Sweden. In all other countries, keeping response scales, individual characteristics, and the rich set of job characteristics constant, job satisfaction levels are not significantly different from those in Germany. Greek workers are less satisfied than Germans with similar jobs. Only Swedish workers are significantly more satisfied, possibly pointing at some attractive unobserved job characteristics that are particularly relevant in Sweden, such as a more positive attitude towards older workers than in other countries. This would be in line with Wadensjö (2006), who argues that Swedish firms are willing to share the responsibility of society to increase employability of older workers and sees this as one of the explanations of the success of the Swedish partial retirement program.

5. Counterfactuals

To understand the implications of our approach, we simulate the distribution of income or job satisfaction in each country using different thresholds: the thresholds that the average respondent in the benchmark country (Germany) 15 would use instead of the actual thresholds used by the respondent. The latter simulation (own thresholds) almost exactly 16 reproduces the observed distribution of reported satisfaction levels in each country, presented in Tables 1 and 4 and Figures 1 and 3. The simulation of interest, however, which uses each country's own parameters in the satisfaction equation but the threshold parameters for Germany, produces a counterfactual distribution without observational equivalent. Comparing these counterfactual simulations across countries shows how much of the difference between each country and the benchmark country remains when differences due to DIF are eliminated.
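The logic of the counterfactual simulation can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the threshold parametrisation commonly used in hopit models (first threshold linear in characteristics, each later threshold equal to the previous one plus an exponential term), a standard normal error, and purely hypothetical parameter values and respondents. All function and variable names are invented for the example.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(x, g):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(x, g))


def thresholds(x, gammas):
    """Assumed hopit form: tau_1 = x'g_1 and tau_j = tau_{j-1} + exp(x'g_j)."""
    taus = [dot(x, gammas[0])]
    for g in gammas[1:]:
        taus.append(taus[-1] + math.exp(dot(x, g)))
    return taus


def category_probs(index, taus):
    """Ordered-probit probabilities for the len(taus) + 1 answer categories."""
    cdf = [0.0] + [normal_cdf(t - index) for t in taus] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def share_satisfied(indices, taus_per_person):
    """Average probability of the top two categories (satisfied, very satisfied)."""
    probs = [sum(category_probs(i, t)[-2:]) for i, t in zip(indices, taus_per_person)]
    return sum(probs) / len(probs)


# Hypothetical inputs: x = (constant, log income), four thresholds, five categories.
gammas = [[-1.0, 0.3], [0.0, 0.0], [-0.2, 0.0], [0.1, 0.0]]
xs = [[1.0, 1.0], [1.0, 1.5], [1.0, 2.0]]  # respondents in some country
indices = [0.2, 0.5, 0.9]                  # their values of the satisfaction index
benchmark_x = [1.0, 0.5]                   # average benchmark-country respondent

# Own thresholds: should track the reported distribution.
own = share_satisfied(indices, [thresholds(x, gammas) for x in xs])

# Benchmark thresholds: same satisfaction index, common response scale.
benchmark_taus = thresholds(benchmark_x, gammas)
counterfactual = share_satisfied(indices, [benchmark_taus] * len(xs))
```

In this made-up example the respondents use higher thresholds than the benchmark respondent, so replacing their thresholds with the benchmark ones raises the simulated share of satisfied respondents. That is the kind of shift the text describes for countries whose respondents use relatively negative response scales.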
14 The changes in the country ranking when correcting for DIF can be compared with results of Kristensen and Johansson (2008) for all workers. Similar to what we find, they find that the ranking of France improves, while that of Denmark worsens. Different from our findings, however, in their study correcting for DIF substantially improves job satisfaction in the Netherlands and Greece and worsens it in Spain. Our other countries are not in their data set.
15 For each respondent, we replace the thresholds by thresholds of the average German respondent (i.e. with the average individual characteristics of the German sample).
16 The fit is not exact due to finite sample errors, simulation errors, and, possibly, the fact that the model may not fit the data perfectly well.

Income satisfaction

Figure 4 is similar to Figure 2 but uses the counterfactual simulation to construct the values along the vertical axis. It presents, for each country, the proportion of individuals who would report being satisfied or very satisfied with their income if they used the German benchmark thresholds. The horizontal axis gives the corresponding equivalent monthly household income, as in Figure 2. Compared to Figure 2, income satisfaction in France is now much more in accordance with income satisfaction in other countries with a similar income level. The low proportion of individuals reporting being satisfied with their income in France that we saw in Figure 2 apparently was partly due to DIF. Greece moves from a relatively low satisfaction (given its actual income level) to a relatively high satisfaction country. Correcting for response scale differences makes the difference between Poland and the other countries even larger than before. All in all, the correction brings the ranking of the countries more in line with the ranking of their income levels. The Spearman rank correlation coefficient is equal to 0.66 when DIF is taken into account while it is equal to 0.64 in the raw data; the Pearson correlation coefficient increases from 0.74 to 0.84 when we control for DIF.

Figure 5 presents the complete counterfactual cumulative income satisfaction distribution for all countries using German benchmark thresholds. It confirms that correcting for DIF has important effects on the country ranking. First, the ranking between Sweden and the Netherlands is reversed, a consequence of correcting for the fact that Swedish respondents tend to assess vignettes with a given income level more negatively than Dutch respondents. Second, there is hardly any difference left between Belgium, Italy and Germany once DIF is eliminated. As in Figure 4, one of the most salient changes due to eliminating DIF concerns France. Using German scales, French respondents would be much more satisfied with their incomes than their actual reports (based upon the French scales) suggest, and France becomes an average country. As expected given the estimation results and Figure 4, Greece does much better after correcting for DIF than before.
Finally, the cumulative distribution function of income satisfaction in Spain no longer crosses that of the Czech Republic. Spain does unambiguously better than the Czech Republic.

Job satisfaction

The counterfactual cumulative distributions of job satisfaction assuming that all individuals use the German benchmark thresholds are presented in Figure 6. It is based upon the final model in Table 8 (column (iv)), including the rich set of job characteristics. The country ranking differs substantially from the one in Figure 3. Once differences in response scales are eliminated, Sweden becomes the country with the highest level of job satisfaction, with Denmark in second place, but at substantial distance. Greece is the country with the worst job satisfaction in both figures, but the difference with the other countries is much larger once DIF is corrected for. As for income satisfaction, job satisfaction in France increases when German rather than French thresholds are used.

Accounting for DIF reduces the cross-country association between job and income satisfaction: the cross-country rank correlation between country specific percentages of working respondents younger than 65 who are (at least) satisfied with their income and with their jobs decreases from 0.80 for reported satisfaction to 0.43 for the counterfactual rates using the German thresholds. 17 An interpretation is that response scales in different domains are positively correlated: respondents who tend to give negative evaluations in one domain will often do the same in another domain. For example, French respondents assign low satisfaction to the income vignettes as well as the job satisfaction vignettes compared to respondents in other countries. This illustrates that correcting for DIF may also be important to analyze the relation between satisfaction levels in various domains of life.

6. Conclusion

This paper analyses two important components of economic well-being among the 50+ in 11 European countries: satisfaction with household income and job satisfaction. The first one is important in order to assess the overall economic welfare of the elderly. The results highlight a large variation in self-reported income satisfaction. The lowest is found in Poland and the highest in Denmark. Differences across countries are partly explained by differences in response scales. Once these differences are eliminated, the cross-country differences are much better in line with differences in an objective measure of purchasing power of household income. Correcting for differences in response scales also alters the ranking across countries. The most striking change occurs for France, where respondents tend to use negative assessments more often than in other countries. When DIF is taken into account, the gap between Poland and the other countries widens.
17 The Pearson correlation decreases from 0.75 to 0.24.

An important motivation for this paper is that how a country compares to other countries in terms of living standard is an important input for public policy on old age social security and pensions and on combating poverty and social exclusion among the older part of the population. We have shown that it matters whether the country comparison is done with or without correcting for response scale differences (DIF). So should policy makers use the cross-country comparison with rather than without corrections for DIF? Under the assumptions that we have made, the answer is a clear yes: assuming differences in vignette evaluations purely reflect differences in the way terms like "very satisfied" and "not satisfied" are used, correcting self-assessments for such differences seems a good thing if the aim is to compare genuine living standards. This leads to the conclusion that living standard comparisons come much closer to objective comparisons of equivalized and PPP corrected average household incomes than the subjective income satisfaction reports would suggest.

There is an alternative interpretation of the differences in vignette evaluations, however. If, for example, goods are publicly provided (free of charge) in one country and not in another, or poor households can do more with a given income in one country than in another country, because of differences in, e.g., housing subsidies or health insurance, then a given income amount may lead to different living standards in different countries. In that case vignette equivalence would not be satisfied and our corrections would take away genuine differences in living standards. We do not think this can explain much of our results: for example, the fact that French respondents give negative assessments would then suggest that the French get less public support than respondents in similar countries, which seems implausible. A similar conclusion is drawn by Kapteyn et al. (2008) on the basis of comparing evaluations of vignettes with low and high incomes. Moreover, the tendency to give less positive evaluations in France is also found for other subjective well-being measures such as life satisfaction (Angelini et al., 2009), further supporting the notion of cultural differences in thresholds.

Older workers in Europe are generally satisfied with their jobs. Cross-country differences are not as large as for income satisfaction. Being able to develop new skills and having job advancement opportunities contribute substantially to job satisfaction, though recognition for the job is the most important factor.
Keeping job characteristics as well as response scales constant, Swedish workers are more satisfied than workers in all other countries considered, possibly due to a more positive attitude of employers towards older workers in Sweden than elsewhere. Sweden remains the country where job satisfaction among older workers is highest if cross-country variation in job characteristics is taken into account and only the response scales are kept constant. The raw data, however, do not reveal this, since the actual job satisfaction reports are also affected by response scale variation, leading to lower reported satisfaction in Sweden and higher satisfaction in Denmark, for example.

As for income satisfaction, correcting for response scale differences changes the ranking of the countries. Now that financial incentives for early retirement have been or are being removed, and other factors like job characteristics and job satisfaction are gaining importance for the decision to work longer, this seems an important message for national policy makers who compare the situation in their own country to that in other countries. Whereas looking at the raw data would suggest that Denmark is the European role model for job satisfaction of older workers, Sweden becomes the best performing country when controlling for the Danish tendency to use positive scales and the Swedish tendency to be more negative.

There are common features in the response scale differences in job satisfaction and income satisfaction. French respondents tend to be critical in both assessments, while Danish and Dutch respondents are always on the optimistic end of the spectrum. The tendency to give negative evaluations in France seems rather general; Angelini et al. (2009), for example, also find it for life satisfaction. As a consequence, correcting for DIF decreases the cross-country association between average income and job satisfaction among workers younger than 65.

The fact that correcting for DIF brings subjective and objective evaluations closer to each other can be seen as support for the validity of the vignettes approach as a tool for improving cross-country comparisons. It is in line with the finding of King et al. (2004) that correcting for DIF using anchoring vignettes increases the cross-country correlation between objective and subjective measures of health. Still, more work is needed to test the validity of the vignette approach in the domains considered and establish the robustness of the results. The main underlying assumptions are response consistency and vignette equivalence, which have been studied in other domains (e.g. Van Soest et al., 2007) but not for income and job satisfaction. Response consistency requires that respondents evaluate the hypothetical situations on the same scale that they use to evaluate themselves; this could be violated, for example, if self-assessments are affected by social desirability bias but vignette evaluations are not. We do not think this is particularly problematic in our case. Vignette equivalence means that respondents in different countries interpret the vignettes in the same way.
As discussed above, this is not an innocuous assumption, particularly in the context of income satisfaction, but we have also explained why we think our results are not due to violation of vignette equivalence. Still, validating the use of vignettes and testing these assumptions remains an important issue for future research.