Size-Dependent Tax Enforcement and Compliance

Policy Research Working Paper 8363 (WPS8363)

Size-Dependent Tax Enforcement and Compliance: Global Evidence and Aggregate Implications

Pierre Bachas, Roberto N. Fattal Jaef, Anders Jensen

Public Disclosure Authorized

Development Research Group, Macroeconomics and Growth Team
March 2018

Abstract

This paper studies the prevalence and consequences of size-dependent tax enforcement and compliance. The identification strategy uses the ranking of industries' average firm size in the United States as an instrument for the size ranking of the same industries in developing countries. Data on 125,000 firms in 140 countries show that tax enforcement and compliance increase with size. Size-dependence is more prevalent in low-income countries and is concentrated at the top of the size distribution. When quantified in a general equilibrium model, removing size-dependent enforcement leads to gains in Total Factor Productivity of up to 0.8 percent.

This paper is a product of the Macroeconomics and Growth Team, Development Research Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at pbachas@worldbank.org and rfattaljaef@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Size-Dependent Tax Enforcement and Compliance: Global Evidence and Aggregate Implications

Pierre Bachas, Roberto N. Fattal Jaef, and Anders Jensen

JEL codes: H25, H26, O23, O43, D61

Pierre Bachas: World Bank Research, pbachas@worldbank.org
Roberto Fattal Jaef: World Bank Research, rfattaljaef@worldbank.org
Anders Jensen: Harvard Kennedy School and NBER, anders_jensen@hks.harvard.edu

We are grateful to Alan Auerbach, Tim Besley, Michael Best, Natalie Cox, Tom Cunningham, Simon Galle, Roger Gordon, Michael Keen, Henrik Kleven, Camille Landais, Florian Misch, Ted Miguel, Torsten Persson, Andres Rodriguez-Clare, Emmanuel Saez, Johannes Spinnewijn, Munir Squires, Owen Zidar and seminar participants at the LSE, the 69th IIPF Congress and the 2014 ZEW Public Finance conference for valuable comments. Bachas gratefully acknowledges financial support from the Center for Equitable Growth and the Julis-Rabinowitz Center for Public Policy and Finance. Jensen gratefully acknowledges financial support from the Peter G. Foundation Grant No.16017 and the ESRC.

1 Introduction

An influential literature explains cross-country differences in income and productivity through the misallocation of resources across firms (Restuccia and Rogerson 2008, Hsieh and Klenow 2009). One salient property of measured misallocation is its productivity dependence: the most productive firms are smaller and the least productive firms are larger than in the output-maximizing allocation. This property of misallocation can emerge through government policies that target firm size; for example, size-dependent labor regulations, tax rates or accounting requirements can discourage firm growth and affect the firm size distribution (Gollin 1995, Guner et al. 2008, Bento and Restuccia 2017).

In this paper, we perform a quantitative evaluation of a specific feature of taxation which exerts a heterogeneous distortionary effect: size-dependent tax enforcement and compliance. We document the pervasiveness of this phenomenon around the world and provide estimates of the size-gradients across development levels and across the firm size distribution. We then characterize the implications of these size-gradients for Total Factor Productivity (TFP) in the context of a general equilibrium model of firm heterogeneity.

Exploiting arguably exogenous variation in an industry's optimal labor scale, we find a robust positive slope of industry average firm size, as measured by the number of employees, on tax inspection probability and compliance. This slope captures the average effect and masks non-linearities: inspection and compliance increase at the top of the size distribution and appear similarly lax among small and medium industries. Moreover, the magnitude of the gradient decreases with development, from its highest value in the poorest economies to zero for rich countries. When feeding our estimated size-gradients into a general equilibrium model of firm dynamics, we find improvements in TFP of up to 0.8% when removing the size-dependent component of taxation. These gains accrue both from the reversal of the misallocation and from the subsequent increase in firms' expenses on innovation and growth.

Tax enforcement is relevant for the study of policy-driven distortions for at least three reasons. First, enforcement reduces the scope for tax evasion and therefore directly impacts firms' effective tax rates. Second, many firms report facing large costs of dealing with the tax administration (World Bank 2017). [1] Third, reliance on size-dependent policies for tax enforcement has increased over time, as international institutions encouraged tax administrations to segment taxpayers (Kanbur and Keen 2014). To illustrate this trend, Figure 1 shows that over the past 20 years, more than 70 countries adopted special enforcement units for large taxpayers. While large taxpayer units hint at stringent enforcement at the top of the firm-size distribution, countries have also adopted enforcement policies targeted at small and medium firms. [2] Therefore, how tax inspection and compliance vary with firm size, and across countries around the world, is an empirical question.

The empirical analysis uses the comprehensive World Bank Enterprise Surveys (WBES), which contain firm-level data on self-reported tax inspection and tax compliance [3] for a sample of 125,000 firms in 140 countries. Identification is based on the idea that firm size is partially determined by technological factors (Lucas 1969; Kremer 1993). These technological factors pin down firms' optimal scale of operation (Bain 1954, Burnside 1996, Kumar et al. 1999, Basu and Fernald 2016). If firms in an industry share a common optimal scale across countries, then the relative ranking of two industries' scale in a plausibly undistorted market such as the US can serve as an instrument for the relative ranking of the same two industries' optimal scale in a developing country's distorted market. [4] We measure scale as the number of employees and take the average within three-digit ISIC sectors.

[1] The World Bank 2017 Doing Business publication reports that "on average taxpayers spend 25 hours complying with the requirements of an auditor, and go through several rounds of interactions during 10.6 weeks."
[2] Many countries implemented small and medium taxpayer offices alongside the large taxpayer office, which tailored audit algorithms to the evasion risks specific to smaller firms.
[3] The question on tax compliance is asked at the industry level. Since all our specifications are at the industry-country level, we do not require that the reported answers represent the firm's own tax compliance. Instead we only require that the answers accurately represent the industry's average compliance.
[4] This idea follows Rajan and Zingales 1998, who studied whether industries reliant on external finance grew faster in countries with more developed financial institutions. They instrument an industry's reliance on external finance with the external finance usage of the same sector in the US.

Identification relies on predicting the size ranking of industries at the 3-digit ISIC level in a WBES country from the size ranking in the US of the same ISIC3 industries. [5] The first-stage relation, that is, the conditional expectation of a WBES country's industry size rank across ranks of US industry size, is positive and linear. When repeated across subsamples of countries at different income levels, the slope remains constant: the US distribution has the same power to predict industry size ranking in Ethiopia, Indonesia or Brazil. [6] This suggests that the identification strategy captures technological differences in labor demand, which vary across industries but not across countries.

The US Census industry size distribution is a valid source of exogenous variation under the assumption that US firms determine their size orthogonally to tax enforcement. This assumption might be reasonable in the US, where the Internal Revenue Service has access to comprehensive sources of third-party information and might not need to rely on imprecise size-proxies for tax enforcement. In contrast, tax authorities in developing countries with low fiscal capacity may be constrained to use size-proxies.

We take the following steps to alleviate concerns with the identifying assumption. First, the estimates are robust when using the industry ranking based on European firms from Amadeus data, [7] instead of the US Census. These firms are subject to comprehensive financial reporting, and hence size-proxies might no longer be determinants of tax inspection. Second, since it has been documented that tax evasion is important among the self-employed in the U.S. (Blumenthal et al. 2001), we only consider firms with more than five employees in the US Census. [8] Third, for the few countries in the WBES with a GDP similar to that of the US, and thus similar tax enforcement capacity, we find no size gradient in tax inspection. [9] Finally, we support the exclusion restriction by showing that the predicted WBES size-ranking is not just proxying for competing channels that might drive tax inspection policies, namely capital intensity and reliance on external finance (Gordon and Li 2009).

[5] Since our identification relies on industry comparisons of firm size, we do not use the within-industry size variation. In effect, this reduces our sample to 12,152 ISIC3-country-year observations.
[6] This supports the following type of statement: if the average car manufacturer requires more workers than the average retail firm in the US, then this ranking of industries by size also holds in Ethiopia, Indonesia, and Brazil.
[7] We use Amadeus data for German and British firms. German data derive from financial statements filed with the business registry. British data come from audited annual reports presented to shareholders.
[8] The selection of firms with five or more employees also matches the sampling strategy of the WBES.
[9] We cannot test this directly on the US since it does not have a World Bank Enterprise Survey.

The IV-estimated size-gradients imply that a 10 percentile increase in the WBES size rank increases a firm's probability of tax inspection by 2.3 percentage points (a 3.8% increase relative to a mean of 61%) and its tax compliance rate by 2.2 percentage points (a 2.7% increase relative to a mean of 81%). We estimate two main control models which allow non-parametrically for the tax inspection function to vary over firm characteristics or over two-digit ISIC industries in every country-year. The latter specification exploits size variation between narrow ISIC3 industries, such as Manufacture of rubber products (category 251) compared to Manufacture of plastics products (category 252), within the ISIC2 category 25. In addition, we find comparable coefficients in panel models which exploit within-country variation in the ranking of 3-digit ISIC industries over time.

The size-gradients mask non-linearities over the size distribution. The increases in inspection and compliance appear concentrated among large firms, with no difference between small and medium firms. The symmetry of results on inspection and compliance suggests that size-based tax inspection may explain part of the compliance behavior. Finally, we study heterogeneity across countries' income levels: the size-gradient appears to fall with development, and the size-gradient for countries with an income level similar to that of the United States is zero and statistically different from that of low-income countries. A decreasing reliance on size-dependent policies with development is consistent with evidence showing that countries with weak fiscal capacity rely on production-inefficient tax instruments (Best et al. 2015, Bachas and Soto 2016).

In order to quantitatively evaluate the macroeconomic implications of the estimated size-gradients, we appeal to a standard general equilibrium model of firm dynamics.
Our closest reference in the literature is a closed economy version of Atkeson and Burstein [2010]. It features three channels through which size-dependent effective taxation can affect TFP: resource misallocation among incumbents, entry and exit of firms, and incentives to invest in innovation. [10] Our strategy to calibrate a productivity-dependent enforcement profile in the model is consistent with the identification strategy in the empirical analysis. Taking the estimates for the average size-gradient from the IV regressions for each income group, we use the firm size distribution in the US, and the model's implication that size maps one to one with productivity in the undistorted equilibrium, to back out a gradient between the probability of tax enforcement and the underlying productivity of the firms.

We then evaluate the TFP gains from reversing the size-dependence in taxation. Our baseline exercise fixes the probability of compliance at the level corresponding to the median size and applies it to all firms in the economy while keeping the statutory revenue and profit tax rates unchanged. As an alternative, we consider a case where, in addition to fixing the probability of taxation across firms, we readjust the tax rates so as to preserve the overall share of tax revenue in GDP and the share of profit tax in total revenue.

Our baseline counterfactual yields a TFP gain of 0.8% for the least developed group of countries, where the size-gradient is the highest, and is neutral for the richest group, where the compliance profile is flat. At the micro level, the model yields predictions that are consistent with the evidence on cross-country differences in the average size and life-cycle growth of firms (Bento and Restuccia 2017, Hsieh and Klenow 2014). Average firm size increases by up to 30%, and the aggregate innovation intensity in the economy expands by more than 10%.

The magnitude of the aggregate gains is smaller in the counterfactual with constant tax collection, reaching 0.3% for the lowest income group. The reason for the decline is that revenue and profit tax rates are increased in order to raise tax revenue and maintain its share in GDP. Since profit taxes discourage entry and innovation, even in the absence of size-dependence, the magnitude of the TFP gain is mitigated.

[10] By investment in innovation we do not mean only R&D that leads to frontier-changing innovations or new patents. Rather, we think of a broader concept of intangible capital accumulation that may indeed constitute path-breaking innovations but could also refer to adoption of frontier technologies or implementation of better management practices.

Quantitatively, we find lower distortions than those generated by financial frictions but comparable in size to those found when evaluating labor market policies. In the context of the latter, for instance, Hopenhayn and Rogerson [1993] find TFP losses of 1 to 2% from taxes on firing and hiring workers. Gourio and Roys [2014] and Garicano et al. [2016] find almost zero gains from reversing legislation that reduces the taxation of labor for small firms. Financial frictions are the most costly distortion, with potential gains from improving credit markets ranging from 5 to 40% depending on the margins of adjustment allowed for in the models (Midrigan and Xu 2014, Buera et al. 2011, Moll 2014).

1.1 Related Literature

Our paper contributes to two distinct literatures. First, an influential literature analyzes cross-country income differences through the misallocation of factor inputs across firms and sectors. Since the seminal work of Hsieh and Klenow [2009] and Bartelsman et al. [2013], many studies have applied the same methodologies to characterize the full extent of misallocation around the world. Furthermore, Restuccia and Rogerson [2008] and Guner et al. [2008] raised awareness of the size-dependent component of idiosyncratic distortions, showing that it is an important feature of the policies underlying misallocation and that it magnifies the associated TFP losses. The pervasiveness of misallocation motivated a number of studies investigating the allocative properties of specific policies or distortions. Among the most notable ones in this group are Hopenhayn and Rogerson [1993], Gourio and Roys [2014], and Garicano et al. [2016], evaluating policies related to labor-market regulation, and Buera et al. [2011], Midrigan and Xu [2014] and Moll [2014], focusing on credit market distortions. Our work contributes to the misallocation literature from the same angle as the latter set of studies.
We provide identified estimates of a particular distortionary policy, size-dependence in tax enforcement and compliance, and quantify the implications of this policy for aggregate TFP and firm behavior.

Second, our exercise is related to the literature on tax enforcement and third-party information (Kopczuk and Slemrod 2006, Gordon and Li 2009, de Paula and Scheinkman 2010, Pomeranz 2015, Naritomi 2016). Almunia and Lopez-Rodriguez [2017] show that Spanish firms bunch at the size threshold of the large taxpayer unit, while Zareh and Peichl [2016] find that Armenian firms bunch at the full account reporting threshold. Our results provide empirical support to theories where firm size is correlated with tax compliance (Kleven et al. 2016, Bigio and Zilberman 2011). In Kleven et al. [2016], tax avoidance strategies such as double book-keeping and collusion with employees to hide operations are impossible to sustain for large firms, since a single whistle-blower can reveal the entire operation. Therefore, large firms disclose third-party information and comply with their tax obligations. In this model the government enforcement strategy is fixed, and increased tax compliance with development is driven by growth in firms' size. However, our results are also consistent with models where increased tax inspection with size is an optimal government policy, given fiscal capacity constraints (Bigio and Zilberman 2011, Ito and Sallee 2016).

The paper is structured as follows. Section 2 discusses the data. Section 3 presents the identification strategy and empirical specifications. Section 4 shows the empirical results and their robustness. Section 5 presents the general equilibrium model, which is calibrated in Section 6 to quantify distortions from size-dependent tax enforcement. Section 7 concludes.

2 Data

To provide global evidence on the relation between industry average firm size, tax inspection and tax compliance, we use the comprehensive firm-level data from the World Bank Enterprise Surveys (WBES). The surveys cover 125,000 firms in 140 countries between 2003 and 2015. A subset of countries have multiple surveys over time, with an average of 1.9 surveys per country. The World Bank outsources data collection to third-party agencies in order to remove the official affiliation of the surveyors and not contaminate responses. The survey agencies draw upon the list of registered establishments provided by the national statistics office. [11] The random stratified sampling is done at the industry level, corresponding to the 2-digit ISIC level, and over-samples large firms to capture a large share of economic activity. Given the industry stratification, over-sampling of large firms does not affect the relative size of ISIC2 industries. However, it could affect the relative size of ISIC3 industries, which we use in some specifications. There we have to assume that, within ISIC3, firm sizes follow similar distribution shapes, such that over-sampling at the top does not affect relative rankings. Firms with fewer than five employees, government-owned establishments and co-operatives are dropped from the sampling frame. The surveyors contact firms from this stratified random sample and conduct the surveys with the person who most often deals with banks or government agencies.

We measure size as the average of the log of the number of employees per firm in a three-digit ISIC sector, [12] excluding part-time and temporary workers. We calculate size for all ISIC3-country-year cells in the WBES. The size distribution in the US is drawn from the 2002 Census of Employment and Wages. To be consistent with sampling in the WBES, we exclude firms with five employees or less from the Census. [13] One caveat is that we solely use industry-level variation in average firm size. In the data, the intraclass correlation between log employee size and industries is 15%, and hence a large share of firm size variation arises within industries.
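The size measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record keys (`country`, `year`, `isic3`, `employees`), the function names and the toy data are hypothetical and not part of the WBES, and the tie-breaking rule in the ranking is arbitrary.

```python
import math
from collections import defaultdict

def cell_mean_log_size(firms, min_employees=5):
    """Average log employment per (country, year, isic3) cell.

    `firms` is an iterable of dicts with hypothetical keys 'country',
    'year', 'isic3' and 'employees' (permanent full-time workers).
    Firms below `min_employees` are skipped, mirroring the WBES
    sampling frame, which drops firms with fewer than five employees.
    """
    logs = defaultdict(list)
    for f in firms:
        if f["employees"] >= min_employees:
            cell = (f["country"], f["year"], f["isic3"])
            logs[cell].append(math.log(f["employees"]))
    return {cell: sum(v) / len(v) for cell, v in logs.items()}

def rank_within_country_year(cell_size):
    """Rank industries by average size within each country-year (1 = smallest).

    Ties are broken by industry code, which is an arbitrary choice.
    """
    groups = defaultdict(list)
    for (country, year, isic3), size in cell_size.items():
        groups[(country, year)].append((size, isic3))
    ranks = {}
    for (country, year), items in groups.items():
        for r, (_, isic3) in enumerate(sorted(items), start=1):
            ranks[(country, year, isic3)] = r
    return ranks

# Toy data, invented for illustration only.
firms = [
    {"country": "ETH", "year": 2011, "isic3": "251", "employees": 10},
    {"country": "ETH", "year": 2011, "isic3": "251", "employees": 40},
    {"country": "ETH", "year": 2011, "isic3": "252", "employees": 100},
    {"country": "ETH", "year": 2011, "isic3": "521", "employees": 3},  # dropped
    {"country": "ETH", "year": 2011, "isic3": "521", "employees": 6},
]
sizes = cell_mean_log_size(firms)
ranks = rank_within_country_year(sizes)
```

Averaging logs rather than levels dampens the influence of a few very large firms on a cell's size, which matters because the survey over-samples large firms.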
[11] Sometimes supplemented with the list of firms registered with the chamber of commerce.
[12] Sectors follow version 3.1 of the ISIC international industry classification.
[13] In robustness tests we use an alternative size measure which places additional weight on the firms which have a higher proportion of the total sectoral employment, as suggested by Davis and Henrekson [1999] and Kumar et al. [1999].

We construct extensive and intensive margin measures of tax inspection within an ISIC3-country-year cell as, respectively, the share of firms which report an inspection by tax officials in the past 12 months, and the number of inspections over that period. To study tax compliance, we use the answer to the question: "Recognizing the difficulties many enterprises face in fully complying with taxes and regulations, what percentage of sales would you estimate the typical establishment in your area of activity reports for tax purposes?" Full compliance is defined as the share of firms that report all of their sales. The reference to a "typical establishment in your area of activity" is meant to encourage firms to truthfully report either a reference group's behavior or the firm's own behavior. While we cannot infer whose behavior the firm is precisely referring to, we only require that the firm's reported compliance rate corresponds to its own ISIC3 industry, a plausible assumption, and weaker than assuming that the answer corresponds to the firm's own compliance rate. We also construct the effective tax rate, defined as the product of the compliance rate and the statutory tax rate, where the tax rate is the sum of the corporate income tax rate and the general sales tax rate (or VAT). [14]

In robustness tests we use informal payments, defined with the question: "It is said that establishments like this one are sometimes required to make gifts or informal payments to public officials to get things done with regard to customs, taxes, licenses, regulations. On average, what percent of total annual sales do establishments like this one pay in informal payments or gifts to public officials for this purpose?" This question provides a direct measure of the informal tax rate, since the informal payments are expressed as a percentage of sales.

Since we study outcomes at the industry level, we report summary statistics at the country-year-industry level, where the industry corresponds to the 3-digit ISIC.
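The cell-level outcome variables described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and toy numbers are hypothetical, and averaging the number of inspections over all firms in the cell (inspected or not) is an assumption, since the text does not say how the intensive margin is aggregated.

```python
def inspection_measures(n_inspections):
    """Extensive and intensive margins for one ISIC3-country-year cell.

    `n_inspections` lists each surveyed firm's reported number of tax
    inspections in the past 12 months (0 if none). The intensive margin
    is averaged over all firms in the cell, which is an assumption.
    """
    n = len(n_inspections)
    share_inspected = sum(1 for k in n_inspections if k > 0) / n
    mean_inspections = sum(n_inspections) / n
    return share_inspected, mean_inspections

def compliance_measures(pct_sales_reported):
    """Average compliance (in percent) and share of firms reporting 100% of sales."""
    n = len(pct_sales_reported)
    avg_compliance = sum(pct_sales_reported) / n
    full_compliance = sum(1 for p in pct_sales_reported if p == 100) / n
    return avg_compliance, full_compliance

def effective_tax_rate(compliance_rate, corporate_rate, sales_rate):
    """Compliance rate times the sum of the two statutory rates (all as fractions)."""
    return compliance_rate * (corporate_rate + sales_rate)

# Toy cell with four surveyed firms; all numbers are invented.
share, mean_n = inspection_measures([0, 1, 3, 0])
avg, full = compliance_measures([100, 80, 100, 60])
etr = effective_tax_rate(avg / 100, corporate_rate=0.30, sales_rate=0.15)
```

In this toy cell half of the firms are inspected, average compliance is 85%, and with statutory rates of 30% and 15% the effective tax rate is 0.85 × 0.45, roughly 38%.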
The 272 country-year surveys cover 140 countries, with on average 50 ISIC3 industries represented and a median of 52. The average industry cell contains 10 surveyed firms. Table 1 displays the number of observations, mean, standard deviation, and quartiles for each of the variables described above. The average number of tax inspections is just under 2, and 62% of firms receive at least one tax inspection visit in the year. The average compliance rate with taxes is 81%, and 57% of firms report full compliance. The average informal payment corresponds to 1.6% of firms' sales, and 26% of firms make such informal payments. Since the survey over-samples larger firms, it is not surprising to see that 22% of firms are exporters. Since the sampling frame is defined at the two-digit ISIC level, manufacturing firms, which occupy a disproportionate share of the ISIC2 categories, represent 58% of the sample of 3-digit industries. Finally, the average ISIC3 industry has 69 workers in the WBES, while the average industry has 53 workers in the 2001 US Census.

Two points worth highlighting concern sample size and survey weights. First, while the core tax inspection variable is always available, data on tax compliance are only available for earlier surveys. [15] Second, when possible we apply survey weights. However, weights are missing from some early surveys. The core tax inspection results are drawn from the sample with survey weights, but to preserve sample size, we report results on tax compliance without dropping observations with missing weights. We show in the appendix that results remain unchanged in the smaller sample with full survey weights.

[14] We collected the statutory sales and corporate tax rates in the relevant year of the WBES country-year cell using the KPMG worldwide tax summaries.
[15] The question on formal tax compliance was dropped from the harmonized survey after 2007. In surveys after 2007 administered in Angola, Botswana, Congo (Dem. Rep.), Ethiopia, Iraq, Mali and Rwanda, we extracted the tax compliance question from the non-harmonized raw data.

3 Identification and Econometric Specifications

3.1 Identification and First-Stage

Our objective is to estimate how firm size impacts tax inspection:

$$\text{Tax inspection}_{ict} = \alpha + \beta\,\text{Size}_{ict} + \gamma_{ct} + \varepsilon_{ict} \tag{1}$$
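To make the role of the country-year fixed effects $\gamma_{ct}$ in equation (1) concrete, the sketch below estimates $\beta$ by the within transformation: both variables are demeaned inside each country-year group before a simple OLS slope is computed. It is an unweighted illustration with a hypothetical function name and invented data, and it is plain OLS rather than the instrumental-variable estimator the paper relies on.

```python
from collections import defaultdict

def within_slope(rows):
    """OLS slope of y on x after demeaning both within each group.

    `rows` is an iterable of (group, x, y), where `group` is a
    (country, year) key, x is industry size and y is tax inspection.
    Demeaning within groups absorbs the fixed effects gamma_ct.
    """
    by_group = defaultdict(list)
    for g, x, y in rows:
        by_group[g].append((x, y))
    sxy = sxx = 0.0
    for obs in by_group.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx

# Invented data: two country-years with different intercepts (0.3 and 0.6)
# but a common slope of 0.02.
rows = [
    (("A", 2010), 1, 0.32), (("A", 2010), 2, 0.34), (("A", 2010), 3, 0.36),
    (("B", 2012), 2, 0.64), (("B", 2012), 4, 0.68), (("B", 2012), 6, 0.72),
]
beta = within_slope(rows)
```

Because the two groups differ only in their intercepts, the within estimator recovers the common slope, whereas pooling the six observations without fixed effects would mix the level difference between the two country-years into the estimate.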

The OLS regression of tax inspection on firm size is likely to suffer from reverse causality and omitted variable bias. In particular, firms might reduce their size to avoid facing more stringent tax inspection. To address this issue we turn to an instrumental variable strategy. A valid instrument predicts firm size and only impacts tax inspection through its effect on firm size. A vast literature finds that industries vary in their optimal scales and that there exists a structural technological demand for labor at the industry level. Our identification strategy follows the intuition of Rajan and Zingales [1998]: if we consider the US as an undistorted market, then US firms achieve their optimal unconstrained size, which depends on the structural scale parameter of their industry and idiosyncratic shocks. This suggests using the average size of firms in industries in the US as an instrument for the average size of firms in the same industry in lower-income countries. The first stage is estimated with the following regression:

\[ \text{Rank size}_{ict} = \alpha_0 + \alpha_1\,\text{Rank size}_{i,us,01} + \gamma_{ct} + \varepsilon_{ict} \tag{2} \]

where $\text{Rank size}_{ict}$ is the average firm-size rank of ISIC3 industry i, in country c, at time t, and $\text{Rank size}_{i,us,01}$ is the rank by firm-size of industry i in the US Census in 2001. $\gamma_{ct}$ are country-year fixed effects. The US industry ranking is drawn relative to the set of industries present in a given country-year, and we weight the regression results by the number of observations in a given ISIC3-country-year cell. [16] The slope coefficient $\alpha_1$ measures the increase in the size-ranking of an industry in its country when moving along the ranking of industry size in the US. Table A1 shows that the results are robust to using the average number of workers per industry rather than the industry ranking based on number of workers. We implement the first stage with a rank-rank specification for two reasons.
First, by using an industry's ranking in terms of average workers, the coefficient β only depends on the joint distribution of average size in the WBES country and average size in the US. Unlike the log-log specification, it does not depend on the marginal distributions of WBES and US industry-size. In other words, in a rank-rank specification β is not impacted by the ratio of US to WBES industry-size variances. The rank-rank specification is thus more stable across subsamples of different development levels with widely varying WBES size-variances. [17] Second, the rank-rank specification appears better able to untangle technological size-differences from non-technological differences. Non-technological drivers such as availability of labor-saving instruments, labor regulations, and legal quality may differ substantially across countries and impact relative firm size, thus driving a wedge between WBES level-differences in size and US level-differences in size. Such non-technological drivers of size will not impact β so long as they do not overturn the ranking of industries in WBES relative to the US.

[16] This allows for ISIC3-country-year cells measured with greater precision to carry more weight in the estimates. Omitting weights does not qualitatively change the results, which remain significant.

Figure 2 displays non-parametrically the first stage relation between industries' size ranks in the WBES countries and the US Census. ISIC3 industries are ranked relative to other industries in the same country-year survey. The WBES and US Census ranks are then grouped into 50 equal-sized (two percentile) bins. Figure 2 plots the mean WBES size-rank percentile within each of the 50 US size-rank bins, together with the best-fit line. We find a steep positive slope and a linear relation between industries' size-ranks in the US Census and their size-ranks in WBES countries. In Figure 3 we show that the rank-rank coefficients are constant across countries' income levels: the US industry-size distribution has the same power to predict average firm size over the full size-distribution in, for example, Ethiopia, Indonesia and Mexico. In robustness analysis in Section 4.4, we show that predictive power does not hinge on using the 2001 Census, and remains almost identical when using either an earlier wave of the Census (1991) or a later wave (2015).
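The invariance property that motivates the rank-rank specification can be illustrated with a small simulation. The sketch below uses made-up log industry sizes (the data-generating process, the variable names and the helper functions `ranks` and `ols_slope` are assumptions for illustration, not the WBES or Census data). It compresses the local size distribution without changing the ordering of industries, and compares how the log-log and rank-rank slopes respond.

```python
import random


def ranks(values):
    """Percentile ranks in (0, 1]; ties are broken by original position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for position, index in enumerate(order, start=1):
        out[index] = position / len(values)
    return out


def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)

# Hypothetical log average firm size for 200 industries in the reference country.
log_us = [rng.gauss(3.0, 1.0) for _ in range(200)]
# Hypothetical local sizes: same technology ordering plus idiosyncratic noise.
log_local = [0.8 * u + rng.gauss(0.0, 0.5) for u in log_us]
# Same ordering of industries, but a compressed size distribution (lower variance).
log_local_compressed = [0.25 * v for v in log_local]

loglog = ols_slope(log_us, log_local)
loglog_compressed = ols_slope(log_us, log_local_compressed)
rankrank = ols_slope(ranks(log_us), ranks(log_local))
rankrank_compressed = ols_slope(ranks(log_us), ranks(log_local_compressed))

print("log-log slopes:  ", loglog, loglog_compressed)
print("rank-rank slopes:", rankrank, rankrank_compressed)
```

In this sketch the log-log slope scales one for one with the compression factor, while the rank-rank slope is unchanged because ranks only depend on the ordering of industries.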
Finally, the first stage remains strong and positive when we restrict our analysis to the manufacturing sector, a common restriction for studies using cross-industry variation.

[17] This is also the main reason why a rank-rank specification is chosen over the log-log specification in recent studies of income mobility (Chetty et al. 2014).

The WBES country size-ranking predicted from the US ranking of industries is an exogenous source of size-variation under the assumption that the size of US firms is orthogonal to tax enforcement. Arguably, for large firms the IRS bases its decision directly on third-party reports and risk scores from economic activity and not indirectly on size. In countries with high fiscal capacity, it is well-documented that non-compliance is concentrated among the self-employed and family firms with few employees (Blumenthal et al. 2001, Kleven et al. 2011). To alleviate concerns, we remove all firms with fewer than 5 employees in the US Census, which also matches the WBES sampling design. In Section 4.4, we show that the results hold when constructing the exogenous size-ranking with British and German firms in the Amadeus dataset, which are subject to stringent information reporting requirements. [18] Finally, in Section 4.3 we show that for the richest countries in the WBES (countries with per capita GDP above $21,000) [19] the estimated coefficient of size on tax inspection is zero. This evidence supports the first stage identifying assumption that in high fiscal capacity countries, firm size is not driven by size-based inspection.

The IV strategy provides a causal estimate of the size-gradient under the first stage validity, discussed above, and the exclusion restriction. The exclusion restriction assumes that the US size-ranking of WBES industries only impacts tax inspection in the WBES country through the WBES size-ranking. If tax inspection depends on employee-size only indirectly (e.g. by proxying for capital), the exclusion restriction holds as long as technological demand for labor does not impact capital other than through labor input. However, this assumption fails if the first-stage coefficient does not capture technological differences in employee-size, but is instead an imperfect proxy for capital input demand.
To try to address this issue we construct two measures of demand for capital input in the US industry distribution: the reliance on external funds (Rajan and Zingales 1998) and the capital to labor ratio (Gordon and Li 2009). We show in Section 4.4 that when controlling for capital intensity, the coefficient on size remains unchanged, while the coefficient on capital intensity is insignificant. This suggests that technological demand for labor does not impact tax inspection indirectly through its interaction with capital.

[18] Amadeus data are collected from annual financial statements filed with the business registry and audited annual reports presented to shareholders. These firms are subject to a broader and deeper set of reporting requirements and hence in this sample tax inspection is unlikely to be driven by crude size-proxies.

[19] The $21,000 cutoff was chosen to correspond to the 90th percentile of income in the WBES such that these countries are comparable to the US in terms of fiscal capacity.

3.2 Empirical Specifications

The reduced-form size gradient is estimated by directly regressing the WBES industries' tax outcomes on the US industries' size rank:

\[ \text{Tax outcome}_{ict} = \delta_0 + \delta_1\,\text{Rank size}_{i,us,01} + \gamma_{ct} + \varepsilon_{ict} \tag{3} \]

where $\text{Tax outcome}_{ict}$ is a tax outcome of industry i in country c at time t (e.g. likelihood of tax inspection over the past 12 months), $\text{Rank size}_{i,us,01}$ is the size-ranking of industry i in the US, and $\gamma_{ct}$ are country-year fixed effects. The coefficient $\delta_1$ identifies the reduced-form size gradient. The IV specification is:

\[ \text{Tax outcome}_{ict} = \beta_0 + \beta_1\,\text{Rank size}_{ict} + \gamma_{ct} + \varepsilon_{ict} \tag{4} \]

where $\text{Rank size}_{ict}$ of industry i in country c at time t is instrumented with $\text{Rank size}_{i,us,01}$ of industry i in the US.

In practice, we estimate three different empirical models. The first model adds a set of controls to the above equations. It allows for tax outcomes to differ non-parametrically and interactively in every country and year across a set of industry characteristics. We code all industries as belonging to above or below their country-year median age, share of exporters and share of foreign firms. We then create the matrix $(\text{Characteristics})_{ict}$ containing the full set of interactions across characteristics. The model allows for all factors
to impact tax outcomes in an interactive way, resulting in 1,937 fixed effects:

\[ \text{Tax outcome}_{ict} = \beta_0 + \beta_1\,\text{Rank size}_{ict} + (\text{Characteristics})_{ict} \times (\text{Year})_t \times (\text{Country})_c + \varepsilon_{ict} \tag{5} \]

The second model allows for the tax outcome to differ in every country, year and ISIC2 industry. This implies that variation relies on size differences of 3-digit ISIC industries within a 2-digit ISIC industry. For example, within ISIC category 25 "Manufacture of rubber and plastics products" it exploits variation in firm size between "Manufacture of rubber products" (Category 251) and "Manufacture of plastics products" (Category 252). This model estimates 6,130 fixed effects:

\[ \text{Tax outcome}_{ict} = \beta_0 + \beta_1\,\text{Rank size}_{ict} + (\text{ISIC2})_{ict} \times (\text{Year})_t \times (\text{Country})_c + \varepsilon_{ict} \tag{6} \]

Note that in practice, some ISIC2 sectors define the ISIC3 sectors, which leads to a drop in sample size. Further, the drop in size is larger for less developed countries, where a smaller degree of specialization implies that some ISIC3 sectors are not represented for a given ISIC2. For transparency, we present results from each model side by side.

The third specification exploits the panel structure of the data, which is available for a subset of countries (the average number of surveys per country is 1.9). [20] In the panel model, we add fixed effects at the 3-digit ISIC level to the two previous models. Identification comes from variation in industries' relative size ranks within a country and across time. For example, the panel model mirroring equation 6 is defined as:

\[ \text{Tax outcome}_{ict} = \beta_0 + \beta_1\,\text{Rank size}_{ict} + (\text{ISIC2})_{ict} \times (\text{Year})_t \times (\text{Country})_c + \text{ISIC3}_i + \varepsilon_{ict} \tag{7} \]

[20] Note that in the panel regressions we do not use the instrumental variable strategy.

We report the coefficients $\beta_1$ of size on tax outcomes from all three sets of specifications
for tax inspection and informal payments. Since the question on tax compliance was discontinued after 2007, we have very few repeated country surveys. Therefore we only report the coefficients $\beta_1$ of size on tax compliance for the first two models.

4 Results

4.1 Reduced Form and IV Estimates of Size Gradients

Tax Inspection

In this section we implement the econometric specifications described in Section 3.2. Figure 4 plots, for each of the six largest countries in our sample, [21] an industry's average probability of tax inspection against its size ranking. Each dot represents an ISIC3 industry and the size of the dot is proportional to its share of total employment within the country (based on the WBES). We plot the linear fit of tax inspection on size rank, which slopes up in all six countries. We also note that on average industries with higher total employment have larger average firm sizes; however, there is significant variation, and some industries with a high share of total employment rank in the bottom half of average firm size.

[21] Bangladesh, Brazil, China, India, Indonesia and Mexico.

Table 2 reports the size gradient in tax enforcement along the extensive margin (any tax inspection over the past 12 months, Panel A) and the intensive margin (number of tax inspections over the past 12 months, Panel B). Panel A shows that industry-size is associated with a higher likelihood of tax inspection. Columns 3 and 4 show that the reduced form coefficients are significant in both types of (cross-sectional) fixed effect models. In columns 5 and 6, we estimate the corresponding IV coefficients for each model. The first stage coefficients are strong (F-statistics of 260 and 105, respectively), and the size gradients are precisely estimated. The coefficient from our preferred specification in column 5 implies that a 10 percentile increase in exogenous WBES size-rank is associated with a 2.3
percentage point increase in the likelihood of tax inspection, a 3.8% increase relative to a mean tax inspection probability of 61.9%. Columns 7 and 8 exploit the panel dimension of the data, for the subset of WBES countries with multiple surveys. In these specifications the size gradients are estimated from changes in inspection that follow a switch in the relative ranking of ISIC3 industries across time. The coefficients in the panel regression are positive and significant and of similar magnitude to the IV estimates.

Panel B repeats the above regressions using as an outcome the number of tax inspections in the past 12 months. The coefficients remain significant and positive across specifications. The IV coefficient in column 5 suggests that a 10 percentile point increase in exogenous WBES size-rank is associated with a 0.14 increase in the number of tax inspections, relative to a mean of 1.98 inspections. The panel coefficients (columns 7-8) are very similar to the IV estimates.

To gauge the extent to which the size gradient endogenously determines firm-size, we compare the OLS to the IV coefficient (respectively, comparing columns 1 and 5, and columns 2 and 6). The IV coefficient is larger in three out of four specifications, suggesting that firms might reduce their size to avoid increased tax inspection.

Tax Compliance

We now study the size gradient in tax compliance, which could, in part, be explained by increasing tax inspection with size. While the identification strategy is the same as above, the question on tax compliance only appears in the first waves of the WBES (2003 to 2007), which implies a smaller sample size and insufficient repeated country surveys to run panel regressions. Table 3 shows the size gradient on the extensive and intensive margins of tax compliance, where the extensive margin is the probability of full compliance (Panel A), and the intensive margin is the share of sales reported for tax purposes (Panel B).
The IV coefficient in column 5 points to sizable effects of firm size on tax compliance: a 10 percentile increase in the WBES size-rank is associated with a 5.2 percentage point increase in the likelihood of full compliance, a 10.6% increase relative to a mean of 61.8%
and a 2.2 percentage point increase in the share of sales reported for tax purposes (relative to a mean of 80.9%). The wedge between the OLS and the IV coefficients suggests that firms depress their reported size in order to reduce tax compliance.

4.2 Non-linearities of Tax Inspection and Compliance with Size

In Section 4.1 we show that tax inspection and tax compliance increase with industry size. Here we study whether this relation is linear or concentrated among specific segments of the size distribution. We present non-parametrically the reduced form relation between tax outcomes and the industry size ranking in the US. We first residualize the industry size-ranking in the US Census, $N_{US}$, with respect to controls and country and year fixed effects as in equation 5. Similarly, we residualize the tax outcome of interest (e.g. tax inspection likelihood) with respect to the controls and fixed effects. We then split the residualized industry size-ranking $N_{US}$ into size deciles, and normalize to zero the median industry's size. Figure 5 shows for each decile of industry size how tax inspection and compliance compare to the level of the 5th decile. Panel A shows that tax inspection is concentrated at the top of the industry size distribution and is flat in the bottom and middle of the distribution. Industries at the top of the size distribution appear 3% more likely to be inspected compared to the median industry. Panel B shows that the tax compliance rate mirrors inspection: an industry at the top of the size distribution reports 2% more sales compared to an industry in the middle. The mirroring pattern of tax inspection and tax compliance suggests that tax compliance is partly driven by size-dependent tax inspection.

4.3 Heterogeneity across Development Levels

Reliance on size-dependent policies for tax enforcement could originate in state capacity: at lower levels of development the tax authority is constrained by a lack of information
on firms and has to resort to imperfect proxies such as size-dependent policies (Best et al. 2015, Bachas and Soto 2016). As a country's fiscal capacity grows (Kleven et al. 2016, Jensen 2016), the tax authority's reliance on size-dependent policies can be weakened. We therefore hypothesize that the size gradient in tax inspection decreases with a country's income level. To test this hypothesis, we estimate the tax inspection size gradient across subsamples of countries at five different levels of development. [22] Figure 6 shows the reduced form and IV coefficients for the likelihood of tax inspection at each income level. [23] The magnitude of the tax inspection size-gradient is decreasing over levels of development, both in the reduced form and IV specifications. [24] While we cannot reject that the lower-middle, middle and higher-middle income groups have the same size-gradient, [25] we can reject that their size-gradient is equal to that of high-income countries. In high-income countries both the reduced form and IV coefficients are centered at zero. The absence of size-dependent policies at high income levels is consistent with the theory that countries with strong fiscal capacity decrease their reliance on production-inefficient tax instruments. It also provides some support to the identifying assumption that tax inspection is orthogonal to size in a high tax capacity environment.

[22] The groups correspond to the World Bank income classification, except for the top income group (above USD 21,000), which corresponds to the 90th percentile of income in the WBES and was chosen such that countries are comparable to the US in terms of fiscal capacity.

[23] The first stage coefficients are constant across income levels and the first-stage F-statistic is well above 10 at each income level.

[24] The coefficients for the lowest income countries (GDP per capita below $1,100) are the largest but are only significantly different from zero at the 10% level. There are two explanations for this. First, this group contains fewer country-year observations than the lower-middle, middle and higher-middle income groups. Second, we weight ISIC3 industries by their number of observations. Surveys in the poorest countries tend to be smaller in terms of number of firms and have fewer ISIC3 industries represented.

[25] These levels of development include the countries often studied in the misallocation literature such as China, India and Mexico.
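The residualize-then-bin procedure behind the non-parametric plots of Section 4.2 can be sketched as follows. This is a minimal illustration on simulated data, assuming that the only controls are country-year cell fixed effects (so that residualizing reduces to subtracting cell means). The helper names `demean_within` and `decile_profile`, the 20 cells of 50 industries, and the assumed data-generating process in which inspection rises only in the top fifth of the size ranking are all hypothetical.

```python
import random
from collections import defaultdict


def demean_within(values, groups):
    """Residualize on group fixed effects by subtracting group means."""
    totals, counts = defaultdict(float), defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [value - totals[group] / counts[group]
            for value, group in zip(values, groups)]


def decile_profile(x_resid, y_resid, n_bins=10, reference_bin=5):
    """Mean residualized outcome per bin of x, relative to the reference bin."""
    order = sorted(range(len(x_resid)), key=lambda i: x_resid[i])
    bins = [[] for _ in range(n_bins)]
    for position, index in enumerate(order):
        bins[position * n_bins // len(order)].append(y_resid[index])
    means = [sum(b) / len(b) for b in bins]
    reference = means[reference_bin - 1]
    return [m - reference for m in means]


rng = random.Random(2)
cells, us_rank, inspected = [], [], []
for cell in range(20):                      # hypothetical country-year cells
    cell_effect = rng.gauss(0.6, 0.1)       # cell-specific inspection level
    for k in range(50):                     # 50 industries per cell
        rank = (k + 1) / 50                 # US size rank within the cell
        cells.append(cell)
        us_rank.append(rank)
        # Assumed pattern: inspection only rises in the top fifth of the ranking.
        inspected.append(cell_effect + 0.5 * max(0.0, rank - 0.8)
                         + rng.gauss(0.0, 0.05))

profile = decile_profile(demean_within(us_rank, cells),
                         demean_within(inspected, cells))
for decile, gap in enumerate(profile, start=1):
    print(decile, round(gap, 3))
```

Each printed value is the gap between a decile's mean residualized outcome and that of the 5th decile, which is zero by construction. With additional controls, the subtraction of cell means would be replaced by the residuals from a regression on those controls and the fixed effects.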

4.4 Robustness

Using different instruments

The core results use the industry-size rank in the 2001 US Census. This section shows that results are robust to the choice of the exogenous size-distribution used as an instrument. Table A1 presents the first stage and IV coefficients on tax inspection, for six different definitions of the instrumental variable. Columns 1-3 report results using each of the last three waves of the US firm Census (1991, 2001, 2015). We obtain the same ranking of WBES size and similar IV coefficients. [26] Since the validity of the instrument rests on the assumption that the firm-size distribution in the US is undistorted by tax enforcement, using size rankings from other high fiscal capacity OECD countries should yield similar results. We test this by constructing the IV with British and German firms in the Amadeus database. In Great Britain the data are derived from audited annual reports presented to shareholders. In Germany, the data come from annual financial statements filed with the business registry. While these firms represent a selected sample compared to the US Census, they are subject to stringent reporting requirements and hence it is less likely that size proxies are used by tax authorities. Column 4 in Table A1 shows that the Amadeus size-ranking predicts a similar first stage, and a large and significant IV coefficient.

Restricting the sample to manufacturing

To exclude the possibility that results are driven by a small set of peculiar ISIC3 industries, we limit the sample to manufacturing, which is over-sampled in the WBES (60% of WBES firms) and where most industries are well represented across countries. Table A2 shows the reduced form and IV results on the intensive and extensive margins of tax inspection. The first stage coefficient is the same as the coefficient in the full sample. The reduced form and IV coefficients are larger than in the full sample.

[26] The first stage is constant across development levels for each census wave (Figures A1 & A2).
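The relation between the first-stage, reduced-form and IV coefficients compared in these tables can be illustrated with a stylized simulation. With a single instrument and a single endogenous regressor, the two-stage least squares estimate equals the ratio of the reduced-form slope to the first-stage slope. The sketch below omits fixed effects and weights, and every number in it (the assumed true gradient of 0.3, the first-stage slope of 0.6, and the unobserved factor `u` that raises size while lowering inspection) is a hypothetical choice, not an estimate from the paper.

```python
import random


def cov(x, y):
    """Sample covariance (population normalization)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


rng = random.Random(1)
n = 5000
true_gradient = 0.3  # assumed causal effect of local size rank on inspection

us_rank, local_size, inspection = [], [], []
for _ in range(n):
    z = rng.random()           # instrument: size rank in the reference country
    u = rng.gauss(0.0, 0.2)    # unobserved factor correlated with local size
    x = 0.6 * z + u            # first stage: local size depends on z and u
    # u lowers inspection directly, so OLS of y on x is biased downward.
    y = true_gradient * x - 0.5 * u + rng.gauss(0.0, 0.1)
    us_rank.append(z)
    local_size.append(x)
    inspection.append(y)

first_stage = cov(us_rank, local_size) / cov(us_rank, us_rank)
reduced_form = cov(us_rank, inspection) / cov(us_rank, us_rank)
iv_estimate = reduced_form / first_stage   # 2SLS with one instrument
ols_estimate = cov(local_size, inspection) / cov(local_size, local_size)

print("first stage: ", first_stage)
print("reduced form:", reduced_form)
print("IV estimate: ", iv_estimate)
print("OLS estimate:", ols_estimate)
```

In this simulated setting the IV estimate is close to the assumed gradient, while OLS is attenuated by the unobserved factor, which mirrors the direction of the OLS-IV wedge discussed in Section 4.1.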

Is worker-size proxying for capital intensity?

A potential limitation to our identification strategy is that firm size, as measured by number of workers, could proxy for capital intensity. If this were the case, we would not be capturing the direct impact of industry-size but rather the impact of more capital on tax inspection. To mitigate this concern, we construct an industry capital intensity ranking, using (1) the measure of reliance on external funds as in Rajan and Zingales [1998] and (2) the capital to labor ratio as in Gordon and Li [2009]. We then regress tax inspection jointly on the industry-size ranking and the capital ranking measures (Table A3). The inclusion of controls for capital intensity does not impact the results, and the coefficients on capital rankings are very small in magnitude and insignificant. [27]

Substitution between formal and informal taxes along the size distribution

Table A4 shows the size gradient on the likelihood of making an informal payment (Panel A) and on the intensive margin, informal tax payments as a share of sales (Panel B). We find no effect on the extensive margin but do find an effect on the intensive margin. The IV coefficient in column (5) suggests that a 10 percentile increase in exogenous WBES size-rank is associated with a 0.11 percentage point decrease in the informal tax rate, a 7% decrease relative to a base of 1.58% of sales for informal payments. [28] The coefficients on tax compliance and informal payments have opposite signs, which suggests substitution between formal and informal taxation. Full substitution could mitigate the importance of size-dependent tax enforcement. To test for substitution, we need both the informal and formal effective tax rates. Since firms report informal payments as a share of sales, we already have the effective informal tax rate.
we construct the ef- 27 As an additional robustness test, we find that in the reduced form the US-size ranking is associated with outcomes which are predicted by theory to vary with labor demand (Lucas 1969; Kremer 1993): we find an increase in sales growth, labor cost per permanent employee, perceived constraints due to labor regulation, likelihood of having quality certification and likelihood of being part of a larger firm. 28 The negative slope shows that informal payments are regressive across firms. This finding complements Olken and Singhal [2011], who find that informal tax payments are regressive across households. 21