Wrong Skewness and Finite Sample Correction in Parametric Stochastic Frontier Models

Qu Feng (Nanyang Technological University), William C. Horrace (Syracuse University), Guiying Laura Wu (Nanyang Technological University)

January 2015

Abstract: In parametric stochastic frontier models, the composed error is specified as the sum of a two-sided noise component and a one-sided inefficiency component, which is usually assumed to be half-normal, implying that the error distribution is skewed in one direction. In practice, however, estimation residuals may display skewness in the wrong direction. Model re-specification or pulling a new sample is often prescribed. Since wrong skewness is considered a finite sample problem, this paper proposes a finite sample adjustment to existing estimators to obtain the desired direction of residual skewness. This provides another empirical approach to dealing with the so-called wrong skewness problem.

JEL Classifications: C13, C23, D24

Keywords: Stochastic frontier model, skewness, MLE, constrained estimators, BIC

The authors would like to thank the editor and three anonymous referees for their constructive comments and suggestions. We also thank Bill Greene for providing the airlines dataset. The comments of Peter Schmidt, Robin Sickles, Daniel Henderson and the participants of the 2011 Conference in Honor of Peter Schmidt, Houston TX are appreciated.

Qu Feng: qfeng@ntu.edu.sg, Tel: +65 6592 1543. Division of Economics, School of Humanities and Social Sciences, Nanyang Technological University, 14 Nanyang Drive, Singapore 637332.
William C. Horrace: whorrace@maxwell.syr.edu, Tel: 315-443-9061. Center for Policy Research, 426 Eggers Hall, Syracuse University, Syracuse, NY 13244-1020.
Guiying Laura Wu: guiying.wu@ntu.edu.sg, Tel: +65 6592 1553. Division of Economics, School of Humanities and Social Sciences, Nanyang Technological University, 14 Nanyang Drive, Singapore 637332.

1 Introduction

In parametric stochastic frontier models, the error term is composed as the sum of a two-sided noise component and a one-sided inefficiency component. For cross-sectional models, the noise distribution is assumed normal, while the inefficiency distribution is usually assumed to be half-normal (Aigner, Lovell and Schmidt, 1977), exponential (Meeusen and van den Broeck, 1977; Aigner, Lovell and Schmidt, 1977), or truncated normal (Stevenson, 1980). It is sometimes gamma (Stevenson, 1980; Greene, 1980). For surveys, see Greene (2007) and Kumbhakar and Lovell (2000).

In the widely used normal-half normal specification of the stochastic frontier production function model, the skewness of the composed error is negative,[1] and parameters can be estimated by maximum likelihood estimation (MLE) or corrected ordinary least squares (COLS).[2] Waldman (1982) shows that when the skewness of the ordinary least squares (OLS) residuals is positive, OLS is a local maximum of the likelihood, and estimated inefficiency is zero in the sample. This "wrong skewness" phenomenon is widely documented in the literature and is often regarded as an estimation failure.[3] When it occurs, researchers are advised to either obtain a new sample or respecify the model (Li, 1996; Carree, 2002; Almanidis and Sickles, 2011; Almanidis, Qian and Sickles, 2014; Hafner, Manner and Simar, 2013).

[1] The skewness of the composed error is positive in the stochastic frontier cost function model.
[2] We use the terminology COLS following Olson, Schmidt and Waldman (1980). COLS is also called MOLS; see Greene (2007). Greene (2007, p.131) claims, "In this instance, the OLS results are the MLEs, and consequently, one must estimate the one-sided terms as 0."
[3] For example, estimating the variance parameters in COLS is invalid in this case. However, as emphasized by Greene (2007, note 9), this problem does not carry over to other model specifications. Several sources can lead to wrong skewness.

Simar and Wilson (2010) argue that "wrong skewness" is not an estimation or modelling failure, but a finite sample problem that is most likely to occur when the signal-to-noise ratio (the variance ratio of the inefficiency component to the composite error) is small. That is, wrong skewness may not be an indication that the model is wrong or that inefficiency does not exist in the population. They propose a bootstrap method (called "bagging") to construct confidence intervals for model parameters and expectations of the inefficiency process which have higher coverage than traditional intervals, regardless of residual skewness direction. The sample under study can still be used to infer the model parameters.

We follow Simar and Wilson's (2010) view that wrong skewness is a consequence of a small

signal-to-noise ratio in finite samples. However, instead of the bagging approach of Simar and Wilson (2010), this paper provides a finite sample adjustment to existing estimators in the presence of wrong skewness. That is, we impose a negative residual skewness constraint in the MLE (or COLS) algorithm. A natural candidate for this constraint is the upper bound of the population skewness, which is a monotonic function of the positive lower bound of the signal-to-noise ratio in the half-normal model. However, the constraint is non-linear in the parameters of interest, complicating computation of the optimum. Therefore, a linear approximation of the constraint is proposed. Additionally, a model selection approach is proposed to determine the lower bound of the signal-to-noise ratio used in the constraint. A shortcoming of the approach is that in finite samples the linear approximation may not be accurate enough to guarantee a negative sign of residual skewness. In this case, additional finite sample adjustment is required. Monte Carlo experiments suggest that our correction becomes more reliable when the true signal-to-noise ratio is sizable.

The proposed finite sample adjustment approach provides a point estimate with a correct sign of residual skewness that can be used in applied research. Since wrong skewness can occur fairly regularly (even when inefficiency may exist in the population under study), the finite sample adjustment is attractive, particularly in cases where the inefficiency distribution is half-normal. It is worthwhile to note that the proposed adjustment is only needed in finite samples, for as the sample size increases wrong skewness is less likely to be an issue when the signal-to-noise ratio is sizable.

The rest of this paper is organized as follows. The next section discusses the wrong skewness issue in the literature. In Section 3, we propose a finite sample correction approach. To simplify computation of the proposed constrained estimation, a linearized version of the constraint is used, so that constrained MLE (or COLS) can be easily implemented in most software packages. The constrained estimators are discussed in Section 4. In Section 5, Monte Carlo experiments are conducted to study the properties of constrained COLS. An empirical example is used to illustrate the proposed approach in Section 6. The last section concludes.
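How regularly wrong skewness occurs in finite samples is easy to illustrate by simulation. The following Python sketch (our own illustration with made-up coefficient values, not code from the paper) draws samples from a normal-half normal frontier and counts how often the OLS residuals are skewed in the wrong (positive) direction:

```python
import numpy as np

rng = np.random.default_rng(42)

def wrong_skew_freq(N, lam, reps=1000):
    """Share of simulated samples whose OLS residuals have a positive
    (wrong-signed) third central moment."""
    sv = 1.0
    su = lam * sv                      # lam = sigma_u / sigma_v
    wrong = 0
    for _ in range(reps):
        x = rng.normal(size=N)
        eps = sv * rng.normal(size=N) - su * np.abs(rng.normal(size=N))
        y = 1.0 + 0.5 * x + eps        # illustrative frontier coefficients
        X = np.column_stack([np.ones(N), x])
        e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        wrong += np.mean(e**3) > 0
    return wrong / reps

# Even with a sizable signal-to-noise ratio (lam = 1) and N = 100,
# a non-negligible share of samples displays wrong skewness.
print(wrong_skew_freq(N=100, lam=1.0))
```

With the inefficiency term absent (lam = 0) the share hovers around 50%, while for lam = 1 it is well above zero, consistent with the finite sample view described above.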

2 Wrong Skewness Issue

A stochastic production frontier (SPF) model for a cross-sectional sample of size $N$ is:
$$y_i = x_i'\beta + \varepsilon_i, \quad i = 1, \ldots, N, \qquad (1)$$
with composed error $\varepsilon_i = v_i - u_i$. The disturbance $v_i$ is assumed iid $N(0, \sigma_v^2)$. Inefficiency of firm $i$ is characterized by $u_i \ge 0$. In the SPF literature, $u_i$ is usually assumed to be half-normal, iid $|N(0, \sigma_u^2)|$ (Aigner, Lovell and Schmidt, 1977; Wang and Schmidt, 2009), and independent of $v_i$, with variance $Var(u_i) = \frac{\pi-2}{\pi}\sigma_u^2$. The first component of the $p \times 1$ vector $x_i$ is 1, so the intercept term is contained in the $p \times 1$ slope parameter vector $\beta$. As in Aigner, Lovell and Schmidt (1977) and Simar and Wilson (2010), let $\sigma^2 = \sigma_u^2 + \sigma_v^2$ and $\lambda = \sigma_u/\sigma_v$. The parameters to be estimated are $\theta = (\beta, \lambda, \sigma^2)$.

There are two primary estimators suggested in the literature: the maximum likelihood estimator and corrected least squares (Aigner, Lovell and Schmidt, 1977; Olson, Schmidt and Waldman, 1980). Under the normal-half normal specification, the MLE of $(\beta, \lambda, \sigma^2)$ is the set of parameter values maximizing the likelihood function:
$$\ln L(\beta, \lambda, \sigma^2 \mid (y_i, x_i), i = 1, \ldots, N) = -\frac{N}{2}\ln\frac{\pi}{2} - \frac{N}{2}\ln\sigma^2 + \sum_{i=1}^{N}\ln\Phi\!\left(-\frac{(y_i - x_i'\beta)\lambda}{\sigma}\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i - x_i'\beta)^2, \qquad (2)$$
where $\Phi(\cdot)$ is the standard normal cumulative distribution function.

The COLS estimate of $\beta$ is simply the least squares slope estimate in the regression of $y_i$ on $x_i$. However, the mean of $\varepsilon_i = v_i - u_i$ is negative due to the term $u_i$, so the COLS estimate needs to be adjusted by adding the bias, $\sigma_u\sqrt{2/\pi}$, back into the intercept estimator. The bias can be consistently estimated using the variance estimates:
$$\hat\sigma_u = \left[\sqrt{\frac{\pi}{2}}\,\frac{\pi}{\pi - 4}\,\hat\mu_3\right]^{1/3}, \qquad \hat\sigma_v^2 = \hat\mu_2 - \frac{\pi - 2}{\pi}\hat\sigma_u^2, \qquad (3)$$
where $\hat\mu_2$ and $\hat\mu_3$ are the second and third sample central moments of the least squares residuals.

Both MLE and COLS are consistent. The Monte Carlo experiments in Olson, Schmidt and Waldman (1980) show that there is little difference between MLE and COLS for the slope coefficients

in finite samples. For the intercept and variance parameters, however, MLE and COLS differ. In addition to MLE and COLS, Olson, Schmidt and Waldman (1980) also consider a third consistent estimator, the two-step Newton-Raphson estimator, which has different finite sample properties than MLE and COLS.

Waldman (1982) discovers an important property of the MLE: for the likelihood function (2) above, the point $(b, 0, s^2)$ is a stationary point, where $b$ and $s^2$ are the OLS estimates of $\beta$ and $\sigma^2$. Intuitively, when $\lambda = 0$, the term $u_i$ disappears, so the likelihood function of the SPF model (2) boils down to that of a linear model with $u_i = 0$. A salient result in Waldman (1982) is that when the skewness of the OLS residuals is positive, i.e., $\hat\mu_3 > 0$, then $(b, 0, s^2)$ is a local maximum in the parameter space of the likelihood function.[4] This is the so-called "wrong skewness issue" in the literature, because $\mu_3 < 0$ in the normal-half normal model. Olson, Schmidt and Waldman (1980) refer to this phenomenon as "Type I failure" since the COLS estimator defined in (3) does not exist when $\hat\mu_3 > 0$.

The Monte Carlo studies in Simar and Wilson (2010) show that the wrong skewness issue is not rare, even when $u_i$ is relatively sizable. For example, the frequency of wrong skewness could be 30% for a sample of size 100 when $\lambda = \sigma_u/\sigma_v = 1$. Wrong skewness casts doubt on the specification of the SPF model (Greene, 2007). Moreover, it invalidates the calculation of standard errors of parameter estimates (Simar and Wilson, 2010). Greene (2007) considers OLS residual skewness a useful diagnostic tool for the normal-half normal model. Wrong skewness suggests there is little evidence of inefficiency in the sample, implying that firms in the sample are "super efficient". Thus, $\lambda$ and $\sigma_u^2$ are assumed to be zero, and the stochastic frontier model reduces to a production function without the inefficiency term.[5]

Another interpretation of the wrong skewness issue is that the normal-half normal model is not the correct specification. Other specifications may well reveal the presence of inefficiency and reconcile the distribution of one-sided inefficiency with the data. The binomial distribution considered by

[4] Waldman (1982, p.278) also suggests that $(b, 0, s^2)$ may be a global maximum. There are two roots in this normal-half normal model: OLS $(b, 0, s^2)$ and one at the MLE with positive $\lambda$. When the residual skewness is positive, the first is superior to the second (Greene, 2007, note 8).
[5] Kumbhakar, Parmeter and Tsionas (2013) propose a stochastic frontier model to accommodate the presence of both efficient and inefficient firms in the sample.

Carree (2002) and the doubly truncated normal distribution proposed by Almanidis and Sickles (2011) and Almanidis, Qian and Sickles (2014) could have either negative or positive skewness. They argue that models with ambiguous skewness may be more appropriate in applied research.

Simar and Wilson (2010) argue that wrong skewness is a finite sample problem, even when the model is correctly specified.[6] They show that a bootstrap aggregating method provides useful information about inefficiency and model parameters, regardless of whether residuals are skewed in the desired direction. We also regard wrong skewness as a consequence of estimation in finite samples when the signal-to-noise ratio $Var(u_i)/Var(\varepsilon_i)$ is small.[7] Since the OLS residuals of a production function regression with $u_i = 0$ display skewness in either direction with probability 50%, a sample drawn from an SPF model with a small signal-to-noise ratio could generate positively skewed residuals with high probability.[8]

3 Finite Sample Correction

As illustrated by Simar and Wilson (2010), wrong skewness may occur even when the signal-to-noise ratio is sizable, so simply setting $\sigma_u^2 = 0$ when the skewness is positive could be a mistake. Instead of the improved interval estimates proposed by Simar and Wilson (2010), this paper proposes a finite sample adjustment to existing estimators in the presence of wrong skewness. For MLE, a constraint of non-positive residual skewness is imposed:
$$\max_{\beta, \lambda, \sigma^2} \ln L(\beta, \lambda, \sigma^2 \mid (y_i, x_i), i = 1, \ldots, N)$$
$$\text{s.t.} \quad \frac{1}{N}\sum_{i=1}^{N}\left[\frac{y_i - \bar y - (x_i - \bar x)'\beta}{\sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(y_j - \bar y - (x_j - \bar x)'\beta\right)^2}}\right]^3 \le 0, \qquad (4)$$
where $\bar y = \frac{1}{N}\sum_{i=1}^{N} y_i$ and $\bar x = \frac{1}{N}\sum_{i=1}^{N} x_i$. Unfortunately, when implementing the maximum likelihood estimation with the inequality constraint defined by (4), there is a practical issue. As

[6] Waldman (1982, p.278) notes that for $\sigma_u^2 > 0$, "as the sample size increases the probability that $\sum e_t^3 > 0$ and hence that $(b, 0, s^2)$ locates a local maximum goes to zero."
[7] Badunenko, Henderson and Kumbhakar (2012) find that the estimation of efficiency scores depends on the estimated ratio of the variation in efficiency to the variation in noise.
[8] As pointed out by Simar and Wilson (2010), this problem could happen in other one-sided specifications. In a previous version of this paper, our Monte Carlo experiments suggest that wrong skewness could also occur with high probability in exponential and binomial SPF models, when the signal-to-noise ratio is small.

pointed out by Waldman (1982), in the case of positive skewness of the residuals, OLS $(b, 0, s^2)$ is a local maximum and the unconstrained MLE is equal to $(b, 0, s^2)$. Since OLS is a local maximum in the parameter space of unconstrained MLE, the constraint (4) is always binding at the maximum, leading to a zero skewness of constrained MLE residuals.[9]

If we regard the sign of residual skewness as an important indicator of model specification, the constrained MLE above seems unsatisfactory. We, therefore, propose a (negative) upper bound of skewness instead of zero in (4). This is relevant for empirical modeling. As in the empirical example below, when there is evidence of technical inefficiency in the data (Greene, 2007), its variance cannot be too small relative to that of the composite error $\varepsilon_i$. Denote the signal-to-noise ratio by $k = Var(u_i)/Var(\varepsilon_i)$, instead of $\lambda = \sigma_u/\sigma_v$.[10] That is, a lower bound on the signal-to-noise ratio is implicitly imposed, $k \ge k_0$.

To develop the relationship between the upper bound of skewness and the lower bound of the signal-to-noise ratio, consider the second and third moments of $\varepsilon_i$. Under the normal-half normal specification, Olson, Schmidt and Waldman (1980) show that
$$Var(\varepsilon_i) = \sigma_v^2 + \frac{\pi - 2}{\pi}\sigma_u^2 \qquad (5)$$
and
$$E[\varepsilon_i - E(\varepsilon_i)]^3 = \sigma_u^3\sqrt{\frac{2}{\pi}}\left[\frac{\pi - 4}{\pi}\right]. \qquad (6)$$
It follows that the skewness of $\varepsilon_i$ is
$$E\left[\left(\frac{\varepsilon_i - E(\varepsilon_i)}{\sqrt{Var(\varepsilon_i)}}\right)^3\right] = \sqrt{\frac{2}{\pi}}\,\frac{\pi - 4}{\pi}\,\frac{\sigma_u^3}{Var(\varepsilon_i)^{3/2}},$$
where $\sigma_u^2$ is replaced with $\frac{\pi}{\pi - 2}Var(u_i)$. Reparameterizing the skewness in terms of the signal-to-noise ratio, we have
$$g(k) = \sqrt{\frac{2}{\pi}}\,\frac{\pi - 4}{\pi}\left(\frac{\pi}{\pi - 2}\right)^{3/2}\left[\frac{Var(u_i)}{Var(\varepsilon_i)}\right]^{3/2} = -\gamma k^{3/2},$$
with a constant $\gamma = \sqrt{\frac{2}{\pi}}\,\frac{4 - \pi}{\pi}\left(\frac{\pi}{\pi - 2}\right)^{3/2} \simeq 0.9953$. Since $\gamma > 0$, $g(k) < 0$ (e.g., $g(0.1) \simeq -0.0315$, $g(0.2) \simeq -0.0890$ and $g(0.3) \simeq -0.1635$) and

[9] This stems from the fact that Waldman (1982) shows that OLS is a local maximum in the parameter space of MLE when the OLS residuals are positively skewed. In fact, the non-positivity constraint will bind globally (when the OLS residuals are positively skewed), if OLS is a global maximum, as the Monte Carlo studies of Olson, Schmidt and Waldman (1980) suggest.
[10] Coelli (1995) also uses this signal-to-noise ratio measure, denoted by $\gamma^*$, in his Monte Carlo experiments.

$g'(k) = -\frac{3}{2}\gamma k^{1/2} < 0$. An important property of $g(k)$ is that it is a monotonically decreasing function of $k$. This implies that any upper bound, say $g_0$, of the population skewness, $g(k) \le g_0$, is equivalent to a lower bound, denoted $k_0$, of the signal-to-noise ratio, $k \ge k_0$, i.e., $g_0 = g(k_0) < 0$. We impose this upper bound on the sample skewness, by replacing 0 in the constraint (4) with the negative upper bound of the population skewness, $g(k_0)$. Consequently, a modified constraint
$$\frac{1}{N}\sum_{i=1}^{N}\left[\frac{y_i - \bar y - (x_i - \bar x)'\beta}{\sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(y_j - \bar y - (x_j - \bar x)'\beta\right)^2}}\right]^3 \le g(k_0)$$
is used in the constrained MLE in the event of wrong skewness of the OLS residuals. Based on Waldman's (1982) argument, the constraint above will also be binding at a maximum in the neighborhood of OLS. The constraint becomes
$$\frac{1}{N}\sum_{i=1}^{N}\left[\frac{y_i - \bar y - (x_i - \bar x)'\beta}{\sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(y_j - \bar y - (x_j - \bar x)'\beta\right)^2}}\right]^3 = g(k_0). \qquad (7)$$
This finite sample adjustment gives a constrained estimator of the parameter vector $(\beta, \lambda, \sigma^2)$. The constrained COLS slope coefficients can be similarly defined. We use constraint (7), but replace the likelihood (2) with the sum of squared residuals as the objective function of a minimization problem. Since COLS reduces to OLS in the presence of wrong skewness and OLS is a local maximum of the likelihood, as a finite sample adjustment to OLS, the constrained COLS slope coefficients are expected to be close to their constrained MLE counterparts.

3.1 Linearizing the constraint

The non-linearity of the constraint (7) in $\beta$ creates computational difficulties in calculating the constrained MLE. To simplify the computation process, a linearized version of the constraint (7) is considered. Given that OLS is a local maximum of the likelihood in the presence of wrong skewness, empiricists normally start by estimating OLS with $u_i = 0$. This is the first step in LIMDEP (Greene, 1995) and FRONTIER (Coelli, 1996). If the skewness of the OLS residuals is positive, then OLS is the optimum and the point of departure for our linearization concept. Since the primary concern is skewness correction, we impose the additional restriction that the MLE residual variance $\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar y - (x_i - \bar x)'\beta\right)^2$ is equal to that of the OLS residuals, $\hat\mu_2$. Thus,

the linearized constraint becomes:
$$\frac{1}{N}\sum_{i=1}^{N}\left[y_i - \bar y - (x_i - \bar x)'\beta\right]^3 = g(k_0)\,(\hat\mu_2)^{3/2}.$$
Denote $f(\beta) = \frac{1}{N}\sum_{i=1}^{N}\left[y_i - \bar y - (x_i - \bar x)'\beta\right]^3$. The first-order Taylor expansion of $f(\beta)$ at the OLS estimate $\hat\beta_{OLS}$ is:
$$f(\beta) \approx f(\hat\beta_{OLS}) + \left[\frac{\partial f(\beta)}{\partial \beta}\Big|_{\hat\beta_{OLS}}\right]'(\beta - \hat\beta_{OLS}),$$
where $\frac{\partial f(\beta)}{\partial \beta}\big|_{\hat\beta_{OLS}}$ is the derivative of $f(\beta)$ with respect to $\beta$ evaluated at $\hat\beta_{OLS}$, and $f(\hat\beta_{OLS})$ is the third central moment of the OLS residuals, i.e., $\hat\mu_3$. Now,
$$\frac{\partial f(\beta)}{\partial \beta} = -\frac{3}{N}\sum_{i=1}^{N}\left[y_i - \bar y - (x_i - \bar x)'\beta\right]^2(x_i - \bar x),$$
and
$$\frac{\partial f(\beta)}{\partial \beta}\Big|_{\hat\beta_{OLS}} = -\frac{3}{N}\sum_{i=1}^{N}e_i^2(x_i - \bar x),$$
where $e_i$ denotes the OLS residual $y_i - x_i'\hat\beta_{OLS}$, with sample mean equal to zero. Hence, an approximation of the constraint (7) is
$$\hat\mu_3 - \frac{3}{N}\sum_{i=1}^{N}e_i^2(x_i - \bar x)'(\beta - \hat\beta_{OLS}) = g(k_0)\,(\hat\mu_2)^{3/2},$$
or
$$\left[\frac{1}{N}\sum_{i=1}^{N}e_i^2(x_i - \bar x)\right]'(\beta - \hat\beta_{OLS}) = \frac{\hat\mu_3}{3} - \frac{g(k_0)}{3}(\hat\mu_2)^{3/2}. \qquad (8)$$
Letting the $N \times 1$ vector $\tilde e$ be the squared OLS residual vector $(e_1^2, \ldots, e_N^2)'$, the constraint above can be written in matrix form as
$$\frac{1}{N}\tilde e' M_0 X(\beta - \hat\beta_{OLS}) = \frac{\hat\mu_3}{3} - \frac{g(k_0)}{3}(\hat\mu_2)^{3/2},$$
where $M_0 = I - \frac{1}{N}\iota\iota'$ and $\iota = (1, \ldots, 1)'$. Thus, the linear constraint above can be written as
$$R\beta = q(k_0) \qquad (9)$$
with $R = \frac{1}{N}\tilde e' M_0 X$ and $q(k_0) = R\hat\beta_{OLS} + \frac{\hat\mu_3}{3} + \frac{\gamma}{3}k_0^{3/2}(\hat\mu_2)^{3/2}$, depending on the value of $k_0$.[11]

[11] It is worth noting that (9) is not a direct linearization of (7). Alternatively, a full linearization of (7) can be similarly obtained by adding to $R$ a term arising from the denominator of the constraint in (7). Monte Carlo simulations suggest that the estimation results are robust to this choice. Details are available upon request.
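Every ingredient of the linearized constraint (9) is computable from a single OLS pass. A minimal Python sketch (function and variable names are ours), assuming the regressor matrix X carries a leading column of ones:

```python
import numpy as np

PI = np.pi
# gamma = sqrt(2/pi) * (4-pi)/pi * (pi/(pi-2))^(3/2), approximately 0.9953
GAMMA = np.sqrt(2 / PI) * (4 - PI) / PI * (PI / (PI - 2)) ** 1.5

def linear_constraint(X, y, k0):
    """Return (R, q) of the linearized skewness constraint R @ beta = q(k0).

    R = (1/N) e~' M0 X, with e~ the squared OLS residuals and M0 the
    demeaning matrix; q(k0) follows the expression under equation (9).
    """
    N = len(y)
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_ols
    m2, m3 = np.mean(e**2), np.mean(e**3)
    Xc = X - X.mean(axis=0)              # M0 @ X without forming M0
    R = (e**2) @ Xc / N
    q = R @ beta_ols + m3 / 3 + (GAMMA / 3) * k0**1.5 * m2**1.5
    return R, q
```

A convenient sanity check: at $k_0 = 0$ the constraint collapses to $R(\beta - \hat\beta_{OLS}) = \hat\mu_3/3$, so $R\hat\beta_{OLS} - q(0) = -\hat\mu_3/3$; and the entry of R for the intercept column is exactly zero, since demeaning wipes it out.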

Therefore, the proposed finite sample correction for the MLE of $(\beta, \lambda, \sigma^2)$, i.e., the constrained MLE, is defined as the solution to maximizing the likelihood (2) subject to the linear constraint (9). The corresponding estimators of $\sigma_u^2$ and $\sigma_v^2$ can be obtained by using the relationships $\sigma^2 = \sigma_u^2 + \sigma_v^2$ and $\lambda = \sigma_u/\sigma_v$. Similarly, the constrained COLS of $\beta$ is defined to minimize the sum of squared residuals subject to (9). As in the unconstrained estimation, the constrained estimators of $\sigma_u^2$ and $\sigma_v^2$ can be obtained by following formula (3).

If $k_0 = 0$, then $g(k_0) = 0$ and the constraint above becomes $R(\beta - \hat\beta_{OLS}) = \hat\mu_3/3$. This implies that the constrained and unconstrained estimators would be similar, since $\hat\mu_3$ is usually very small in the presence of wrong skewness. In the extreme case of $\hat\mu_3 = 0$, the constrained estimator reduces to OLS, which is a local maximum of the likelihood.

Using the linearized constraint (9), the estimates, standard errors and confidence intervals of the constrained MLE and constrained COLS can be easily obtained by using Stata or other existing software.[12] However, since (9) does not necessarily guarantee a negative residual skewness in finite samples, there is a possibility that wrong skewness could still occur after our correction. The Monte Carlo experiments below show that this could be a concern only when the underlying signal-to-noise ratio is very small.

3.2 Choosing the value of $k_0$

The idea of the proposed constrained estimators is to adjust the slope coefficients to obtain a correct sign of residual skewness using the constraint (9), which is a function of $k_0$. It is expected that when the chosen value of $k_0$ is small, a slight adjustment results in the constrained MLE (or constrained COLS), and its value will be close to the unconstrained MLE. Choosing a specific value of $k_0$ is an empirical issue. On the one hand, when there is a priori evidence of inefficiency, the signal-to-noise ratio cannot be too small. On the other hand, as illustrated by the Monte Carlo study in Simar and Wilson (2010), wrong skewness is less likely

[12] In the empirical example below, the command frontier in Stata, which allows for a linear constraint, is employed.

to occur as the signal-to-noise ratio increases.[13] In the spirit of this trade-off we develop model selection criteria to choose $k_0$. The idea is to incorporate a penalty function, so that as $k_0$ increases the penalty decreases. Hence, the fit of the model and the effect of the constraint on the optimum can be balanced.[14]

For the constrained MLE we propose a Bayesian information criterion (BIC) via the likelihood to choose the value of $k_0$:
$$BIC(k_0) = -2\,l_r(k_0) - k_0 \ln N,$$
where $l_r(k_0)$ is the log-likelihood evaluated at the constrained MLE of $(\beta, \lambda, \sigma^2)$, depending on $k_0$. Since OLS $(b, 0, s^2)$ is a local maximum of the log-likelihood function in the presence of positive skewness, with a restriction on $k_0$ the value of $l_r(k_0)$ decreases with $k_0$ in the neighborhood of $(b, 0, s^2)$.[15] Different from the usual BIC, here we use a negative sign in front of the penalty term $k_0 \ln N$, so that $-2\,l_r(k_0)$ and $-k_0 \ln N$ move in opposite directions with $k_0$. An optimal value of $k_0$ is chosen to minimize $BIC(k_0)$:
$$\tilde k_0 = \arg\min_{k_0 \in [0,1)} BIC(k_0).$$
Similarly, for the constrained COLS, a criterion based on the sum of squared residuals is proposed to select the value of $k_0$:
$$C(k_0) = \frac{1}{N}SSR_r(k_0) - k_0\,\hat\sigma_\varepsilon^2\,\frac{\ln N}{N},$$
where $SSR_r(k_0)$ is the sum of squared residuals of OLS with the constraint (9). $C(k_0)$ is a Mallows $C_p$-type criterion, similar to the expression proposed by Bai and Ng (2002) to choose the number of factors in approximate factor models, except that the penalty term takes a negative sign.

[13] Table 1 in Simar and Wilson (2010) provides some guidance. On the one hand, when $\lambda \le 0.1$ for samples with size less than 200, the proportion of wrong skewness is close to 50%, implying that the inefficiency term is hard to distinguish from noise. On the other hand, when $\lambda \ge 1$ ($k \ge 0.267$), the wrong skewness probability decreases dramatically. For example, only 6% of samples display wrong residual skewness for $\lambda = 2$ and $N = 200$. In the Appendix, we have a similar finding both for Simar and Wilson's design and the design in Section 5 of this paper.
[14] We thank one referee for pointing this out to us.
[15] The constraint $k \ge k_0$ is always binding in the neighborhood of OLS. And a restriction on $k$ is equivalent to one on $\lambda$, which is a monotonically increasing function of $k$ in the half-normal model,
$$\lambda = \frac{\sigma_u}{\sigma_v} = \sqrt{\frac{\pi\,Var(u_i)}{\pi - 2}}\Big/\sigma_v = \sqrt{\frac{\pi}{\pi - 2}\,\frac{k}{1 - k}} = \sqrt{\frac{\pi}{(\pi - 2)(1/k - 1)}}.$$
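Both the skewness bound $g(k_0)$ and the implied bound on $\lambda$ are simple closed forms, so the correspondence is easy to tabulate. A small Python sketch (function names are ours):

```python
import numpy as np

PI = np.pi
# gamma = sqrt(2/pi) * (4-pi)/pi * (pi/(pi-2))^(3/2), approximately 0.9953
GAMMA = np.sqrt(2 / PI) * (4 - PI) / PI * (PI / (PI - 2)) ** 1.5

def g(k):
    """Population skewness of eps implied by k = Var(u)/Var(eps)."""
    return -GAMMA * k**1.5

def lam_from_k(k):
    """lambda = sigma_u/sigma_v implied by k in the half-normal model."""
    return np.sqrt(PI / ((PI - 2) * (1 / k - 1)))
```

Since g is monotonically decreasing, a skewness cap $g(k_0)$ and a signal-to-noise floor $k_0$ are interchangeable; for instance, lam_from_k(0.2665) is approximately 1, matching the $\lambda = 1$ benchmark.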

By applying the properties of the usual restricted least squares, it can be shown that SSR r (k 0 ) increases with k 0. (See the appendix.) Hence, the e ect of increasing k 0 on the model t can be balanced by the penalty term, thus an appropriate value of k 0 is chosen to minimize C(k 0 ): ^k 0 = arg min C(k 0): k 0 [0;1) The estimated error variance ^ " provides an appropriate scaling to the penalty term. Here, we use ^ " = 1 N SSR, where SSR is the sum of squared residuals of OLS without constraint. In practice, to nd the value of ~ k 0 (or ^k 0 ) a grid search can be applied to BIC(k 0 ) (or C(k 0 )) starting from a small positive value, e.g., 0:05. Since the measures of the model t in the constrained MLE and COLS, i.e., the objective functions in the penalized least squares and penalized maximum likelihood, are di erent, ~ k 0 is not necessarily equal to ^k 0. However, in the neighborhood of OLS (b; 0; s ) with a small value of, P when the term N h i ln 1 p (y i x 0 i ) in l(; ; ) has small values of partial derivatives in the rst-order conditions, ~ k 0 is expected to be close to ^k 0. 4 Constrained Estimators With the proposed nite sample adjustment above, the sample can still be used to construct a point estimate for inferring population parameters in the presence of wrong skewness. This is similar in spirit to Simar and Wilson (010), who still rely on the MLE estimation results, but provide more accurate interval estimates using improved inference (bagging) methods. As previously mentioned, any negative constraint on sample skewness is binding in the presence of wrong skewness. This result implies that estimated (or k) is implicitly determined by the constraint (9). Consequently, it is biased when the selected value of k 0, the lower bound of k, is not equal to the true value of k. Inconsistency of the proposed constrained estimators might be a concern. However, this concern may be overstated. 
The wrong skewness is a finite sample issue under the true specification. As the sample size increases, wrong skewness is less likely to appear, so the proposed finite sample adjustment becomes unnecessary. Thus, asymptotics are less of a concern here. In addition, given the nature of the finite sample adjustment, the proposed method is

regarded as an adjustment to existing estimators, rather than a new estimator.16 In the next subsection, properties of the constrained estimators are studied. Since the constrained COLS is essentially restricted least squares, which has an analytical solution, we mainly focus on it.

4.1 Constrained COLS

The proposed constrained COLS, denoted by $\hat{\beta}_r$, is a 2-step estimator. In the first step, for a given $k_0$, the constrained COLS $\hat{\beta}_r(k_0)$ is defined as the solution of
$$\min_\beta SSR(\beta) = \min_\beta (Y - X\beta)'(Y - X\beta) \quad s.t. \quad R\beta = q(k_0).$$
In the second step, $k_0$ is selected such that $\hat{k}_0 = \arg\min_{k_0} C(k_0)$, where
$$C(k_0) = \frac{1}{N}(Y - X\hat{\beta}_r(k_0))'(Y - X\hat{\beta}_r(k_0)) - k_0\,\hat{\sigma}_\varepsilon^2\,\frac{\ln N}{N}.$$
The proposed constrained COLS is defined as $\hat{\beta}_r = \hat{\beta}_r(\hat{k}_0)$. This 2-step estimator is equivalent to a 1-step penalized least squares with the linear constraint:
$$\min_{\beta, k_0} \frac{1}{N}(Y - X\beta)'(Y - X\beta) - k_0\,\hat{\sigma}_\varepsilon^2\,\frac{\ln N}{N} \quad s.t. \quad R\beta = q(k_0).$$
This equivalence comes from the fact that in the objective function $k_0$ only appears in the penalty term $k_0\hat{\sigma}_\varepsilon^2\frac{\ln N}{N}$; thus, $\beta$ can be concentrated out for a given $k_0$. For a given $k_0$, $\hat{\beta}_r(k_0)$ is the restricted least squares estimator. By Amemiya (1985) or Greene (2012),
$$\hat{\beta}_r(k_0) = \hat{\beta}_{OLS} - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}[R\hat{\beta}_{OLS} - q(k_0)],$$
and
$$SSR_r(k_0) = SSR + [R\hat{\beta}_{OLS} - q(k_0)]'[R(X'X)^{-1}R']^{-1}[R\hat{\beta}_{OLS} - q(k_0)].$$
Thus, the criterion is
$$C(k_0) = \frac{1}{N}SSR + \frac{1}{N}[R\hat{\beta}_{OLS} - q(k_0)]'[R(X'X)^{-1}R']^{-1}[R\hat{\beta}_{OLS} - q(k_0)] - k_0\,\hat{\sigma}_\varepsilon^2\,\frac{\ln N}{N}.$$
Minimizing $C(k_0)$ defines $\hat{k}_0$. The following proposition proves the existence and uniqueness of $\hat{k}_0$.

16 In this sense, our approach is different from the literature on models with moment conditions, e.g., Moon and Schorfheide (2009).
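The two-step estimator can be sketched in a few lines of code. The snippet below is a minimal illustration rather than the authors' implementation: the constraint function `q_of_k0` is a hypothetical placeholder standing in for the skewness-based constraint (9), and $\hat{k}_0$ is found by grid search over $C(k_0)$.

```python
import numpy as np

def constrained_cols(X, y, R, q_of_k0, k0_grid):
    """Two-step constrained COLS sketch: restricted least squares for each k0,
    then choose k0 by minimizing the Mallows-type criterion C(k0)."""
    N = len(y)
    XtX_inv = np.linalg.inv(X.T @ X)
    b_ols = XtX_inv @ (X.T @ y)
    ssr = float((y - X @ b_ols) @ (y - X @ b_ols))
    sigma2_eps = ssr / N                  # hat sigma_eps^2 = SSR/N
    M = float(R @ XtX_inv @ R)            # R (X'X)^{-1} R', scalar for one constraint

    def beta_r(k0):                       # restricted LS, closed form
        gap = float(R @ b_ols - q_of_k0(k0))
        return b_ols - XtX_inv @ R * (gap / M)

    def C(k0):                            # (1/N) SSR_r(k0) - k0 sigma2 ln(N)/N
        e = y - X @ beta_r(k0)
        return float(e @ e) / N - k0 * sigma2_eps * np.log(N) / N

    k0_hat = min(k0_grid, key=C)
    return beta_r(k0_hat), k0_hat
```

With any placeholder $q(k_0)$ that moves away from $R\hat{\beta}_{OLS}$ as $k_0$ grows, $SSR_r(k_0) = SSR + [R\hat{\beta}_{OLS} - q(k_0)]^2/M$ is increasing in $k_0$, as in Proposition 1(i) below.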

Proposition 1 In the presence of positive skewness of the OLS residuals, i.e., $\hat{\mu}_3^0 > 0$: (i) $\frac{dSSR_r(k_0)}{dk_0} > 0$; (ii) for a reasonable sample size $N$, there exists a solution $\hat{k}_0$ such that $\hat{k}_0$ minimizes $C(k_0)$; (iii) $\frac{d^2 C(k_0)}{dk_0^2} > 0$, implying that $\hat{k}_0$ is the unique solution.

The proof is included in the Appendix. Since $\frac{\ln N}{N} \to 0$ as $N \to \infty$, compared with the first term $\frac{1}{N}SSR_r(k_0)$, the penalty term in $C(k_0)$ can be ignored asymptotically. This implies that $\hat{k}_0 \to 0$ as $N \to \infty$. Hence, when $N$ is large, the proposed constrained COLS approaches the OLS with the constraint $R(\hat{\beta}_{OLS}) = \hat{\mu}_3^0/3$, which is very close to OLS in the presence of wrong skewness. For a given sample, the difference between OLS and the constrained COLS,
$$\hat{\beta}_{OLS} - \hat{\beta}_r = (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}[R\hat{\beta}_{OLS} - q(\hat{k}_0)],$$
depends on $\hat{k}_0$, and
$$\frac{d[\hat{\beta}_{OLS} - \hat{\beta}_r]}{d\hat{k}_0} \propto (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}\,\hat{k}_0^{1/2}(\hat{\sigma}_\varepsilon^2)^{3/2},$$
implying that the magnitude of this difference is positively correlated with the chosen value $\hat{k}_0$.

4.2 Constrained MLE

For a given $k_0$, the constrained MLE $(\hat{\beta}_{CMLE}(k_0), \hat{\sigma}_{CMLE}(k_0), \hat{\lambda}_{CMLE}(k_0))$ depends on $k_0$. Minimizing $BIC(k_0)$ determines the value of $k_0$, i.e., $\tilde{k}_0 = \arg\min_{k_0 \in [0,1)} BIC(k_0)$. Similar to the constrained COLS, $(\hat{\beta}_{CMLE}, \hat{\sigma}_{CMLE}, \hat{\lambda}_{CMLE})$ is defined as $(\hat{\beta}_{CMLE}(\tilde{k}_0), \hat{\sigma}_{CMLE}(\tilde{k}_0), \hat{\lambda}_{CMLE}(\tilde{k}_0))$. It can also be written as a penalized maximum likelihood estimator with a constraint:
$$\min_{\beta, \sigma, \lambda, k_0}\; -2\,l(\beta, \sigma, \lambda) - k_0 \ln N \quad s.t. \quad R\beta = q(k_0),$$
where
$$l(\beta, \sigma, \lambda) = -N\ln\sigma + \sum_{i=1}^N \ln\left[1 - \Phi\left(\frac{(y_i - x_i'\beta)\lambda}{\sigma}\right)\right] - \frac{1}{2\sigma^2}\sum_{i=1}^N (y_i - x_i'\beta)^2$$
is defined in (2). Since there is no analytical solution to the constrained optimization problem above, it is difficult to derive the properties of the constrained MLE.

However, dividing by $N$, $\frac{1}{N}BIC(k_0) = -\frac{2}{N}l_r(k_0) - k_0\frac{\ln N}{N}$; compared with $-\frac{2}{N}l_r(k_0)$, the penalty term $k_0\frac{\ln N}{N}$ can be asymptotically ignored as $N \to \infty$, implying that $\tilde{k}_0$ tends to 0 as $N \to \infty$. Since $\tilde{k}_0$ is small when $N$ is large, the proposed constrained MLE is expected to be close to the MLE. Since the MLE of the slope parameters is very close to OLS, the constrained MLE and the constrained COLS are expected to be close.

We now consider the difference between the constrained MLE and OLS by examining the first-order conditions of (2). Aigner, Lovell and Schmidt (1977) show that
$$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^N (y_i - x_i'\beta)^2 + \frac{\lambda}{2\sigma^3}\sum_{i=1}^N \frac{\phi(\cdot)}{1 - \Phi(\cdot)}(y_i - x_i'\beta) = 0, \qquad (10)$$
$$\frac{\partial \ln L}{\partial \lambda} = -\frac{1}{\sigma}\sum_{i=1}^N \frac{\phi(\cdot)}{1 - \Phi(\cdot)}(y_i - x_i'\beta) = 0, \qquad (11)$$
$$\frac{\partial \ln L}{\partial \beta} = \frac{1}{\sigma^2}\sum_{i=1}^N (y_i - x_i'\beta)x_i + \frac{\lambda}{\sigma}\sum_{i=1}^N \frac{\phi(\cdot)}{1 - \Phi(\cdot)}x_i = 0, \qquad (12)$$
where $\phi(\cdot)$ is the standard normal density function, and $\phi(\cdot)$ and $\Phi(\cdot)$ are evaluated at $(y_i - x_i'\beta)\lambda/\sigma = \varepsilon_i\lambda/\sigma$. Waldman (1982) shows that in the presence of wrong skewness $\lambda = 0$, and OLS is a local maximum of the log-likelihood.

For our constrained MLE, the constraint (7) or (8) involves the value of $k_0$, not $\lambda$ directly. Since $\lambda$ is a monotonically increasing function of $k$, $k \ge k_0$ implies
$$\lambda \ge \sqrt{\frac{1}{(1 - 2/\pi)(1/k_0 - 1)}}. \qquad (13)$$
To show how restricting $\lambda$ affects the estimation results and how the constrained MLE of $\beta$ differs from the OLS, consider equation (12).17 Taking the first-order Taylor expansion of $\phi(\cdot)/[1 - \Phi(\cdot)]$ at $\lambda = 0$ gives
$$\frac{\phi(\varepsilon_i\lambda/\sigma)}{1 - \Phi(\varepsilon_i\lambda/\sigma)} \approx \sqrt{\frac{2}{\pi}} + \frac{2}{\pi}\cdot\frac{\varepsilon_i\lambda}{\sigma}.$$
Thus, (12) becomes
$$0 \approx \frac{1}{\sigma^2}\sum_{i=1}^N (y_i - x_i'\beta)x_i + \frac{\lambda}{\sigma}\sum_{i=1}^N \left[\sqrt{\frac{2}{\pi}} + \frac{2}{\pi}\cdot\frac{\varepsilon_i\lambda}{\sigma}\right]x_i = \left(1 + \frac{2\lambda^2}{\pi}\right)\frac{1}{\sigma^2}\sum_{i=1}^N (y_i - x_i'\beta)x_i + \sqrt{\frac{2}{\pi}}\,\frac{\lambda}{\sigma}\sum_{i=1}^N x_i.$$

17 Strictly speaking, restricting $\lambda$ as a constraint yields a result different from constraint (7). Though the population skewness is equal to $g(k_0)$ and is thus a monotonic function of $\lambda$, the sample skewness is not a function of $\lambda$. However, the insights derived here on the effect of the chosen value of $k_0$ on estimation still apply.

That is,
$$\sum_{i=1}^N (y_i - x_i'\beta)x_i + \sqrt{\frac{2}{\pi}}\,\frac{\lambda\sigma}{1 + 2\lambda^2/\pi}\sum_{i=1}^N x_i = 0. \qquad (14)$$
In matrix form, equation (14) above can be written as
$$X'y - X'X\beta + \varphi(\lambda)\frac{\sigma}{\sqrt{2\pi}}X'\iota = 0, \qquad (15)$$
where $\varphi(\lambda) = 2\lambda/(1 + 2\lambda^2/\pi)$ and $\iota$ is the $N \times 1$ vector of ones. Equivalently,
$$\hat{\beta}_{CMLE} \approx (X'X)^{-1}X'y + \varphi(\lambda)\frac{\sigma}{\sqrt{2\pi}}(X'X)^{-1}X'\iota. \qquad (16)$$
In the presence of wrong skewness, OLS (i.e., $\lambda = \varphi = 0$) is a local maximum of the log-likelihood. Under the constraint (13), the estimator of $\beta$ is adjusted by the second term in equation (16). Given that $\varphi(\lambda)$ is monotonically increasing in $\lambda$ over the range $[0, \sqrt{\pi/2} = 1.2533]$, the difference between the constrained MLE and the OLS of $\beta$ is positively related to the value of $\lambda$.18 The larger the imposed $\lambda$ (or $k_0$), the bigger the difference between the OLS and the constrained MLE. Furthermore, in a given sample this difference depends not only on $\varphi(\lambda)$ but also on the sample values of the regressors, jointly determined by the first-order equations. We conjecture that constraint (9) with a small value of $k_0$ slightly adjusts the estimators of $\beta$ and $\sigma_v$, but has a much larger effect on the estimated $\sigma_u$ and $\lambda$. This point is confirmed in the Monte Carlo experiments and the empirical example below.

5 Monte Carlo Experiments

In this section, Monte Carlo experiments are conducted to study how the proposed constraints affect the estimates, and how the chosen value of $\hat{k}_0$, the imposed lower bound of $k$, is affected by the sample size. Since the constrained COLS has an analytical solution and established results in the previous section, it is the focus of this section. We consider the specification
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i, \quad \varepsilon_i = -u_i + v_i, \quad i = 1, \ldots, N,$$

18 For a small value of $k_0$, e.g., $k_0 \in [0.1, 0.3]$, $\lambda$ lies in the interval $[0.5530, 1.0860]$.

where $\beta_0 = 1$, $\beta_1 = 0.8$, $\beta_2 = 0.2$, $x_{1i} \sim \log(|N(4, 100)|)$, $x_{2i} \sim \log(|N(2, 60)|)$, $v_i \sim N(0, \sigma_v^2)$ and $u_i \sim |N(0, \sigma_u^2)|$. $k = Var(u_i)/Var(\varepsilon_i)$ is the signal-to-noise ratio.19 $\sigma_u^2 = Var(u_i)/(1 - 2/\pi) = k\,Var(\varepsilon_i)/(1 - 2/\pi)$ and $\sigma_v^2 = (1 - k)Var(\varepsilon_i)$. We set $Var(\varepsilon_i) = \sigma_v^2 + Var(u_i) = 0.06$, so that the variance of $x_{1i}$ and $Var(\varepsilon_i)$ are comparable to those in the empirical example below. Since the focus is the proposed correction for samples with wrong residual skewness, we drop the samples with correct skewness; the number of replications is 4000 after this dropping. We conduct experiments with $k = 0.1$, 0.2, 0.3, 0.5, 0.7 and $N = 50$, 100, 200.20

Table 1 reports the main results of the simulations. Column (2) gives the average value of $\hat{k}_0$. To obtain $\hat{k}_0$ for each sample, a grid search is conducted to minimize $C(k_0)$ on the interval [0.05, 0.9]. As expected, the average value of $\hat{k}_0$ decreases with $N$. Column (3) shows that there is still a possibility of wrong skewness after the proposed finite sample correction. This is the cost of the linear approximation. The frequency depends on the signal-to-noise ratio and the sample size, varying from 16.3% to 39.9%. For example, for $k = 0.5$ and $N = 100$, our finite sample correction approach could fail with a probability of 27.1%.21 For the parameter estimators, columns (4)-(7) indicate that with the correction of $\sqrt{2/\pi}\,\hat{\sigma}_u$, the constrained COLS of $\beta_0$ is less biased than the OLS, but with a much bigger root mean squared error (RMSE). When $k$ and $N$ increase, however, the RMSE of the constrained COLS is comparable to that of OLS. (The bias and RMSE of the OLS of $\beta_0$ (and $\beta_1$) are included in columns (5), (7) (and (9), (11)) for comparison.) In addition, compared with OLS, the constrained COLS of $\beta_1$ is slightly upward biased with a bigger RMSE, and the bias and RMSE decrease with $k$ and $N$. In the presence of wrong skewness, $\sigma_u$ is usually considered to be zero.
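The data generating process of this design can be sketched as follows. This is an illustration rather than the authors' code, and it assumes the $N(\mu, \sigma^2)$ convention, i.e., that 100 and 60 are variances:

```python
import numpy as np
from math import pi

def simulate(N, k, var_eps=0.06, seed=0):
    """One sample from the Section 5 design: y = 1 + 0.8*x1 + 0.2*x2 - u + v,
    with signal-to-noise ratio k = Var(u)/Var(eps).
    Assumes N(mu, sigma^2) notation, so 100 and 60 are variances."""
    rng = np.random.default_rng(seed)
    sigma_u = np.sqrt(k * var_eps / (1.0 - 2.0 / pi))  # Var(u) = (1-2/pi) sigma_u^2
    sigma_v = np.sqrt((1.0 - k) * var_eps)
    x1 = np.log(np.abs(rng.normal(4.0, np.sqrt(100.0), N)))
    x2 = np.log(np.abs(rng.normal(2.0, np.sqrt(60.0), N)))
    u = np.abs(rng.normal(0.0, sigma_u, N))            # half-normal inefficiency
    v = rng.normal(0.0, sigma_v, N)                    # symmetric noise
    y = 1.0 + 0.8 * x1 + 0.2 * x2 - u + v
    return y, np.column_stack([np.ones(N), x1, x2])
```

By construction, $(1 - 2/\pi)\sigma_u^2 = k\,Var(\varepsilon_i)$, so the variance decomposition of the design holds exactly.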
Using our correction, column (12) shows that the estimated $\sigma_u$ tends to be overestimated for small values of $k$ and underestimated for big values of $k$. Compared with $\sigma_u$, $\sigma_v$ can be estimated more accurately in terms of bias, as indicated in column (13). Column (14) of Table 1 shows that $k$ is generally underestimated. This is due to the fact that

19 Coelli (1995) also uses this signal-to-noise ratio measure, denoted by $\gamma^*$, in his Monte Carlo experiments.
20 In the Appendix, we report the frequency of wrong residual skewness for these combinations, showing that it is very unlikely to have wrong skewness when $k \ge 0.7$ and $N \ge 100$.
21 For $k = 0.5$ and $N = 100$, the probability that wrong skewness occurs is 29%. See the Appendix.

a relatively small value of $\hat{k}_0$ is often chosen when $N$ is large, and the estimated $k$ is implicitly determined by the $\hat{k}_0$ suggested by the linear constraint (9). Finally, column (15) reports the bias of the mean technical efficiency $E[\exp(-u_i)] = 2\exp(\sigma_u^2/2)[1 - \Phi(\sigma_u)]$. In the presence of wrong skewness, the traditional suggestion is that the estimated $\sigma_u$ is 0, implying that the estimated mean technical efficiency is 1. This practice obviously overestimates the true mean technical efficiency. Column (15) shows that the mean technical efficiency estimator using the proposed correction could be unbiased for a sizable value of $k$, say 0.2 under the current design. It is downward biased for small values of $k$, and upward biased for $k > 0.2$.

6 Empirical Example: the US Airline Industry

In this section, an airlines example is used to illustrate our approach. This is an unbalanced panel data set with 256 observations. See Greene (2007) for detailed information on this data set. In this example, the dependent variable is the logarithm of output, and the independent variables include the logarithms of fuel, materials, equipment, labor and property. Here, the unbalanced panel is treated as a cross section of 256 firms to ensure that the wrong skewness issue arises.23 Column (2) of Table 2 presents the OLS estimates along with standard errors (column 3). Except for the constant term, the slope coefficients are consistent with Table 2.11 in Greene (2007). The OLS residual skewness (0.0167) is in the wrong direction for the estimated normal-half normal model. Thus, the estimates of $\lambda$ and $\sigma_u$ are set to zero and the firms are considered to be "super efficient".22 However, Greene (2007, footnote 84) does suggest that there is evidence of technical inefficiency in the data. The second root of the likelihood with positive $\lambda$ is reported in the second section of Table 2. This MLE yields a small positive residual skewness 0.0093.24
22 Usually, in the presence of "wrong" skewness, researchers are advised to obtain a new sample or respecify the model. But this is not a big concern here since $k$ is not a parameter of interest in this model.
23 With the exception of perhaps Green and Mayes (1991), Mester (1997) and Parmeter and Racine (2012), there appear to be very few empirical studies with wrong skewness in the literature. As in Greene (2007, Table 2.11), we use this panel data example as a cross-sectional one only for the purpose of illustration.
24 Inconsistent with the statements of Waldman (1982) and Greene (2007), the MLE with positive $\lambda$ achieves a slightly bigger value of the log-likelihood than OLS for this dataset. Similarly, the inconsistency between OLS and MLE in the presence of positive OLS residual skewness when using FRONTIER is discussed by Simar and Wilson (2009). Greene (2007, p.0) notes: "... for this data set, and more generally, when the OLS residuals are positively skewed, then there is a second maximizer of the log-likelihood, OLS, that may be superior to the stochastic frontier."

Instead, we use the constrained MLE (and constrained COLS), a finite sample adjustment to the existing MLE (and COLS). The optimal value of $k_0$ can be chosen by the $BIC(k_0)$ (and the $C(k_0)$ for the constrained COLS) proposed above. For purposes of illustration, we present the constrained MLE results for $k_0 = 0.05$, 0.1, 0.15 and 0.2 in columns (6)-(13) of Table 2 and compare the values of $BIC(k_0)$, showing that $\tilde{k}_0 = 0.15$ achieves the minimum of $BIC(k_0)$. Thus, the constrained MLE of $\lambda$ and $\sigma_u$ are positive, 0.689 and 0.1015 respectively. Furthermore, consistent with the negative population skewness of the composed error, the skewness of the constrained MLE residuals (−0.0599) has the desired sign. Since the constraint only slightly adjusts the coefficients, as expected, the rest of the constrained MLE coefficients are very close to the unconstrained MLE and OLS. For example, the constrained estimate of the coefficient on Log fuel is 0.3907 (column 10), while its unconstrained counterpart is 0.3836 (column 4) and the OLS coefficient is 0.388 (column 2). Consistent with the analysis in Section 4.2, the difference between the constrained MLE slope coefficients and their OLS (and unconstrained MLE) counterparts is positively related to the magnitude of $k_0$: the bigger the value of $k_0$, the larger the difference. However, this difference is relatively small. For example, the constrained estimate of the coefficient on Log fuel using $k_0 = 0.2$ is 0.3939 (column 12 of Table 2), compared with the OLS 0.388 and the unconstrained MLE 0.3836 (columns 2 and 4 of Table 2). This is also the case for $\sigma_v$ and $\sigma$. In stark contrast to this small difference in the slope coefficients, the residual skewness and the estimated $k$ change significantly, since they are implicitly determined by the chosen value of $k_0$ in the constraint. Another important point observed in Table 2 is that the value of the likelihood decreases with $k_0$.25
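As a quick illustrative check (not the authors' code), $BIC(k_0) = -2\,l_r(k_0) - k_0\ln N$ can be recomputed from the log-likelihoods reported in Table 2 with $N = 256$ (we read the partially printed $k_0 = 0.1$ value as 105.002, an assumption), and the implied mean technical efficiency $2\exp(\hat{\sigma}_u^2/2)[1 - \Phi(\hat{\sigma}_u)]$ evaluated at the constrained MLE value $\hat{\sigma}_u = 0.1015$:

```python
from math import erf, exp, log, sqrt

# BIC(k0) = -2*l_r(k0) - k0*ln(N), using the Table 2 log-likelihoods, N = 256
N = 256
loglik = {0.05: 105.047, 0.10: 105.002, 0.15: 104.898, 0.20: 104.707}
bic = {k0: -2.0 * l - k0 * log(N) for k0, l in loglik.items()}
k0_tilde = min(bic, key=bic.get)   # the BIC-minimizing lower bound

# Mean technical efficiency 2*exp(sigma_u^2/2)*(1 - Phi(sigma_u)) at sigma_u = 0.1015
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
sigma_u = 0.1015
mean_te = 2.0 * exp(sigma_u ** 2 / 2.0) * (1.0 - Phi(sigma_u))
```

The minimum is attained at $k_0 = 0.15$, in line with $\tilde{k}_0 = 0.15$ above, and the implied mean technical efficiency is about 0.92.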
The results of the constrained COLS are reported in columns (6)-(13) of Table 3 and are very close to their constrained MLE counterparts for the given values of $k_0 = 0.05$, 0.1, 0.15 and 0.2.26 However,

25 This property can be obtained from equation (3) in Waldman (1982, p. 278), which implies that near OLS the change in the log-likelihood satisfies
$$\Delta l \propto \frac{\lambda^3}{6s^3}\left(1 - \frac{4}{\pi}\right)\sum_{i=1}^N e_i^3,$$
where $\lambda$ can be regarded as increasing from 0, as in the analysis in Section 3.1. Since $1 - 4/\pi < 0$, in the presence of wrong skewness ($\sum_{i=1}^N e_i^3 > 0$) the log-likelihood decreases with the imposed value of $\lambda$ (and $k_0$).
26 The constant term is calculated as the OLS intercept plus $\sqrt{2/\pi}\,\hat{\sigma}_u$. The standard error formulas for the COLS estimators of the constant term, $\sigma_u^2$ and $\sigma_v^2$ (not $\beta$) can be found in Coelli (1995).

for the constrained COLS, the optimal value of $k_0$ is 0.1, obtained by applying the Mallows $C_p$-type criterion $C(k_0)$ proposed above. (Table 3 reports $N \cdot C(k_0)$ instead of $C(k_0)$.) This is slightly different from $\tilde{k}_0 = 0.15$ obtained by minimizing $BIC(k_0)$ for the constrained MLE. The constrained COLS of $\sigma_u$ is therefore 0.0853 and the skewness is −0.0325, in column (8). It is worth mentioning that the value of the criterion $C(0.15)$ is nearly equal to $C(0.1)$ in this empirical example, implying that $BIC(k_0)$ for the constrained MLE and $C(k_0)$ for the constrained COLS result in similar optimal values of $k_0$.

Since the proposed finite sample adjustment restricts the signal-to-noise ratio, it indirectly affects the estimated $\sigma_u$; in this example, it is 0.1015 for the constrained MLE. Consequently, the mean technical efficiency estimate, $2\exp(\hat{\sigma}_u^2/2)[1 - \Phi(\hat{\sigma}_u)]$, depends on the chosen value of $k_0$. However, efficiency rankings appear to be preserved under different choices of $k_0$. For the unconstrained MLE, the least efficient firm is the 79th, with technical efficiency 0.8958. If we impose $k_0 = 0.05$, 0.1, 0.15 or 0.2 in the constraint, its technical efficiency becomes 0.8583, 0.8308, 0.8015 and 0.77, respectively, and it remains the lowest among the 256 firms. The most efficient firm is the 50th, with technical efficiency 0.9696, 0.9669, 0.9655, 0.9644 and 0.9636 for the unconstrained MLE and the constrained MLE with $k_0 = 0.05$, 0.1, 0.15 and 0.2, respectively. This is also the case for the median firm.

7 Conclusion

This paper studies the wrong skewness issue in parametric stochastic frontier models. Following Simar and Wilson's (2010) view, we regard wrong skewness as a consequence of estimation in finite samples when the signal-to-noise ratio is small. In finite samples, the data may fail to be informative enough to detect the existence of the inefficiency term in stochastic frontier models, even though the population signal-to-noise ratio could be fairly large. Thus, the resulting residuals could display skewness in either direction with a probability as high as 50%.
Instead of respecification, the interval estimates proposed by Simar and Wilson (2010), or redrawing a sample in the presence of wrong skewness, we propose a feasible finite sample adjustment to existing estimates. When there is evidence of inefficiency, it is reasonable to impose a lower bound on the signal-to-noise ratio in the normal-half normal model, which is equivalent to a negative upper bound

on the residual skewness. Thus, we propose to use this negative bound on the residual skewness as a constraint in the MLE and COLS in the event of wrong skewness. The idea of the proposed constrained estimators is to slightly adjust the slope coefficients in finite samples. They provide a point estimate that yields a negative residual skewness, though a correct sign of the residual skewness is not always guaranteed. Since the constraint is based on $k_0$, the choice of $k_0$ affects the estimation results, and a model selection approach is proposed to select $k_0$. Monte Carlo experiments show that the bias of the constrained estimates is less of a concern when the sample size is large and the signal-to-noise ratio increases. The empirical example in this paper also shows that the value of $k_0$ has little effect on the estimated slope coefficients, $\sigma_v$ and $\sigma$, while the residual skewness and the estimated $k$ are implicitly determined by the value of $k_0$. In this sense, the proposed method can be regarded as a finite sample adjustment to existing estimators, rather than a new estimator. When the sample size is large, wrong skewness is less likely to occur, so such an adjustment becomes unnecessary.

Table 1: Monte Carlo Results: Constrained COLS

Columns: (1) N; (2) average k̂0; (3) frequency*; (4)-(7) β0: bias, OLS bias, rmse, OLS rmse; (8)-(11) β1: bias, OLS bias, rmse, OLS rmse; (12) σu bias; (13) σv bias; (14) k bias; (15) mean efficiency bias.

k = 0.1
  50   0.059  0.399  -0.05   -0.067  0.458  0.088  0.009  0.000  0.11   0.022  0.08    0.006   0.083  -0.05
 100   0.051  0.394  -0.037  -0.068  0.527  0.078  0.017  0.000  0.70   0.015  0.091   0.000   0.07   -0.03
 200   0.050  0.374  -0.017  -0.067  0.533  0.07   0.005  0.000  0.140  0.011  0.064  -0.001   0.079  -0.04
k = 0.2
  50   0.059  0.395  -0.063  -0.096  0.344  0.111  0.009  0.000  0.105  0.022  0.08    0.003  -0.06    0.006
 100   0.05   0.379  -0.057  -0.095  0.363  0.103  0.006  0.000  0.111  0.015  0.034   0.003  -0.036   0.007
 200   0.050  0.353  -0.041  -0.095  0.437  0.098  0.004  0.000  0.10   0.011  0.080   0.000  -0.045   0.006
k = 0.3
  50   0.061  0.366  -0.075  -0.117  0.297  0.130  0.005  0.000  0.087  0.022  0.018   0.004  -0.18    0.07
 100   0.05   0.340  -0.076  -0.117  0.373  0.13   0.006  0.000  0.097  0.015  0.035   0.006  -0.161   0.030
 200   0.050  0.31   -0.065  -0.117  0.35   0.10   0.000  0.000  0.091  0.011  0.031   0.00   -0.164   0.036
k = 0.5
  50   0.06   0.319  -0.110  -0.150  0.27   0.160  0.006  0.000  0.086  0.022  -0.010  0.009  -0.35    0.061
 100   0.053  0.284  -0.105  -0.150  0.26   0.155  0.003  0.000  0.069  0.015  -0.005  0.008  -0.380   0.070
 200   0.050  0.248  -0.108  -0.151  0.236  0.153  0.004  0.000  0.087  0.010   0.006  0.009  -0.400   0.073
k = 0.7
  50   0.068  0.270  -0.138  -0.176  0.215  0.185  0.003  0.000  0.076  0.023  -0.038  0.013  -0.571   0.09
 100   0.054  0.208  -0.138  -0.179  0.215  0.183  0.003  0.000  0.057  0.016  -0.039  0.013  -0.606   0.099
 200   0.050  0.163  -0.138  -0.178  0.188  0.180  0.00   0.000  0.045  0.011  -0.041  0.014  -0.64    0.105

Note: 1. Column (3) reports the frequency of wrong skewness after the proposed finite sample correction. 2. Columns (5), (7), (9) and (11) refer to the bias and root mean squared error of the OLS estimates of β0 and β1.

Table 2: Estimates of the Airlines Example: Constrained MLE

Dependent variable: Log output. Estimates with standard errors in parentheses. Columns: (2)/(3) OLS; (4)/(5) MLE; (6)/(7) k0 = 0.05; (8)/(9) k0 = 0.1; (10)/(11) k0 = 0.15; (12)/(13) k0 = 0.2.

Log fuel          0.388 (0.071)     0.3836 (0.0707)   0.3857 (0.0696)   0.3879 (0.0697)   0.3907 (0.0698)   0.3939 (0.0700)
Log materials     0.7192 (0.0773)   0.7167 (0.0800)   0.7097 (0.0685)   0.7028 (0.0685)   0.6940 (0.0685)   0.6836 (0.0685)
Log equipment     0.2192 (0.0739)   0.2196 (0.0731)   0.2201 (0.0730)   0.2206 (0.0729)   0.2210 (0.0729)   0.2215 (0.0729)
Log labor        -0.4101 (0.0645)  -0.4114 (0.0648)  -0.4146 (0.0618)  -0.4178 (0.0618)  -0.4219 (0.0618)  -0.4267 (0.0618)
Log property      0.1880 (0.098)    0.1897 (0.0336)   0.1945 (0.0184)   0.1992 (0.0184)   0.2053 (0.0184)   0.2126 (0.0185)
constant         -0.9105           -0.8562 (0.1835)  -0.8442 (0.1103)  -0.8361 (0.0959)  -0.8279 (0.0870)  -0.8199 (0.0814)
σv                0.1608            0.1553 (0.0338)   0.1527 (0.019)    0.1508 (0.0196)   0.1486 (0.0180)   0.1463 (0.0169)
σu                0.0000            0.0676 (0.098)    0.0820 (0.1079)   0.0917 (0.0848)   0.1015 (0.069)    0.1110 (0.0586)
σ²                0.0259            0.0287 (0.018)    0.0301 (0.0116)   0.0311 (0.0103)   0.0324 (0.0095)   0.0337 (0.0089)
λ                 0.0000            0.4351 (0.48)     0.5371 (0.188)    0.6082 (0.103)    0.689 (0.0858)    0.7587 (0.0740)
skewness          0.0167            0.0093           -0.0115           -0.0325           -0.0599           -0.0927
k                 0.0000            0.0644            0.0949            0.1185            0.1449            0.1730
log-likelihood  105.0588          105.0617          105.047           105.002           104.898           104.707
BIC(k0)                          -210.123          -210.371          -210.558          -210.627          -210.524

Table 3: Estimates of the Airlines Example: Constrained COLS

Dependent variable: Log output. Estimates with standard errors in parentheses. Columns: (2)/(3) OLS; (4)/(5) constrained MLE with k0 = 0.1; (6)/(7) COLS k0 = 0.05; (8)/(9) COLS k0 = 0.1; (10)/(11) COLS k0 = 0.15; (12)/(13) COLS k0 = 0.2.

Log fuel          0.388 (0.071)     0.3879 (0.0697)   0.3860 (0.0701)   0.3883 (0.0701)   0.3912 (0.0702)   0.3947 (0.0703)
Log materials     0.7192 (0.0773)   0.7028 (0.0685)   0.7098 (0.0692)   0.7029 (0.0692)   0.6940 (0.0692)   0.6835 (0.0693)
Log equipment     0.2192 (0.0739)   0.2206 (0.0729)   0.2197 (0.0737)   0.2200 (0.0738)   0.2204 (0.0738)   0.2210 (0.0739)
Log labor        -0.4101 (0.0645)  -0.4178 (0.0618)  -0.4145 (0.065)   -0.4177 (0.065)   -0.4218 (0.065)   -0.4266 (0.066)
Log property      0.1880 (0.098)    0.1992 (0.0184)   0.1944 (0.0185)   0.1991 (0.0186)   0.2052 (0.0186)   0.2123 (0.0186)
constant         -0.9105           -0.8361 (0.0959)  -0.8619           -0.8417           -0.8258           -0.811
σv                0.1608            0.1508 (0.0196)   0.1567            0.1525            0.1481            0.1437
σu                0.0000            0.0917 (0.0848)   0.0604            0.0853            0.1046            0.1212
σ²                0.0259            0.0311 (0.0103)   0.0282            0.0305            0.0329            0.0353
λ                 0.0000            0.6082 (0.103)    0.3854            0.5593            0.7064            0.8436
skewness          0.0167           -0.0325           -0.0115           -0.0325           -0.0599           -0.0927
k                 0.0000            0.1185            0.0512            0.1021            0.1535            0.2055
SSR               6.596                               6.598             6.602             6.611             6.625
N·C(k0)           6.596                               6.591             6.588             6.589             6.597

References

[1] Aigner, D.J., C.A.K. Lovell, and P. Schmidt, 1977, Formulation and Estimation of Stochastic Frontier Production Function Models, Journal of Econometrics 6, 21-37.
[2] Amemiya, T., 1985, Advanced Econometrics, Cambridge, MA: Harvard University Press.
[3] Almanidis, P. and R.C. Sickles, 2011, The Skewness Issue in Stochastic Frontier Models: Fact or Fiction? In I. van Keilegom and P.W. Wilson (Eds.), Exploring Research Frontiers in Contemporary Statistics and Econometrics. Springer Verlag, Berlin Heidelberg.
[4] Almanidis, P., J. Qian and R. Sickles, 2014, Stochastic Frontier Models with Bounded Inefficiency, in Festschrift in Honor of Peter Schmidt: Econometric Methods and Applications, R.C. Sickles and W.C. Horrace (Eds.), Springer Science & Business Media, New York, NY, 47-81.
[5] Badunenko, O., D. Henderson and S. Kumbhakar, 2012, When, Where and How to Perform Efficiency Estimation, Journal of the Royal Statistical Society, Series A, 175, 863-892.
[6] Bai, J. and S. Ng, 2002, Determining the Number of Factors in Approximate Factor Models, Econometrica, 70, 191-221.
[7] Carree, M., 2002, Technological Inefficiency and the Skewness of the Error Component in Stochastic Frontier Analysis, Economics Letters 77, 101-107.
[8] Coelli, T., 1995, Estimators and Hypothesis Tests for a Stochastic Frontier Function: A Monte Carlo Analysis, Journal of Productivity Analysis, 6, 247-268.
[9] Coelli, T., 1996, A Guide to FRONTIER Version 4.1: A Computer Program for Stochastic Frontier Production and Cost Function Estimation, CEPA Working Paper 96/07, Centre for Efficiency and Productivity Analysis, University of New England, Armidale, NSW 2351, Australia.
[10] Green, A. and D. Mayes, 1991, Technical Inefficiency in Manufacturing Industries, Economic Journal 101, 523-538.
[11] Greene, W., 1980, On the Estimation of a Flexible Frontier Production Model, Journal of Econometrics, 13, 101-115.
[12] Greene, W., 1995, LIMDEP Version 7.0 User's Manual, New York: Econometric Software, Inc.