
Window Width Selection for L2 Adjusted Quantile Regression

Yoonsuh Jung, The Ohio State University
Steven N. MacEachern, The Ohio State University
Yoonkyung Lee, The Ohio State University

Technical Report No. 835
April, 2010

Department of Statistics
The Ohio State University
1958 Neil Avenue
Columbus, OH 43210-1247

Abstract

Quantile regression provides estimates of a range of conditional quantiles. This stands in contrast to traditional regression techniques, which focus on a single conditional mean function. Quantile regression in the finite sample setting can be made more efficient and robust by rounding the sharp corner of the loss. The main modification generally involves an asymmetric l2 adjustment of the loss function around zero. The resulting modified loss has qualitatively the same shape as Huber's loss when estimating a conditional median. To achieve consistency in the large sample case, the range of l2 adjustment is controlled by a sequence which decays to zero as the sample size increases. Through extensive simulations, a rule is established to decide the range of modification. The simulation studies reveal excellent finite sample performance of modified regression quantiles guided by the rule.

KEYWORDS: Case indicator; check loss function; penalization method; quantile regression

1 Introduction

Quantile regression has emerged as a useful tool for providing estimates of conditional quantiles of a response variable Y given values of a predictor X. It allows us to estimate not only the center but also the upper and lower tails of the conditional distribution of interest. Because it captures the full conditional distribution, rather than only the conditional mean, quantile regression has been widely applied. Koenker & Bassett (1978) and Bassett & Koenker (1978) laid the foundation for quantile regression; this foundation was extended to non-iid errors in the linear model by He (1997) and Koenker & Zhao (1994). The loss function that defines quantile regression is called the check loss. The check loss has an asymmetric V-shape and becomes symmetric for the median.
Lee, MacEachern & Jung (2007) introduced a version of quantile regression in which the check loss function is adjusted by an asymmetric l2 penalty to produce a more efficient quantile estimator. The modification of the loss function arises from including case-specific parameters in the model: an additional penalty on the case-specific parameters creates an adjustment of the check loss function over an interval. See Lee et al. (2007) for more details.

The purpose of this paper is to provide a rule for determining the length of the interval of adjustment in the check loss function. To obtain a consistent estimator, the modification must vanish as the sample size grows. A brief theoretical review of l2 adjusted quantile regression is given in Section 2. In Section 3, extensive simulations are performed to develop a rule which provides guidance on implementation of the modified procedure. The performance of the rule is demonstrated in Section 4 through simulation and real data. Discussion and potential extensions appear in Section 5.

2 Overview of l2 Adjusted Quantile Regression

To estimate the qth regression quantile, the check loss function ρ_q is employed:

\[
\rho_q(r) = \begin{cases} q\,r & \text{for } r \ge 0, \\ (q-1)\,r & \text{for } r < 0. \end{cases} \tag{1}
\]

We first consider a linear model of the form y_i = x_i'β + ε_i, where the ε_i are iid from some distribution with qth quantile equal to zero. The quantile regression estimator \(\hat\beta\) is the minimizer of

\[
L(\beta) = \sum_{i=1}^{n} \rho_q(y_i - x_i'\beta). \tag{2}
\]

To treat the observations in a systematic fashion, Lee et al. (2007) introduce case-specific parameters γ_i, which change the linear model to y_i = x_i'β + γ_i + ε_i. Because this is a super-saturated model, γ = (γ_1, ..., γ_n)' must be penalized. With the case-specific parameters and an additional penalty for γ, the objective function in (2) is modified to

\[
L(\beta, \gamma) = \sum_{i=1}^{n} \rho_q(y_i - x_i'\beta - \gamma_i) + \frac{\lambda_\gamma}{2} J(\gamma), \tag{3}
\]

where J(γ) is the penalty for γ and λ_γ is a penalty parameter. Since the check loss function is piecewise linear, the quantile regression estimator is inherently robust. To improve efficiency, an l2-type penalty for the γ_i is considered. As detailed in Lee et al. (2007), the desired invariance suggests an asymmetric l2 penalty of the form

\[
J(\gamma_i) := \frac{q}{1-q}\,\gamma_i^2\, I(\gamma_i \ge 0) + \frac{1-q}{q}\,\gamma_i^2\, I(\gamma_i < 0).
\]

With this J(γ_i), let us examine the minimizing values of the γ_i, given β. First, note that min_γ L(\(\hat\beta\), γ) decouples into minimizations over the individual γ_i. Hence, given \(\hat\beta\) and a residual r_i = y_i - x_i'\(\hat\beta\), \(\hat\gamma_i\) is defined as

\[
\hat\gamma_i = \arg\min_{\gamma_i} L_{\lambda_\gamma}(\hat\beta, \gamma_i) := \rho_q(r_i - \gamma_i) + \frac{\lambda_\gamma}{2} J(\gamma_i), \tag{4}
\]

and is explicitly given by

\[
\hat\gamma_i = -\frac{q}{\lambda_\gamma}\, I\!\left(r_i < -\frac{q}{\lambda_\gamma}\right) + r_i\, I\!\left(-\frac{q}{\lambda_\gamma} \le r_i < \frac{1-q}{\lambda_\gamma}\right) + \frac{1-q}{\lambda_\gamma}\, I\!\left(r_i \ge \frac{1-q}{\lambda_\gamma}\right).
\]

Plugging \(\hat\gamma_i\) into (4) produces the l2 adjusted check loss

\[
\rho_q^\gamma(r) = \begin{cases}
(q-1)\,r - \dfrac{q(1-q)}{2\lambda_\gamma} & \text{for } r < -\dfrac{q}{\lambda_\gamma}, \\[4pt]
\dfrac{\lambda_\gamma (1-q)}{2q}\, r^2 & \text{for } -\dfrac{q}{\lambda_\gamma} \le r < 0, \\[4pt]
\dfrac{\lambda_\gamma\, q}{2(1-q)}\, r^2 & \text{for } 0 \le r < \dfrac{1-q}{\lambda_\gamma}, \\[4pt]
q\,r - \dfrac{q(1-q)}{2\lambda_\gamma} & \text{for } r \ge \dfrac{1-q}{\lambda_\gamma}.
\end{cases} \tag{5}
\]
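The pieces above fit together as follows. A minimal NumPy sketch (the function names are ours, not the authors'); the final assertion checks that profiling γ_i out of (4) via the closed-form minimizer indeed reproduces the piecewise loss (5):

```python
import numpy as np

def check_loss(r, q):
    """Check loss (1): q*r for r >= 0, (q-1)*r for r < 0."""
    r = np.asarray(r, dtype=float)
    return np.where(r >= 0, q * r, (q - 1.0) * r)

def penalty_J(g, q):
    """Asymmetric l2 penalty J(gamma_i)."""
    g = np.asarray(g, dtype=float)
    return np.where(g >= 0, q / (1.0 - q) * g**2, (1.0 - q) / q * g**2)

def gamma_hat(r, q, lam):
    """Minimizer (4): residuals inside (-q/lam, (1-q)/lam) are absorbed
    entirely by gamma_i; outside, gamma_i is clipped to the endpoints."""
    return np.clip(np.asarray(r, dtype=float), -q / lam, (1.0 - q) / lam)

def adjusted_check_loss(r, q, lam):
    """l2 adjusted check loss (5): quadratic inside the window, linear outside."""
    r = np.asarray(r, dtype=float)
    lower, upper = -q / lam, (1.0 - q) / lam
    out = np.empty_like(r)
    left, right = r < lower, r >= upper
    mid_neg = (r >= lower) & (r < 0)
    mid_pos = (r >= 0) & (r < upper)
    out[left] = (q - 1.0) * r[left] - q * (1.0 - q) / (2.0 * lam)
    out[mid_neg] = lam * (1.0 - q) / (2.0 * q) * r[mid_neg]**2
    out[mid_pos] = lam * q / (2.0 * (1.0 - q)) * r[mid_pos]**2
    out[right] = q * r[right] - q * (1.0 - q) / (2.0 * lam)
    return out

# Profiling gamma out of (4) reproduces (5) on a grid of residuals.
q, lam = 0.2, 1.5
r = np.linspace(-3.0, 3.0, 1001)
g = gamma_hat(r, q, lam)
profiled = check_loss(r - g, q) + lam / 2.0 * penalty_J(g, q)
assert np.allclose(profiled, adjusted_check_loss(r, q, lam))
```

The quadratic region (-q/λ_γ, (1-q)/λ_γ) has length 1/λ_γ, which is the window width whose selection this paper studies.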

In other words, l2 adjusted quantile regression finds the β that minimizes

\[
L_{\lambda_\gamma}(\beta) = \sum_{i=1}^{n} \rho_q^\gamma(y_i - x_i'\beta).
\]

Note that the modified check loss is continuous and differentiable everywhere. The interval of quadratic adjustment is (-q/λ_γ, (1-q)/λ_γ), and we refer to the length of this interval, 1/λ_γ, as the window width. When λ_γ is properly chosen, the modified procedure realizes its full advantage. The next section addresses how to set a good rule for the selection of λ_γ.

3 Simulation Study

To develop a rule and obtain a consistent estimator, we first consider λ_γ of the form λ_γ := c_q n^α / σ̂, where c_q is a constant depending on q, n is the sample size, α is a positive constant, and σ̂ is a robust estimate of the scale of the error distribution. Theorem 2 in Lee et al. (2007) shows that for α > 1/3, the modified quantile regression is asymptotically equivalent to standard quantile regression. However, for good finite sample performance, we consider a range of α values. We use 1.4826 × MAD (median absolute deviation) as the robust scale estimator σ̂. The form of the rule suggests that c_q should be scale invariant and depend only on the targeted quantile q.

In this section, the choice of window width is investigated by simulation. Throughout the simulation, the linear model y_i = β_0 + x_i'β + ε_i is assumed. Following the simulation setting in Tibshirani (1996), x = (x_1, ..., x_8)' is generated from a multivariate normal distribution with mean (0, ..., 0)' and covariance Σ, where Σ_ij = ρ^|i-j| with ρ = 0.5. The true coefficient vector β is taken to be (3, 1.5, 0, 0, 2, 0, 0, 0)'. Various distributions are considered for ε_i, including normal, t, shifted log-normal, shifted gamma, and shifted exponential error distributions. For each distribution, the ε_i are iid with median zero and variance 9 (except when ε_i follows the standard normal distribution).
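As an illustration of this design, the following sketch (our construction; the seed, the intercept, and the c_q value are illustrative assumptions, not values from the paper) generates one data set with shifted exponential errors and forms λ_γ = c_q n^α / σ̂:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tibshirani (1996) design: Sigma_ij = 0.5**|i-j|, beta = (3, 1.5, 0, 0, 2, 0, 0, 0)'
p = 8
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
beta0 = 1.0  # illustrative intercept (not specified in the text)

n = 100
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Shifted exponential errors: Exp(scale 3) has variance 9; subtracting the
# median 3*log(2) recenters the distribution to median zero.
scale = 3.0
eps = rng.exponential(scale, size=n) - scale * np.log(2.0)
y = beta0 + X @ beta + eps

# Robust scale estimate sigma_hat = 1.4826 * MAD (computed here from the true
# errors; in practice it comes from residuals of a preliminary fit).
sigma_hat = 1.4826 * np.median(np.abs(eps - np.median(eps)))

alpha, c_q = 0.3, 0.05  # c_q is illustrative; Section 3.1 develops the rule for it
lam = c_q * n**alpha / sigma_hat
window_width = 1.0 / lam  # length of the interval of quadratic adjustment
```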
For the t distributions, 2.25, 5, and 10 degrees of freedom are used, scaled to maintain a variance of 9. Several values of α were tried. After examining the results, we decided to set α equal to 0.3. With α fixed, the rule no longer varies with the sample size, and we search only for c_q. Sample sizes range from 10^2 to 10^4, and various quantiles from 0.1 to 0.9 are considered.

To gauge the performance of l2 adjusted quantile regression with λ_γ, define the mean squared error (MSE) of the estimated quantile X'\(\hat\beta\) + \(\hat\beta_0\) at a new X as

\[
MSE = E_{\hat\beta, X}\left\{ (X'\hat\beta + \hat\beta_0) - (X'\beta + \beta_0) \right\}^2
    = E_{\hat\beta, X}\left\{ (\hat\beta - \beta)' X X' (\hat\beta - \beta) + (\hat\beta_0 - \beta_0)^2 \right\}
    = E_{\hat\beta}\left\{ (\hat\beta - \beta)' \Sigma (\hat\beta - \beta) + (\hat\beta_0 - \beta_0)^2 \right\}. \tag{6}
\]

The MSE is integrated across the distribution of a future X, which is normal with mean (0, ..., 0)' and covariance Σ. In the simulation, the MSE is approximated by a Monte Carlo estimate over 500 replicates,

\[
\widehat{MSE} = \frac{1}{500} \sum_{i=1}^{500} \left\{ (\hat\beta^i - \beta)' \Sigma (\hat\beta^i - \beta) + (\hat\beta_0^i - \beta_0)^2 \right\},
\]

where \(\hat\beta^i\) and \(\hat\beta_0^i\) are the estimates of β and the intercept β_0 for the ith replicate, respectively. With α fixed, the window width σ̂/(c_q n^α) is a function of the constant c_q only. Thus, by varying c_q, an optimal window width which provides the smallest MSE can

be obtained. The optimal window widths, found by a grid search, are shown in Figure 1 for various error distributions. Each panel of Figure 2 shows a typical shape of the MSE curve as a function of window width. In general, MSE values decrease as the window width is increased from zero until the MSE reaches its minimum, and increase thereafter due to increasing bias. However, when estimating the median with normally distributed errors, the MSE decreases as the window width increases. This is not surprising, given the optimality properties of least squares regression for normal theory regression.

The comparison between the sample mean and the sample median can be made explicit under t error distributions with different degrees of freedom: the benefit of the median relative to the mean is greater for thicker-tailed distributions. We observe that this qualitative behavior carries over to the optimal window width. Thicker tails lead to shorter optimal windows, as shown in Figure 1.

3.1 Development of a Rule

Under each error distribution mentioned above, the optimal constants yielding the smallest MSE are found at the quantiles 0.1, 0.2, ..., 0.9. First, omitting the median, the log of the optimal constant, log(c_q), from the standard normal error distribution is regressed on q; a significant linear relationship exists. The fitted values from this regression were used to produce values for c_q, which were then applied to the other error distributions. However, the rule obtained from the normal distribution led to poor MSE values when applied to skewed error distributions. This is due to overestimation of the window width, or equivalently, underestimation of c_q near the median. As Figure 2 shows, too large a window may lead to a huge MSE. As an alternative, another rule expressing the relationship between the optimal log(c_q) and q was developed from the exponential error distribution.
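For each candidate c_q, the grid search described above reduces to evaluating the Monte Carlo MSE in (6). A minimal sketch (our helper names; the per-replicate estimates would come from fitting the modified quantile regression, which is not shown here):

```python
import numpy as np

def mc_mse(beta_hats, beta0_hats, beta, beta0, Sigma):
    """Monte Carlo estimate of (6): the average over replicates of
    (beta_hat - beta)' Sigma (beta_hat - beta) + (beta0_hat - beta0)**2."""
    diffs = np.asarray(beta_hats) - beta            # shape (replicates, p)
    quad = np.einsum('ri,ij,rj->r', diffs, Sigma, diffs)
    return float(np.mean(quad + (np.asarray(beta0_hats) - beta0) ** 2))

def best_c_q(c_q_grid, mse_values):
    """Grid search: return the c_q whose estimated MSE is smallest."""
    return c_q_grid[int(np.argmin(mse_values))]

# Toy check with p = 2, Sigma = I: each replicate misses beta by a unit vector,
# so the quadratic term is 1 for every replicate and the intercept is exact.
Sigma = np.eye(2)
beta, beta0 = np.zeros(2), 0.0
beta_hats = np.array([[1.0, 0.0], [0.0, 1.0]])
beta0_hats = np.zeros(2)
assert abs(mc_mse(beta_hats, beta0_hats, beta, beta0, Sigma) - 1.0) < 1e-12
```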
The top left plot in Figure 3 shows the relationship between the optimal log(c_q) and q. Before fitting the linear model log(c_q) = β_0 + β_1 q + ε, quantiles q greater than 0.5 were converted to 1 - q, since it was judged desirable to have a rule which works well for symmetric distributions. The solid line in the top right plot of Figure 3 is the fitted line using all observations, whereas the dashed line uses only the observations with q ≥ 0.5, excluding the marked observations. The dashed line is adopted as the final rule.

The final rule is compared to the corresponding rules from the normal, t, log-normal, and gamma distributions. In Figure 3, the solid lines in the second and third rows represent the optimal rules from each of these distributions (developed on quantiles ≥ 0.5), whereas the dashed line is the final rule. The numerical expression of the final rule is

\[
c_q = \begin{cases} 0.5\, e^{-2.118 - 1.097\, q} & \text{for } q < 0.5, \\ 0.5\, e^{-2.118 - 1.097\, (1-q)} & \text{for } q \ge 0.5, \end{cases} \tag{7}
\]

where q stands for the qth quantile. Under various error distributions, the c_q estimated from rule (7) is used to gauge its prediction performance. Specifically, MSE values for standard quantile regression, modified

[Figure 1 shows six panels of error densities f(x): N(0,1), t(df=2.25), t(df=10), Gamma(shape=3, scale=sqrt(3)), Log-Normal, and Exp(3), with the optimal adjustment intervals overlaid.]

Figure 1: Optimal intervals of adjustment for different quantiles (q), sample sizes (n), and error distributions. The vertical lines in each panel indicate the true quantiles. The stacked horizontal lines at each quantile are the corresponding optimal intervals; the five intervals at each quantile are for n = 10^2, 10^2.5, 10^3, 10^3.5, and 10^4.

[Figure 2 shows MSE versus window width (0 to 15) in two panels.]

Figure 2: MSE values evaluated at one hundred points and connected by a smoothing spline. The smallest and largest window widths in each plot correspond to windows containing approximately 5% and 98% of the data, respectively. The residual distribution is t (df = 10), the sample sizes are 10^2 (left panel) and 10^3 (right panel), and the 0.2 quantile is estimated. The horizontal lines represent the MSE values from standard quantile regression.

quantile regression with the optimal c_q, and modified quantile regression with c_q chosen by the final rule are compared. Figures 6 through 11 show the behavior of the three procedures in terms of MSE. Overall, the rule-based procedure handily outperforms standard quantile regression. Surprisingly, the finite sample performance of this modified quantile regression is often nearly optimal, and the near-optimality extends across a range of residual distributions.

In practice, the robust linear modeling function rlm in the R package MASS can be used. Equipped with the derivative of (5), the modified estimators can be obtained from the rlm function by specifying q and the corresponding rule-based c_q. Since the rlm function internally uses the re-scaled MAD as its scale estimate, the estimate of the scale parameter in λ_γ is obtained automatically.

4 Application to Engel's Data

Engel's data consist of household food expenditure and household income for 235 European working-class households in the 19th century. Taking the log of food expenditure as the response variable, we investigate the relation between log food expenditure and log household income. In Figure 4, Engel's data are plotted after transformation of both variables. Superimposed on the scatter plot are the fitted lines from standard quantile regression and from modified quantile regression using the rule developed in Section 3.
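Rule (7) and the resulting penalty parameter can be encoded directly. A sketch with our own function names (the modified fit itself would then be computed by iteratively reweighted least squares with the derivative of (5) as the psi function, as the paper does via rlm):

```python
import numpy as np

def c_q_rule(q):
    """Final rule (7) for the constant c_q; symmetric about q = 0.5."""
    qq = q if q < 0.5 else 1.0 - q
    return 0.5 * np.exp(-2.118 - 1.097 * qq)

def lambda_gamma(q, n, sigma_hat, alpha=0.3):
    """Penalty parameter lambda_gamma = c_q * n**alpha / sigma_hat;
    the window width of the l2 adjustment is 1 / lambda_gamma."""
    return c_q_rule(q) * n**alpha / sigma_hat

# The rule treats q and 1 - q identically, as intended for symmetric errors.
assert abs(c_q_rule(0.1) - c_q_rule(0.9)) < 1e-15
```

Because λ_γ grows with n, the window width 1/λ_γ shrinks as the sample size increases, which is what drives the consistency of the modified estimator.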
Although the two methods display quite similar fitted lines, Figure 5 reveals the difference between

[Figure 3 shows panels of log(c_q) versus q: Exp(3) in the top row; N(0,1) and t(df=2.25) in the middle row; Log-Normal and Gamma(shape=3, scale=sqrt(3)) in the bottom row.]

Figure 3: Top left: relationship between the optimal log(c_q) and the quantile, from the exponential distribution. Top right: the left plot folded in half at q = 0.5; the marked circles are from the left fold (quantiles < 0.5) and the others are from the right fold (quantiles ≥ 0.5). The solid line is the fitted line using all observations, whereas the dashed line excludes the marked observations (final rule). Solid lines in the middle and bottom plots are the rules corresponding to the normal, t, log-normal, and gamma distributions, compared to the final rule (dashed line).

them. We note that the fitted lines from modified quantile regression do not cross over the range of log(household income) in the data. This is partly due to the averaging effect of the l2 adjustment to the check loss function.

[Figure 4 shows a scatter plot of log(food expenditure) against log(household income).]

Figure 4: Superimposed on the scatter plot are the 0.05, 0.1, 0.25, 0.5, 0.75, 0.90, and 0.95 fitted lines from standard quantile regression (solid, blue) and modified quantile regression (dashed, red) for Engel's data, after log transformation of both the response and the predictor.

5 Conclusion

We have shown how case-specific indicators can be utilized in the context of quantile regression through regularization of their parameters. The simulation studies suggest a simple rule for selecting the regularization parameter for the case-specific parameters. The behavior of the newly developed rule is excellent under both symmetric and asymmetric error distributions at any conditional quantile, regardless of the sample size. The analysis of Engel's data also reveals that the modified procedure is less prone to crossing of estimated quantiles than is standard quantile regression (this is confirmed in further investigation not presented here). For large sample behavior, details of the theoretical results and conditions regarding consistency are given in Lee et al. (2007). In terms of computation, modified quantile regression requires only a slight adjustment to existing software. The simulated and real data analyses have shown the potential of l2 adjusted quantile regression and the rule for selecting the window width.

Finally, we wish to point out a possible direction in which our research can be extended. As Koenker & Zhao (1994) and Koenker (2005) considered heteroscedastic models in quantile regression, the scope of our modified quantile regression procedure can

[Figure 5 shows four panels of residuals and fitted-quantile differences plotted against log(household income).]

Figure 5: Top: residuals from a median fit via standard and modified quantile regression. Bottom: differences between the fitted median line and the fitted quantiles at q = 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, and 0.95.

be expanded to include non-iid error models.

References

Bassett, G. & Koenker, R. (1978). Asymptotic theory of least absolute error regression, Journal of the American Statistical Association 73(363): 618-622.

He, X. (1997). Quantile curves without crossing, The American Statistician 51(2): 186-192.

Koenker, R. (2005). Quantile Regression, Cambridge University Press.

Koenker, R. & Bassett, G. (1978). Regression quantiles, Econometrica 46(1): 33-50.

Koenker, R. & Zhao, Q. (1994). L-estimation for linear heteroscedastic models, Journal of Nonparametric Statistics 3(3): 223-235.

Lee, Y., MacEachern, S. N. & Jung, Y. (2007). Regularization of case-specific parameters for robustness and efficiency, Technical Report No. 799, The Ohio State University.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B 58(1): 267-288.

[Figure 6 shows three panels: N(0,1) with n = 10^2, 10^3, and 10^4.]

Figure 6: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under a standard normal error distribution.

[Figure 7 shows three panels: t(df=2.25) with n = 10^2, 10^3, and 10^4.]

Figure 7: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under a t (df = 2.25) error distribution.

[Figure 8 shows three panels: t(df=10) with n = 10^2, 10^3, and 10^4.]

Figure 8: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under a t (df = 10) error distribution.

[Figure 9 shows three panels: Gamma(shape=3, scale=sqrt(3)) with n = 10^2, 10^3, and 10^4.]

Figure 9: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under a gamma (3, sqrt(3)) error distribution.

[Figure 10 shows three panels: Log-Normal with n = 10^2, 10^3, and 10^4.]

Figure 10: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under a log-normal error distribution.

[Figure 11 shows three panels: Exp(3) with n = 10^2, 10^3, and 10^4.]

Figure 11: MSE values from standard quantile regression, modified quantile regression with the optimal window width, and modified quantile regression using the rule, under an exponential (3) error distribution.