ASYMPTOTIC MEAN SQUARE ERRORS OF VARIANCE ESTIMATORS FOR U-STATISTICS AND THEIR EDGEWORTH EXPANSIONS

J. Japan Statist. Soc. Vol. 28 No. 1 1998 1–19

Yoshihiko Maesono*

This paper studies variance estimators for a class of U-statistics. We obtain asymptotic representations of the jackknife, Hinkley's (1978) corrected jackknife, unbiased, Sen's (1960) and a new variance estimator, and we investigate their asymptotic mean square errors theoretically. The Edgeworth expansions of the estimators with remainder term $o(n^{-1})$ are also established. We show that the normalized Hinkley's corrected estimator coincides with the normalized unbiased estimator up to the order $n^{-1/2}o_p(n^{-1})$.

Key words and phrases: Edgeworth expansions, estimation of variance, jackknife estimator, mean square errors, U-statistics.

1. Introduction

Let $X_1,\dots,X_n$ be independently and identically distributed random vectors with distribution function $F(x)$. Let $h(x_1,\dots,x_r)$ be a real valued function which is symmetric in its arguments. For $n \ge r$ let us define a U-statistic by
\[ U_n = \binom{n}{r}^{-1} \sum_{C_{n,r}} h(X_{i_1},\dots,X_{i_r}) \]
where $\sum_{C_{n,r}}$ indicates that the summation is taken over all integers $i_1,\dots,i_r$ satisfying $1 \le i_1 < \cdots < i_r \le n$. $U_n$ is a minimum variance unbiased estimator of $\theta = E[h(X_1,\dots,X_r)]$, and many statistics in common use are members of U-statistics or are approximated by them.

Several variance estimators for the U-statistic have been proposed. Sen (1960) has discussed an estimator of the dominant term $r^2 E[E\{h(X_1,\dots,X_r)\mid X_1\}-\theta]^2$ of the variance $n\sigma_n^2 = n\,\mathrm{Var}(U_n)$ in the case of degree 2, and Sen (1977) extended it to general degree $r$. He also proved the law of large numbers. The jackknife variance estimator $\hat\sigma_J^2$ is given by
\[ \hat\sigma_J^2 = \frac{n-1}{n}\sum_{i=1}^{n} \bigl(U_n^{(i)} - U_n\bigr)^2 \]
where $U_n^{(i)}$ denotes the U-statistic computed from a sample of $n-1$ points with $X_i$ left out. The properties of $\hat\sigma_J^2$ have been studied precisely. Arvesen (1969) has obtained the exact representation of $\hat\sigma_J^2$, which is complicated, and Efron and Stein (1981) have shown that the jackknife variance estimator has positive bias.
The bias reduction for the jackknife variance estimator has been studied by Hinkley (1978), and Efron and Stein (1981).

Received June 1996. Revised February 4, 1998. Accepted November, 1997.
*Faculty of Economics, Kyushu University 7 Hakozaki 6-19-1, Higashi-ku, Fukuoka 81-81, Japan.
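The definitions of $U_n$ and $\hat\sigma_J^2$ above can be sketched for a kernel of degree 2 as follows. This is only an illustrative brute-force sketch, not code from the paper: the function names `u_statistic` and `jackknife_variance`, the variance kernel `var_kernel` and the sample values are assumptions made for the example.

```python
from itertools import combinations

def u_statistic(xs, h):
    """Degree-2 U-statistic: average of h over all pairs i < j."""
    pairs = list(combinations(xs, 2))
    return sum(h(x, y) for x, y in pairs) / len(pairs)

def jackknife_variance(xs, h):
    """Jackknife estimate (n-1)/n * sum_i (U_n^(i) - U_n)^2 of Var(U_n)."""
    n = len(xs)
    u_full = u_statistic(xs, h)
    # U_n^(i): the U-statistic recomputed with the i-th observation left out
    leave_one_out = [u_statistic(xs[:i] + xs[i + 1:], h) for i in range(n)]
    return (n - 1) / n * sum((u - u_full) ** 2 for u in leave_one_out)

def var_kernel(x, y):
    """Hypothetical example kernel h(x, y) = (x - y)^2 / 2 (sample variance)."""
    return (x - y) ** 2 / 2

sample = [1.0, 2.0, 4.0, 7.0, 11.0]   # made-up data
u_n = u_statistic(sample, var_kernel)
sigma2_j = jackknife_variance(sample, var_kernel)
```

With this kernel `u_n` is the usual unbiased sample variance of `sample`, and `sigma2_j` estimates the variance of that statistic. The cost is O(n^3) kernel evaluations, which is acceptable only for small samples.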

In the small sample case, using computer simulation, Schucany and Bankson (1989) discuss biases and mean square errors of Sen's (1960) estimator, the jackknife estimator and an unbiased estimator which is constituted from unbiased estimators of each term of the variance expression. It is easy to see that all the above estimators have first order consistency, which means that the normalized estimators converge to the dominant term $r^2\xi_1^2$ of the variance. Shirahata and Sakamoto (1992) have compared several estimators (unbiased estimator, jackknife estimator, bias modified estimator, and iterated bootstrap and bootstrap estimators) by computer simulations. They have also discussed exact representations of the estimators and reduction of the order of summands to compute the variance estimators.

Using the asymptotic representation of the jackknife variance estimator with the residual term $o_p(n^{-1/2})$, Maesono (1995b) has obtained an Edgeworth expansion with remainder term $o(n^{-1/2})$ for the studentized U-statistic. Obtaining the asymptotic representation of the variance estimator with residual term $o_p(n^{-1})$, where $P\{|o_p(n^{-1})| \ge n^{-1}(\log n)^{-1}\} = o(n^{-1})$, Maesono (1996a) has investigated the Edgeworth expansion of the studentized U-statistic with remainder term $o(n^{-1})$. He has also proved the Edgeworth expansion with remainder term $o(n^{-1/2})$ for the jackknife variance estimator $\hat\sigma_J^2$. Further, Maesono (1996b) has discussed the expansion for a linear combination of U-statistics.

In this paper we will study the variance estimators more precisely and obtain asymptotic representations of the normalized estimators with residual terms $n^{-1/2}o_p(n^{-1})$. We show that the unbiased estimator of Schucany and Bankson (1989) coincides with Hinkley's (1978) corrected jackknife estimator up to the order $n^{-1/2}o_p(n^{-1})$. Using the asymptotic representations we obtain asymptotic mean square errors of the variance estimators. We also propose a new variance estimator and obtain its mean square error.
We establish Edgeworth expansions of those variance estimators with remainder term $o(n^{-1})$. In Section 2, we will review the variance estimators and propose the new estimator. In Section 3, we will obtain the asymptotic representations of the estimators and discuss the asymptotic mean square errors. The Edgeworth expansions of them are established in Section 4. Hereafter, for the sake of simplicity, we will consider the kernel of degree 2. The generalization to a kernel of arbitrary degree can be obtained with notational complications and tedious calculations.

2. Variance estimators

At first we will obtain the H-decomposition or the ANOVA-decomposition for the U-statistic. Under the assumption that $E|h(X_1,X_2)|^2 < \infty$, let us define
\[ g_1(x) = E[h(x,X_2)] - \theta, \qquad g_2(x,y) = h(x,y) - \theta - g_1(x) - g_1(y), \]
\[ A_1 = \sum_{i=1}^{n} g_1(X_i) \quad\text{and}\quad A_2 = \sum_{C_{n,2}} g_2(X_i,X_j). \]
Then we have
\[ U_n - \theta = \frac{2}{n}A_1 + \frac{2}{n(n-1)}A_2. \]
Note that $E[g_2(X_1,X_2)\mid X_1] = 0$ a.s.
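The H-decomposition above is an algebraic identity, so it can be checked numerically on any data set once $\theta$ and $g_1$ are known. The sketch below does this for the variance kernel $h(x,y)=(x-y)^2/2$ under the assumption of a zero-mean population with variance 1 (so $\theta = 1$ and $g_1(x) = (x^2-1)/2$). The data values and the names `SIGMA2`, `a1`, `a2`, `rhs` are hypothetical.

```python
from itertools import combinations

SIGMA2 = 1.0  # assumed population variance; the population mean is assumed to be 0

def h(x, y):
    return (x - y) ** 2 / 2

def g1(x):
    # g1(x) = E[h(x, X)] - theta = (x^2 - sigma^2) / 2 under the assumptions above
    return (x * x - SIGMA2) / 2

def g2(x, y):
    # g2(x, y) = h(x, y) - theta - g1(x) - g1(y), which equals -x * y here
    return h(x, y) - SIGMA2 - g1(x) - g1(y)

xs = [0.3, -1.2, 0.8, 2.1, -0.5, -0.9]   # made-up observations
n = len(xs)
pairs = list(combinations(xs, 2))

u_n = sum(h(x, y) for x, y in pairs) / len(pairs)
a1 = sum(g1(x) for x in xs)              # A_1
a2 = sum(g2(x, y) for x, y in pairs)     # A_2
rhs = 2 / n * a1 + 2 / (n * (n - 1)) * a2
```

Here `u_n - SIGMA2` and `rhs` agree up to floating-point rounding, whatever values are placed in `xs`.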

Then if one of $\{i_1,i_2\}$ is not contained in $\{j_1,\dots,j_m\}$, for any $m$-variate function $\nu$ which satisfies $E|\nu g_2| < \infty$, we get
\[ E[g_2(X_{i_1},X_{i_2})\,\nu(X_{j_1},\dots,X_{j_m})] = 0. \tag{2.1} \]
Using this equation we have the variance $\sigma_n^2$ of $U_n$,
\[ \sigma_n^2 = \frac{4}{n}\xi_1^2 + \frac{2}{n(n-1)}\xi_2^2 \]
where $\xi_1^2 = E[g_1^2(X_1)]$ and $\xi_2^2 = E[g_2^2(X_1,X_2)]$.

Since we discuss the asymptotic properties, we will study the estimation of $n\sigma_n^2$. Then we consider the jackknife variance estimator $V_J = n\hat\sigma_J^2$. From the viewpoint of estimation for $4\xi_1^2$, Sen (1960, 1977) has proposed the variance estimator
\[ V_S = \frac{4}{n-1}\sum_{i=1}^{n}(S_i - U_n)^2 \quad\text{where}\quad S_i = \frac{1}{n-1}\sum_{j=1,\,j\neq i}^{n} h(X_i,X_j). \]
Sen (1977) also showed that
\[ V_S = \frac{(n-2)^2}{(n-1)^2}\,V_J. \tag{2.2} \]
As pointed out by Efron (1987, p.200), changing coefficients of the estimators will have significantly different effects on the small sample performance of the estimators. Since $V_J$ has positive bias and $(n-2)^2/(n-1)^2 = 1 - 2/n + O(n^{-2})$, we consider the new variance estimator $V_\alpha$ given by
\[ V_\alpha = \Bigl(1 - \frac{\alpha}{n}\Bigr)V_J \quad\text{for } \alpha \ge 0. \]
Note that $V_2$ and $V_S$ are asymptotically equivalent and $V_0 = V_J$. If we choose $\alpha$ properly, we can reduce the bias and the mean square error, which we will discuss in Section 3.

Hinkley (1978) has discussed the bias correction of $V_J$. Let us define
\[ Q_{i,j} = nU_n - (n-1)\bigl(U_n^{(i)} + U_n^{(j)}\bigr) + (n-2)U_n^{(i,j)} \]
where $U_n^{(i,j)}$ denotes the value of $U_n$ when $X_i$ and $X_j$ are deleted from the sample. Then the bias corrected jackknife estimator is given by
\[ V_C = V_J - \frac{1}{n+1}\sum_{C_{n,2}}(Q_{i,j} - \bar Q)^2 \]
where $\bar Q = 2\sum_{C_{n,2}} Q_{i,j}/[n(n-1)]$.

Schucany and Bankson (1989) proposed the unbiased estimator of $n\sigma_n^2$, which is constituted from unbiased estimators of each term of the variance expression. Another variance expression of $n\sigma_n^2$ is
\[ n\sigma_n^2 = \frac{4(n-2)}{n-1}\,a_1 + \frac{2}{n-1}\,a_2 \tag{2.3} \]

where
\[ a_1 = E[h(X_1,X_2)h(X_1,X_3)] - \theta^2 \quad\text{and}\quad a_2 = E[h^2(X_1,X_2)] - \theta^2. \]
Let us define
\[ \zeta_0(x_1,x_2,x_3,x_4) = \tfrac13\{h(x_1,x_2)h(x_3,x_4) + h(x_1,x_3)h(x_2,x_4) + h(x_1,x_4)h(x_2,x_3)\}, \]
\[ \zeta_1(x_1,x_2,x_3) = \tfrac13\{h(x_1,x_2)h(x_1,x_3) + h(x_1,x_2)h(x_2,x_3) + h(x_1,x_3)h(x_2,x_3)\} \]
and
\[ \zeta_2(x_1,x_2) = h^2(x_1,x_2). \]
The unbiased estimators of $\theta^2$, $E[h(X_1,X_2)h(X_1,X_3)]$ and $E[h^2(X_1,X_2)]$ are given by
\[ \hat\theta^2 = \binom{n}{4}^{-1}\sum_{C_{n,4}}\zeta_0(X_{i_1},\dots,X_{i_4}), \quad \hat\lambda_1 = \binom{n}{3}^{-1}\sum_{C_{n,3}}\zeta_1(X_{i_1},X_{i_2},X_{i_3}) \quad\text{and}\quad \hat\lambda_2 = \binom{n}{2}^{-1}\sum_{C_{n,2}}\zeta_2(X_{i_1},X_{i_2}) \]
respectively. Substituting $\hat a_k = \hat\lambda_k - \hat\theta^2$ for $a_k$ in the equation (2.3), we obtain the unbiased estimator $V_U$ of $n\sigma_n^2$ as
\[ V_U = \frac{4(n-2)}{n-1}\,\hat a_1 + \frac{2}{n-1}\,\hat a_2. \]
Schucany and Bankson (1989) compared the estimators $V_J$, $V_S$ and $V_U$ by simulation in small samples, $n = 10$. We can see that all these estimators converge to $4\xi_1^2$ almost surely. We will study the asymptotic properties of the estimators more precisely.

3. Asymptotic representations and mean square errors

Maesono (1995a) has obtained the asymptotic representations of the variance estimators $V_J$, $V_S$, $V_C$ and $V_U$ with residual terms $o_p(n^{-1})$. Here we will consider the asymptotic representations more precisely. Let us define
\[ \delta(x) = E[g_2^2(x,X_2)] - \xi_2^2, \]
\[ f_1(x) = g_1^2(x) - \xi_1^2 + 2E[g_1(X_2)g_2(x,X_2)], \]
\[ f_2(x,y) = -g_1(x)g_1(y) + g_2(x,y)\{g_1(x)+g_1(y)\} + E[g_2(x,X_3)g_2(y,X_3) - g_2(x,X_3)g_1(X_3) - g_2(y,X_3)g_1(X_3)], \]
\[ f_3(x,y,z) = g_2(x,y)g_2(x,z) + g_2(x,y)g_2(y,z) + g_2(x,z)g_2(y,z) - E[g_2(x,X_3)g_2(y,X_3) + g_2(y,X_3)g_2(z,X_3) + g_2(x,X_3)g_2(z,X_3)] - 2\{g_1(x)g_2(y,z) + g_1(y)g_2(x,z) + g_1(z)g_2(x,y)\} \]

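and
\[ V_n = \frac{4}{n}\sum_{i=1}^{n} f_1(X_i) + \frac{8}{n(n-1)}\sum_{C_{n,2}} f_2(X_i,X_j) + \frac{8}{n(n-1)(n-2)}\sum_{C_{n,3}} f_3(X_i,X_j,X_k). \]
Note that $V_n$ is already in decomposed form. For the variance estimators, we have the following representations.

Theorem 1. If $E|h(X_1,X_2)|^{4+\varepsilon} < \infty$ for some $\varepsilon > 0$, we have
\[ V_J = V_n + \frac{8}{n^2}\sum_{i=1}^{n}\delta(X_i) + n\sigma_n^2 + \frac{b_J}{n} + R_{1;n}, \tag{3.1} \]
\[ V_S = V_n + \frac{8}{n^2}\sum_{i=1}^{n}\{\delta(X_i) - f_1(X_i)\} + n\sigma_n^2 + \frac{b_S}{n} + R_{2;n}, \tag{3.2} \]
\[ V_\alpha = V_n + \frac{4}{n^2}\sum_{i=1}^{n}\{2\delta(X_i) - \alpha f_1(X_i)\} + n\sigma_n^2 + \frac{b_\alpha}{n} + R_{3;n}, \tag{3.3} \]
\[ V_C = V_n + \frac{4}{n^2}\sum_{i=1}^{n}\delta(X_i) + n\sigma_n^2 + R_{4;n} \tag{3.4} \]
and
\[ V_U = V_n + \frac{4}{n^2}\sum_{i=1}^{n}\delta(X_i) + n\sigma_n^2 + R_{5;n} \tag{3.5} \]
where
\[ b_J = 2\xi_2^2, \qquad b_S = 2\xi_2^2 - 8\xi_1^2, \qquad b_\alpha = 2\xi_2^2 - 4\alpha\xi_1^2 \]
and
\[ E|R_{k;n}|^{2+\varepsilon/2} = O(n^{-4-\varepsilon}) \quad (k = 1,\dots,5). \tag{3.6} \]

Proof. See Appendix.

$b_J$, $b_S$ and $b_\alpha$ are the $n^{-1}$ biases of the jackknife, Sen's estimator and the new estimator respectively. Since $R_{k;n} = n^{-1/2}o_p(n^{-1})$, the unbiased estimator $V_U$ coincides with Hinkley's (1978) corrected jackknife estimator $V_C$ up to the order $n^{-1/2}o_p(n^{-1})$. It is easy to see that
\[ E[f_2(X_1,X_2)\mid X_1] = E[f_3(X_1,X_2,X_3)\mid X_1,X_2] = 0 \ \text{a.s.} \quad\text{and}\quad E[f_1(X_1)] = E[\delta(X_1)] = 0. \tag{3.7} \]
Using the asymptotic representations of Theorem 1, we can study the asymptotic properties of the variance estimators. Here we will obtain asymptotic mean square errors of $V_J$, $V_S$, $V_\alpha$, $V_C$ and $V_U$ up to the order $n^{-2}$. Let us define
\[ mse(V_J) = \frac{16}{n}E[f_1^2(X_1)] + \frac{1}{n^2}\{b_J^2 + 64E[f_1(X_1)\delta(X_1)] + 32E[f_2^2(X_1,X_2)]\}, \]
\[ mse(V_S) = \frac{16}{n}E[f_1^2(X_1)] + \frac{1}{n^2}\{b_S^2 + 64E[f_1(X_1)(\delta(X_1) - f_1(X_1))] + 32E[f_2^2(X_1,X_2)]\}, \]
\[ mse(V_\alpha) = \frac{16}{n}E[f_1^2(X_1)] + \frac{1}{n^2}\{b_\alpha^2 + 32E[f_1(X_1)(2\delta(X_1) - \alpha f_1(X_1))] + 32E[f_2^2(X_1,X_2)]\} \]
and
\[ mse(V_C) = \frac{16}{n}E[f_1^2(X_1)] + \frac{1}{n^2}\{32E[f_1(X_1)\delta(X_1)] + 32E[f_2^2(X_1,X_2)]\}. \]
Note that $mse(V_J) = mse(V_0)$ and $mse(V_S) = mse(V_2)$. We have the following theorem.

Theorem 2. If $E|h(X_1,X_2)|^{4+\varepsilon} < \infty$ for some $\varepsilon > 0$, we have
\[ E(V_J - n\sigma_n^2)^2 = mse(V_J) + O(n^{-5/2}), \qquad E(V_S - n\sigma_n^2)^2 = mse(V_S) + O(n^{-5/2}), \]
\[ E(V_\alpha - n\sigma_n^2)^2 = mse(V_\alpha) + O(n^{-5/2}), \qquad E(V_C - n\sigma_n^2)^2 = mse(V_C) + O(n^{-5/2}) \]
and
\[ E(V_U - n\sigma_n^2)^2 = mse(V_C) + O(n^{-5/2}). \]

Proof. It follows from (3.6), (A.2) in Lemma 1 (see Appendix) and Hölder's inequality that under the moment condition, for $1 \le k \le 5$,
\[ E\Bigl|\frac{1}{n}R_{k;n}\sum_{i=1}^{n} f_1(X_i)\Bigr| \le \Bigl\{E\Bigl|\frac{1}{n}\sum_{i=1}^{n} f_1(X_i)\Bigr|^{\frac{4+\varepsilon}{2+\varepsilon}}\Bigr\}^{\frac{2+\varepsilon}{4+\varepsilon}} \bigl\{E|R_{k;n}|^{2+\varepsilon/2}\bigr\}^{\frac{2}{4+\varepsilon}} = O(n^{-5/2}), \]
\[ E\Bigl|\frac{1}{n^2}R_{k;n}\sum_{i=1}^{n}\delta(X_i)\Bigr| \le \Bigl\{E\Bigl|\frac{1}{n^2}\sum_{i=1}^{n}\delta(X_i)\Bigr|^{\frac{4+\varepsilon}{2+\varepsilon}}\Bigr\}^{\frac{2+\varepsilon}{4+\varepsilon}} \bigl\{E|R_{k;n}|^{2+\varepsilon/2}\bigr\}^{\frac{2}{4+\varepsilon}} = O(n^{-3}), \]
\[ E\Bigl|\frac{1}{n(n-1)}R_{k;n}\sum_{C_{n,2}} f_2(X_i,X_j)\Bigr| \le \Bigl\{E\Bigl|\frac{1}{n(n-1)}\sum_{C_{n,2}} f_2(X_i,X_j)\Bigr|^{\frac{4+\varepsilon}{2+\varepsilon}}\Bigr\}^{\frac{2+\varepsilon}{4+\varepsilon}} \bigl\{E|R_{k;n}|^{2+\varepsilon/2}\bigr\}^{\frac{2}{4+\varepsilon}} = O(n^{-3}) \]
and
\[ E|R_{k;n}|^2 \le \bigl\{E|R_{k;n}|^{2+\varepsilon/2}\bigr\}^{\frac{4}{4+\varepsilon}} = O(n^{-4}). \]
Thus, using these equations and (3.7), we can obtain the equalities.

Remark 1. It is possible to improve the equations to remainder terms of the order $O(n^{-3})$. But this needs more calculation, so we leave the equations as they are.

Let us define
$e_1 = E[g_1^4(X_1)]$,
$e_2 = E[g_1^2(X_1)g_2^2(X_1,X_2)]$,
$e_3 = E[g_1^2(X_1)g_1(X_2)g_2(X_1,X_2)]$,
$e_4 = E[g_1(X_1)g_1(X_2)g_2^2(X_1,X_2)]$,
$e_5 = E[g_1(X_1)g_1(X_2)g_2(X_1,X_3)g_2(X_2,X_3)]$,
$e_6 = E[g_1(X_1)g_2(X_1,X_2)g_2(X_1,X_3)g_2(X_2,X_3)]$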

and
$e_7 = E[g_2(X_1,X_3)g_2(X_2,X_3)g_2(X_1,X_4)g_2(X_2,X_4)]$.

Then, using the equation (2.1), it follows from direct computations that
\[ E[f_1^2(X_1)] = e_1 - \xi_1^4 + 4e_3 + 4e_5, \]
\[ E[f_1(X_1)\delta(X_1)] = e_2 - \xi_1^2\xi_2^2 + 2e_3 \]
and
\[ E[f_2^2(X_1,X_2)] = \xi_1^4 + 2e_2 - 4e_3 + 2e_4 - 4e_5 + 4e_6 + e_7. \]
Here we will study the asymptotic mean square errors for the variance and the covariance estimation problems. Also, the asymptotic mean square errors for Wilcoxon's signed rank test will be discussed.

Example 1. Variance estimation.
Let us consider the kernel $h(x,y) = (x-y)^2/2$. Then if $\mathrm{Var}(X_1) = \sigma^2$, the U-statistic
\[ U_n = \binom{n}{2}^{-1}\sum_{C_{n,2}} h(X_i,X_j) \]
is an unbiased estimator of $\sigma^2$. It is easy to see that
\[ \theta = \sigma^2, \qquad g_1(x) = \tfrac12(x^2 - \sigma^2) \quad\text{and}\quad g_2(x,y) = -xy. \]
For the sake of simplicity, we will consider the case that the distribution $F(x)$ is symmetric about the origin. Let us define $m_k = E[X_1^k]$. Then because of the symmetry of $F$, $m_k = 0$ if $k$ is an odd number. Using this fact, from direct computations, we can show that
\[ \xi_1^2 = \tfrac14(m_4 - \sigma^4), \quad \xi_2^2 = \sigma^4, \quad e_1 = \tfrac1{16}(m_8 - 4\sigma^2 m_6 + 6\sigma^4 m_4 - 3\sigma^8), \quad e_2 = \tfrac{\sigma^2}{4}(m_6 - 2\sigma^2 m_4 + \sigma^6), \]
\[ e_4 = \tfrac14(m_4 - \sigma^4)^2, \quad e_6 = -\tfrac{\sigma^4}{2}(m_4 - \sigma^4), \quad e_7 = \sigma^8 \quad\text{and}\quad e_3 = e_5 = 0. \]

(Normal distribution:) If the underlying distribution is normal, that is $X_i \sim N(0,\sigma^2)$, we can show that
\[ b_J = 2\sigma^4, \qquad b_S = -2\sigma^4, \qquad b_\alpha = 2\sigma^4 - 2\alpha\sigma^4, \]
\[ mse(V_J) = \sigma^8\Bigl\{\frac{56}{n} + \frac{268}{n^2}\Bigr\}, \qquad mse(V_S) = \sigma^8\Bigl\{\frac{56}{n} + \frac{44}{n^2}\Bigr\}, \]
\[ mse(V_\alpha) = \sigma^8\Bigl\{\frac{56}{n} + \frac{4(\alpha^2 - 30\alpha + 67)}{n^2}\Bigr\} \quad\text{and}\quad mse(V_C) = \sigma^8\Bigl\{\frac{56}{n} + \frac{200}{n^2}\Bigr\}. \]
In the case of $\sigma^2 = 1$ and $n = 10$, Schucany and Bankson (1989) discussed the mean square errors of $V_J/n$, $V_S/n$ and $V_C/n$ by simulation. The corresponding asymptotic mean square errors are given by
\[ \frac{mse(V_J)}{10^2} = 0.0828, \qquad \frac{mse(V_S)}{10^2} = 0.0604 \quad\text{and}\quad \frac{mse(V_C)}{10^2} = 0.0760. \]
Their estimated mean square errors are close to these values.
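The normal-distribution figures of Example 1 can be reproduced with exact rational arithmetic from the moment formulas for $e_1,\dots,e_7$ and the definitions of the asymptotic mean square errors. The following is a sketch under the assumption $\sigma^2 = 1$ (so $m_4 = 3$, $m_6 = 15$, $m_8 = 105$). The helper names `mse_alpha` and `mse_corrected` are made up for this illustration, and the $e_3$ term of $E[f_1\delta]$ is omitted because $e_3 = 0$ here.

```python
from fractions import Fraction as F

# Even moments of N(0, 1); sigma^2 = 1 is an assumption of this sketch
m4, m6, m8 = 3, 15, 105

xi1 = F(m4 - 1, 4)                       # xi_1^2
xi2 = F(1)                               # xi_2^2
e1 = F(m8 - 4 * m6 + 6 * m4 - 3, 16)
e2 = F(m6 - 2 * m4 + 1, 4)
e3 = e5 = F(0)
e4 = F((m4 - 1) ** 2, 4)
e6 = -F(m4 - 1, 2)
e7 = F(1)

ef1sq = e1 - xi1 ** 2 + 4 * e3 + 4 * e5          # E[f_1^2]
ef1delta = e2 - xi1 * xi2                        # E[f_1 delta]; e_3 = 0 in this example
ef2sq = xi1 ** 2 + 2 * e2 - 4 * e3 + 2 * e4 - 4 * e5 + 4 * e6 + e7   # E[f_2^2]

def mse_alpha(alpha, n):
    """Asymptotic mse of V_alpha; alpha = 0 gives V_J, alpha = 2 matches V_S."""
    b = 2 * xi2 - 4 * alpha * xi1
    second = b ** 2 + 32 * (2 * ef1delta - alpha * ef1sq) + 32 * ef2sq
    return 16 * ef1sq / n + second / n ** 2

def mse_corrected(n):
    """Asymptotic mse of V_C (and of V_U)."""
    return 16 * ef1sq / n + (32 * ef1delta + 32 * ef2sq) / n ** 2

n = 10
print(float(mse_alpha(0, n) / n ** 2))   # 0.0828
print(float(mse_alpha(2, n) / n ** 2))   # 0.0604
print(float(mse_corrected(n) / n ** 2))  # 0.076
```

Using `Fraction` keeps the coefficients 56, 268, 44 and 200 exact, so the comparison with the closed-form expressions does not depend on rounding.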

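(Logistic distribution:) We consider the logistic distribution which has the density function
\[ \frac{\pi e^{-\pi x/(\sqrt3\sigma)}}{\sqrt3\,\sigma\,\bigl(1 + e^{-\pi x/(\sqrt3\sigma)}\bigr)^2}. \]
In this case we have that $\mathrm{Var}(X_1) = \sigma^2$, and
\[ b_J = 2\sigma^4, \qquad b_S = -\tfrac{22}{5}\sigma^4, \qquad b_\alpha = 2\sigma^4 - \tfrac{16\alpha}{5}\sigma^4, \]
\[ mse(V_J) = \sigma^8\Bigl\{\frac{538.33}{n} + \frac{1002.95}{n^2}\Bigr\}, \qquad mse(V_S) = \sigma^8\Bigl\{\frac{538.33}{n} - \frac{1135.02}{n^2}\Bigr\}, \]
\[ mse(V_\alpha) = \sigma^8\Bigl\{\frac{538.33}{n} + \frac{1}{n^2}(10.24\alpha^2 - 1089.46\alpha + 1002.95)\Bigr\} \quad\text{and}\quad mse(V_U) = \sigma^8\Bigl\{\frac{538.33}{n} + \frac{764.89}{n^2}\Bigr\}. \]

Example 2. Covariance estimation.
Let $\{X_i\}_{i\ge1}$ be two dimensional random vectors. Putting $X_i = (Y_i, Z_i)$, we denote
\[ \mathrm{Var}(X_1) = \mathrm{Var}\{(Y_1,Z_1)\} = \begin{pmatrix} \sigma_y^2 & \rho\sigma_y\sigma_z \\ \rho\sigma_y\sigma_z & \sigma_z^2 \end{pmatrix}. \]
Let us consider a symmetric kernel $h(x_1,x_2) = (y_1 - y_2)(z_1 - z_2)/2$. The corresponding U-statistic is an unbiased estimator of $\rho\sigma_y\sigma_z = \mathrm{Cov}(Y_1,Z_1)$. It is easy to see that
\[ \theta = \rho\sigma_y\sigma_z, \qquad g_1(x_1) = \tfrac12(y_1z_1 - \rho\sigma_y\sigma_z) \quad\text{and}\quad g_2(x_1,x_2) = -\tfrac12(y_1z_2 + z_1y_2). \]
Further we assume that $X_i$ has the bivariate normal distribution
\[ X_i = (Y_i,Z_i) \sim N\left(\mu, \begin{pmatrix} \sigma_y^2 & \rho\sigma_y\sigma_z \\ \rho\sigma_y\sigma_z & \sigma_z^2 \end{pmatrix}\right). \]
From direct computations we can get
\[ \xi_1^2 = \frac{1+\rho^2}{4}\sigma_y^2\sigma_z^2, \qquad \xi_2^2 = \frac{1+\rho^2}{2}\sigma_y^2\sigma_z^2, \qquad e_1 = \tfrac3{16}(3\rho^4 + 14\rho^2 + 3)\sigma_y^4\sigma_z^4, \]
\[ e_2 = \tfrac18(3\rho^4 + 14\rho^2 + 3)\sigma_y^4\sigma_z^4, \quad e_3 = 0, \quad e_4 = \tfrac18(\rho^4 + 6\rho^2 + 1)\sigma_y^4\sigma_z^4, \quad e_5 = 0, \]
\[ e_6 = -\tfrac18(\rho^4 + 6\rho^2 + 1)\sigma_y^4\sigma_z^4 \quad\text{and}\quad e_7 = \tfrac18(\rho^4 + 6\rho^2 + 1)\sigma_y^4\sigma_z^4. \]
Thus we have
\[ b_J = \sigma_y^2\sigma_z^2(1+\rho^2), \qquad b_S = -\sigma_y^2\sigma_z^2(1+\rho^2), \qquad b_\alpha = (1-\alpha)(1+\rho^2)\sigma_y^2\sigma_z^2, \]
\[ mse(V_J) = \sigma_y^4\sigma_z^4\Bigl\{\frac{8}{n}(\rho^4 + 5\rho^2 + 1) + \frac{1}{n^2}(39\rho^4 + 190\rho^2 + 39)\Bigr\}, \]
\[ mse(V_S) = \sigma_y^4\sigma_z^4\Bigl\{\frac{8}{n}(\rho^4 + 5\rho^2 + 1) + \frac{1}{n^2}(7\rho^4 + 30\rho^2 + 7)\Bigr\}, \]
\[ mse(V_\alpha) = \sigma_y^4\sigma_z^4\Bigl\{\frac{8}{n}(\rho^4 + 5\rho^2 + 1) + \frac{1}{n^2}\bigl[\alpha^2(\rho^2+1)^2 - 6\alpha(3\rho^4 + 14\rho^2 + 3) + 39\rho^4 + 190\rho^2 + 39\bigr]\Bigr\} \]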

and
\[ mse(V_C) = \sigma_y^4\sigma_z^4\Bigl\{\frac{8}{n}(\rho^4 + 5\rho^2 + 1) + \frac{1}{n^2}(30\rho^4 + 140\rho^2 + 30)\Bigr\}. \]

Remark 2. In the cases of the above two examples, $mse(V_S) < mse(V_C) < mse(V_J)$. As discussed in Schucany and Bankson (1989), though Sen's estimator $V_S$ has small mean square error, it has substantial negative bias. $V_C$ and $V_U$ are asymptotically unbiased and have smaller mean square error than $V_J$. But $V_C$ and $V_U$ sometimes take negative values in the small sample case. Schucany and Bankson (1989) also pointed out by simulation that from the viewpoint of Pitman closeness $V_J$ is closer to $n\sigma_n^2$ than $V_U$. If we take $\alpha = 1$ in $V_\alpha$, both the bias and the mean square error are relatively small. In particular, in the case of the normal distribution, the $n^{-1}$ bias $b_1$ of $V_1$ is 0. Note that $V_1$ is asymptotically equivalent to $(V_J + V_S)/2$ and always takes a non-negative value.

Example 3. Wilcoxon's signed rank test.
In order to compare the mean square errors of the variance estimators, let us discuss the variance estimation of Wilcoxon's signed rank statistic. Let $X_1,\dots,X_n$ be a random sample from the distribution $F(x-\eta)$, where $F(x)$ satisfies $F(-x) = 1 - F(x)$ for any $x$, so that the distribution $F(x)$ is symmetric about the origin. Wilcoxon's signed rank statistic is very popular for testing or estimating $\eta$. For the sake of simplicity, we consider the following statistic
\[ M_n = \binom{n}{2}^{-1}\sum_{C_{n,2}}\Psi(X_i + X_j) \]
where $\Psi(x) = 1$ if $x \ge 0$ and $\Psi(x) = 0$ if $x < 0$. $M_n$ is asymptotically equivalent to Wilcoxon's statistic. Let us assume $\eta = 0$ and that $F(x)$ has a density function. From direct computation, we can show that
\[ \theta = \tfrac12, \qquad n\sigma_n^2 = \frac{2n-1}{6(n-1)}, \qquad g_1(x) = F(x) - \tfrac12, \qquad \xi_1^2 = \tfrac1{12}, \qquad \xi_2^2 = \tfrac1{12}, \]
\[ e_1 = \tfrac1{80}, \quad e_2 = \tfrac1{144}, \quad e_3 = -\tfrac1{360}, \quad e_4 = -\tfrac1{360}, \quad e_5 = \tfrac1{720}, \quad e_6 = -\tfrac1{1440} \quad\text{and}\quad e_7 = \tfrac1{720}. \]
Thus we have
\[ b_J = \tfrac16, \qquad b_S = -\tfrac12, \qquad b_\alpha = \tfrac16 - \tfrac{\alpha}{3}, \]
\[ mse(V_J) = \frac{53}{180n^2}, \qquad mse(V_S) = \frac{31}{60n^2}, \qquad mse(V_\alpha) = \frac{1}{n^2}\Bigl[\frac19\Bigl(\alpha - \frac12\Bigr)^2 + \frac4{15}\Bigr] \quad\text{and}\quad mse(V_C) = \frac{4}{15n^2}. \]
It follows from the above calculation that $mse(V_C) < mse(V_J) < mse(V_S)$. When $\alpha = 1/2$, $mse(V_{1/2})$ takes the minimum value $4/(15n^2)$ and $mse(V_{1/2}) < mse(V_J)$.

Remark 3. Example 1 and Example 2 give us the same conclusion, but it is different in the case of Wilcoxon's statistic. So it is advisable to check the mean square errors of the variance estimators using Theorem 2 in each case.
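Since the ranking of the estimators depends on the kernel, it can be useful to compute $V_J$, $V_S$ and $V_\alpha$ side by side on data. The following sketch does this by brute force for the kernel $\Psi(x+y)$ of Example 3 and confirms Sen's identity (2.2) numerically. The function names, the default $\alpha = 1$ and the sample values are assumptions made for illustration only.

```python
from itertools import combinations

def psi_kernel(x, y):
    """Kernel Psi(x + y) of Example 3: 1 if x + y >= 0, else 0."""
    return 1.0 if x + y >= 0 else 0.0

def u_stat(xs, h):
    """Degree-2 U-statistic over all pairs i < j."""
    pairs = list(combinations(xs, 2))
    return sum(h(a, b) for a, b in pairs) / len(pairs)

def variance_estimators(xs, h, alpha=1.0):
    """Return (V_J, V_S, V_alpha) as defined in Section 2, by direct summation."""
    n = len(xs)
    u_n = u_stat(xs, h)
    # V_J = n * sigma_J^2 = (n - 1) * sum_i (U_n^(i) - U_n)^2
    v_j = (n - 1) * sum(
        (u_stat(xs[:i] + xs[i + 1:], h) - u_n) ** 2 for i in range(n)
    )
    # Sen's estimator: V_S = 4 / (n - 1) * sum_i (S_i - U_n)^2
    s = [sum(h(xs[i], xs[j]) for j in range(n) if j != i) / (n - 1)
         for i in range(n)]
    v_s = 4 / (n - 1) * sum((s_i - u_n) ** 2 for s_i in s)
    # New estimator: V_alpha = (1 - alpha / n) * V_J
    v_alpha = (1 - alpha / n) * v_j
    return v_j, v_s, v_alpha

sample = [-1.3, 0.4, 2.2, -0.7, 1.5, 0.9, -2.0, 0.1]   # made-up data
n = len(sample)
v_j, v_s, v_1 = variance_estimators(sample, psi_kernel, alpha=1.0)
ratio = ((n - 2) / (n - 1)) ** 2   # factor in identity (2.2)
```

For any sample, `v_s` equals `ratio * v_j` up to rounding, and `v_1` lies between `v_s` and `v_j`, which is the small-sample counterpart of the remark that $V_1$ behaves like $(V_J + V_S)/2$.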

4. Edgeworth expansions

From Theorem 1, we can regard the variance estimators as a sum of U-statistics and an $n^{-1}$ term. For asymptotic U-statistics, Lai and Wang (1993) have established the Edgeworth expansion with remainder term $o(n^{-1})$. Applying their result, we can get Edgeworth expansions of the variance estimators. Let us assume the following conditions.

($C_1$) $E|h(X_1,X_2)|^8 < \infty$.

($C_2$) $\limsup_{|t|\to\infty}|E[\exp\{itf_1(X_1)\}]| < 1$.

($C_3$) $E|f_2(X_1,X_2)|^s < \infty$ ($s > 0$) and there exist $K$ Borel functions $\psi_\nu: R \to R$ such that $E[\psi_\nu^2(X_1)] < \infty$ ($\nu = 1,\dots,K$), $K(s-2) > 4s + (28s-40)I_{\{E|f_3(X_1,X_2,X_3)|>0\}}$, and the covariance matrix of $(W_1,\dots,W_K)$ is positive definite, where $W_\nu = (L\psi_\nu)(X_1)$ and $(L\psi_\nu)(y) = E[f_2(y,X_2)\psi_\nu(X_2)]$, and $I_{\{\cdot\}}$ is an indicator function.

The condition $C_3$ is concerned with the number of nonzero eigenfunctions of $f_2(x,y)$. Alternatively, Lai and Wang (1993) have proved the validity of the Edgeworth expansion under the following condition ($\tilde C_3$).

($\tilde C_3$) There exist constants $c_\nu$ and Borel functions $w_\nu: R \to R$ such that $E[w_\nu(X_1)] = 0$, $E|w_\nu(X_1)|^s < \infty$ for some $s \ge 5$ and $f_2(X_1,X_2) = \sum_{\nu=1}^{K} c_\nu w_\nu(X_1)w_\nu(X_2)$ a.s.; moreover, for some $0 < \gamma < \min\{1, 2(1 - 11/(3s))\}$,
\[ \limsup_{|t|\to\infty}\ \sup_{|u_1|+\cdots+|u_K| \le |t|^{-\gamma}} \Bigl|E\Bigl[\exp\Bigl(it\Bigl\{f_1(X_1) + \sum_{\nu=1}^{K}u_\nu w_\nu(X_1)\Bigr\}\Bigr)\Bigr]\Bigr| < 1. \]

Let us define
$\tau^2 = E[f_1^2(X_1)]$, $d_1 = E[f_2^2(X_1,X_2)]$, $d_2 = E[f_1(X_1)\delta(X_1)]$, $d_3 = E[f_1^3(X_1)]$,
$d_4 = E[f_1(X_1)f_1(X_2)f_2(X_1,X_2)]$, $d_5 = E[f_1^4(X_1)]$, $d_6 = E[f_1^2(X_1)f_1(X_2)f_2(X_1,X_2)]$,
$d_7 = E[f_1(X_1)f_1(X_2)f_2(X_1,X_3)f_2(X_2,X_3)]$, $d_8 = E[f_1(X_1)f_1(X_2)f_1(X_3)f_3(X_1,X_2,X_3)]$,
\[ \kappa_3 = \tau^{-3}(d_3 + 12d_4), \qquad \kappa_4 = \tau^{-4}(d_5 - 3\tau^4 + 24d_6 + 48d_7 + 8d_8), \]
\[ P_{1C}(x) = \frac{x^2-1}{6}\kappa_3, \qquad P_{1J}(x) = P_{1C}(x) + \frac{b_J}{4\tau}, \qquad P_{1S}(x) = P_{1C}(x) + \frac{b_S}{4\tau}, \qquad P_{1\alpha}(x) = P_{1C}(x) + \frac{b_\alpha}{4\tau}, \]
\[ P_{2J}(x) = \frac{x}{\tau^2}\Bigl(d_1 + 2d_2 + \frac{b_J^2}{32}\Bigr) + \frac{\kappa_4 + \tau^{-1}b_J\kappa_3}{24}(x^3 - 3x) + \frac{\kappa_3^2}{72}(x^5 - 10x^3 + 15x), \]
\[ P_{2S}(x) = \frac{x}{\tau^2}\Bigl(d_1 + 2d_2 - 2\tau^2 + \frac{b_S^2}{32}\Bigr) + \frac{\kappa_4 + \tau^{-1}b_S\kappa_3}{24}(x^3 - 3x) + \frac{\kappa_3^2}{72}(x^5 - 10x^3 + 15x), \]
\[ P_{2\alpha}(x) = \frac{x}{\tau^2}\Bigl(d_1 + 2d_2 - \alpha\tau^2 + \frac{b_\alpha^2}{32}\Bigr) + \frac{\kappa_4 + \tau^{-1}b_\alpha\kappa_3}{24}(x^3 - 3x) + \frac{\kappa_3^2}{72}(x^5 - 10x^3 + 15x) \]

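and
\[ P_{2C}(x) = \frac{x}{\tau^2}(d_1 + d_2) + \frac{\kappa_4}{24}(x^3 - 3x) + \frac{\kappa_3^2}{72}(x^5 - 10x^3 + 15x). \]
We have the following theorem.

Theorem 3. Assume that the conditions $C_1$ and $C_2$ hold. If either condition $C_3$ or $\tilde C_3$ is satisfied, we have
\[ P\Bigl\{\frac{\sqrt n(V_J - n\sigma_n^2)}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1J}(x) - n^{-1}\phi(x)P_{2J}(x) + o(n^{-1}), \]
\[ P\Bigl\{\frac{\sqrt n(V_S - n\sigma_n^2)}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1S}(x) - n^{-1}\phi(x)P_{2S}(x) + o(n^{-1}), \]
\[ P\Bigl\{\frac{\sqrt n(V_\alpha - n\sigma_n^2)}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1\alpha}(x) - n^{-1}\phi(x)P_{2\alpha}(x) + o(n^{-1}), \]
\[ P\Bigl\{\frac{\sqrt n(V_C - n\sigma_n^2)}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1C}(x) - n^{-1}\phi(x)P_{2C}(x) + o(n^{-1}) \]
and
\[ P\Bigl\{\frac{\sqrt n(V_U - n\sigma_n^2)}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1C}(x) - n^{-1}\phi(x)P_{2C}(x) + o(n^{-1}). \]

Proof. It is sufficient to prove the case of $V_J$. Since $\tilde U_n = \{V_n + 8\sum_{i=1}^{n}\delta(X_i)/n^2 + R_{1;n}\}$ is an asymptotic U-statistic, it follows from Lai and Wang (1993) that
\[ P\Bigl\{\frac{\sqrt n\,\tilde U_n}{4\tau} \le x\Bigr\} = \Phi(x) - n^{-1/2}\phi(x)P_{1C}(x) - n^{-1}\phi(x)\tilde P_2(x) + o(n^{-1}) \]
where
\[ \tilde P_2(x) = \frac{x}{\tau^2}(d_1 + 2d_2) + \frac{\kappa_4}{24}(x^3 - 3x) + \frac{\kappa_3^2}{72}(x^5 - 10x^3 + 15x). \]
Since
\[ P\Bigl\{\frac{\sqrt n(V_J - n\sigma_n^2)}{4\tau} \le x\Bigr\} = P\Bigl\{\frac{\sqrt n\,\tilde U_n}{4\tau} \le x - \frac{b_J}{4\tau\sqrt n}\Bigr\}, \]
expanding in $b_J/(4\tau\sqrt n)$, we have the Edgeworth expansion for $V_J$.

Example 4. Let us consider the case of variance estimation in Example 1. From direct computation, we can show that
\[ f_1(x) = \tfrac14(x^2 - \sigma^2)^2 - \xi_1^2 \]
and
\[ f_2(x,y) = -\tfrac14(x^2 - \sigma^2)(y^2 - \sigma^2) - \tfrac{xy}{2}(x^2 + y^2 - 2\sigma^2) + \sigma^2xy = -\tfrac14(x^2 - \sigma^2)(y^2 - \sigma^2) - \tfrac12(x^3 + x)(y^3 + y) + \tfrac12x^3y^3 + \bigl(2\sigma^2 + \tfrac12\bigr)xy. \]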

Thus putting
\[ c_1 = -\tfrac14, \ w_1(x) = x^2 - \sigma^2, \quad c_2 = -\tfrac12, \ w_2(x) = x^3 + x, \quad c_3 = \tfrac12, \ w_3(x) = x^3, \quad c_4 = 2\sigma^2 + \tfrac12 \ \text{and} \ w_4(x) = x, \]
we have
\[ f_2(X_1,X_2) = \sum_{\nu=1}^{4} c_\nu w_\nu(X_1)w_\nu(X_2) \quad\text{a.s.} \]
Assume that $E|X_1|^{16} < \infty$ and that the underlying distribution $F(x)$ has a density function. We can show that
\[ \limsup_{|t|\to\infty}\ \sup_{|u_1|+\cdots+|u_4| \le |t|^{-\gamma}} \Bigl|E\Bigl[\exp\Bigl(it\Bigl\{f_1(X_1) + \sum_{\nu=1}^{4}u_\nu w_\nu(X_1)\Bigr\}\Bigr)\Bigr]\Bigr| < 1. \]
Hence the conditions ($C_1$), ($C_2$) and ($\tilde C_3$) are satisfied.

Appendix A

First we review the moment evaluations of the H-decomposition, which is very useful for discussing asymptotic properties. Let $\nu(x_1,\dots,x_r)$ be a function which is symmetric in its arguments and satisfies $E[\nu(X_1,\dots,X_r)] = 0$. Let us define
\[ \rho_1(x_1) = E[\nu(x_1,X_2,\dots,X_r)], \]
\[ \rho_2(x_1,x_2) = E[\nu(x_1,x_2,X_3,\dots,X_r)] - \rho_1(x_1) - \rho_1(x_2), \ \dots, \]
and
\[ \rho_r(x_1,x_2,\dots,x_r) = \nu(x_1,x_2,\dots,x_r) - \sum_{k=1}^{r-1}\sum_{C_{r,k}}\rho_k(x_{i_1},x_{i_2},\dots,x_{i_k}). \]
Then we can show that
\[ E[\rho_k(X_1,\dots,X_k)\mid X_1,\dots,X_{k-1}] = 0 \quad\text{a.s.} \tag{A.1} \]
and
\[ \sum_{C_{n,r}}\nu(X_{i_1},\dots,X_{i_r}) = \sum_{k=1}^{r}\binom{n-k}{r-k}\Lambda_k \quad\text{where}\quad \Lambda_k = \sum_{C_{n,k}}\rho_k(X_{i_1},\dots,X_{i_k}). \]
Using the equation (A.1) and moment evaluations of martingales (Dharmadhikari, Fabian and Jogdeo (1968)), we have the upper bounds of the absolute moments of $\Lambda_k$ as follows.

Lemma 1. For $q \ge 2$, if $E|\nu(X_1,\dots,X_r)|^q < \infty$, there exists a positive constant $c$, which may depend on $\nu$ and $F$ but not on $n$, such that
\[ E|\Lambda_k|^q \le c\,n^{qk/2}. \tag{A.2} \]

For simplicity we use a symbol $o_p^*(n^{-3/2})$ which may be different in each case but satisfies
\[ E|o_p^*(n^{-3/2})|^{2+\varepsilon/2} = O(n^{-4-\varepsilon}). \]

It follows from Markov's inequality that $o_p^*(n^{-3/2}) = n^{-1/2}o_p(n^{-1})$. From Markov's inequality and (A.2), we can easily obtain the following lemma, which is useful for obtaining the asymptotic representations.

Lemma 2. If $E[\nu(X_1,\dots,X_r)] = 0$ and $E|\nu(X_1,\dots,X_r)|^{2+\varepsilon/2} < \infty$ for $\varepsilon > 0$, we have that
\[ n^{-r-1}\sum_{C_{n,r}}\nu(X_{i_1},\dots,X_{i_r}) = \frac{1}{(r-1)!}\,n^{-2}\Lambda_1 + o_p^*(n^{-3/2}) \tag{A.3} \]
and
\[ n^{-r}\sum_{k=4}^{r}\binom{n-k}{r-k}\Lambda_k = o_p^*(n^{-3/2}). \tag{A.4} \]

Using the above lemmas, we will prove Theorem 1.

Approximation of $V_J$

At first we will obtain the approximation of $V_J$. Let us define
\[ D_1 = \sum_{i=1}^{n} g_1^2(X_i), \qquad D_2 = \sum_{C_{n,2}} g_1(X_i)g_1(X_j), \qquad D_3 = \sum_{C_{n,2}}\{g_1(X_i) + g_1(X_j)\}g_2(X_i,X_j), \]
\[ D_4 = \sum_{C_{n,3}}\{g_1(X_i)g_2(X_j,X_k) + g_1(X_j)g_2(X_i,X_k) + g_1(X_k)g_2(X_i,X_j)\}, \qquad D_5 = \sum_{C_{n,2}} g_2^2(X_i,X_j), \]
\[ D_6 = \sum_{C_{n,3}}\{g_2(X_i,X_j)g_2(X_i,X_k) + g_2(X_i,X_j)g_2(X_j,X_k) + g_2(X_i,X_k)g_2(X_j,X_k)\} \]
and
\[ D_7 = \sum_{C_{n,4}}\{g_2(X_i,X_j)g_2(X_k,X_l) + g_2(X_i,X_k)g_2(X_j,X_l) + g_2(X_i,X_l)g_2(X_j,X_k)\}. \]
From Maesono (1995, p.18), we have
\[ \sum_{i=1}^{n}\bigl(U_n^{(i)} - U_n\bigr)^2 = \frac{4}{n(n-1)}D_1 - \frac{8}{n(n-1)^2}D_2 + \frac{8}{n(n-1)^2}D_3 - \frac{16}{n(n-1)^2(n-2)}D_4 + \frac{8}{n(n-1)^2(n-2)}D_5 + \frac{8(n-4)}{n(n-1)^2(n-2)^2}D_6 - \frac{16}{n(n-1)^2(n-2)^2}D_7. \]

Note that $V_J = (n-1)\sum_{i=1}^{n}(U_n^{(i)} - U_n)^2$. Using the H-decomposition, Lemma 1 and Lemma 2, we have that
\[ \frac{4}{n}D_1 = 4\xi_1^2 + \frac{4}{n}\sum_{i=1}^{n}\{g_1^2(X_i) - \xi_1^2\}, \]
\[ \frac{8}{n(n-1)}D_3 = \frac{8}{n}\sum_{i=1}^{n}E[g_1(X_0)g_2(X_i,X_0)\mid X_i] + \frac{8}{n(n-1)}\sum_{C_{n,2}}\bigl\{[g_1(X_i) + g_1(X_j)]g_2(X_i,X_j) - E[g_1(X_0)g_2(X_i,X_0)\mid X_i] - E[g_1(X_0)g_2(X_j,X_0)\mid X_j]\bigr\}, \]
\[ \frac{8}{n(n-1)(n-2)}D_5 = \frac{4}{n}\xi_2^2 + \frac{8}{n^2}\sum_{i=1}^{n}\delta(X_i) + o_p^*(n^{-3/2}), \]
\[ \frac{8(n-4)}{n(n-1)(n-2)^2}D_6 = \frac{8}{n(n-1)}\sum_{C_{n,2}}E[g_2(X_i,X_0)g_2(X_j,X_0)\mid X_i,X_j] + \frac{8}{n(n-1)(n-2)}\sum_{C_{n,3}}\beta(X_i,X_j,X_k) + o_p^*(n^{-3/2}) \]
and
\[ \frac{16}{n(n-1)(n-2)^2}D_7 = o_p^*(n^{-3/2}) \]
where $X_0$ is a random vector with distribution $F(x)$ which is independent of $X_1,\dots,X_n$, and
\[ \beta(x,y,z) = g_2(x,y)g_2(x,z) + g_2(x,y)g_2(y,z) + g_2(x,z)g_2(y,z) - E[g_2(x,X_0)g_2(y,X_0) + g_2(y,X_0)g_2(z,X_0) + g_2(x,X_0)g_2(z,X_0)]. \]
Thus we have the equation (3.1).

Approximation of $V_S$

It follows from the equation (2.2) that
\[ V_S = \Bigl\{1 - \frac{2}{n} + O(n^{-2})\Bigr\}V_J. \]
From the equation (A.3), we can show that
\[ \frac{2}{n}V_J = \frac{8\xi_1^2}{n} + \frac{8}{n^2}\sum_{i=1}^{n}f_1(X_i) + o_p^*(n^{-3/2}) \quad\text{and}\quad O(n^{-2})V_J = o_p^*(n^{-3/2}). \]
Thus we have the equation (3.2).

Approximation of $V_\alpha$

Similarly to $V_S$, we can easily obtain the equation (3.3).

Approximation of $V_C$

To obtain the equation (3.4), it is sufficient to prove the following lemma, which is an improvement of Lemma A4 in Maesono (1995).

Lemma 3. If $E|h(X_1,X_2)|^{4+\varepsilon} < \infty$ for some $\varepsilon > 0$, we have
\[ \frac{1}{n+1}\sum_{1\le i<j\le n}(Q_{i,j} - \bar Q)^2 = \frac{2}{n}\xi_2^2 + \frac{4}{n^2}\sum_{i=1}^{n}\delta(X_i) + o_p^*(n^{-3/2}). \]

Proof. From the proof of Lemma A4 in Maesono (1995), we have
\[ \frac{1}{n+1}\sum_{1\le i<j\le n}(Q_{i,j} - \bar Q)^2 = \frac{4}{(n+1)(n-1)(n-3)}D_5 - \frac{8}{(n+1)(n-1)(n-2)(n-3)}D_6 + \frac{16}{(n+1)(n-1)(n-2)(n-3)}D_7. \]
Using the H-decomposition and the equations (A.3) and (A.4), we get that
\[ \frac{4}{(n+1)(n-1)(n-3)}D_5 = \frac{2}{n}\xi_2^2 + \frac{4}{n^2}\sum_{i=1}^{n}\delta(X_i) + o_p^*(n^{-3/2}), \]
\[ \frac{8}{(n+1)(n-1)(n-2)(n-3)}D_6 = o_p^*(n^{-3/2}) \quad\text{and}\quad \frac{16}{(n+1)(n-1)(n-2)(n-3)}D_7 = o_p^*(n^{-3/2}). \]
Thus we have the equation (3.4).

Approximation of $V_U$

Finally we will consider the unbiased estimator $V_U$. We will obtain approximations of $\hat a_1$ and $\hat a_2$. Let us consider $\hat\lambda_1$. From the definition, we can get
\[ E[h(x,X_2)] = g_1(x) + \theta, \qquad h(x,y) = g_2(x,y) + g_1(x) + g_1(y) + \theta. \]
Using these equations and (2.1), we can show that
\[ E[\zeta_1(x,y,X_3)] = \tfrac13\bigl\{g_2(x,y)[g_1(x) + g_1(y)] + 3g_1(x)g_1(y) + g_1^2(x) + g_1^2(y) + E[g_2(x,X_3)g_2(y,X_3) + (g_2(x,X_3) + g_2(y,X_3))g_1(X_3)] + 2\theta g_2(x,y) + 4\theta g_1(x) + 4\theta g_1(y) + \xi_1^2 + 3\theta^2\bigr\}. \]
We also have
\[ E[\zeta_1(x,X_2,X_3)] = \tfrac23\{E[g_2(x,X_3)g_1(X_3)] + \xi_1^2\} + \tfrac43\theta g_1(x) + \theta^2 + \tfrac13 g_1^2(x) \]

and $E[\zeta_1(X_1,X_2,X_3)] = \xi_1^2 + \theta^2$. Here we have
\[ E[\zeta_1(x,X_2,X_3)] - \xi_1^2 - \theta^2 = \tfrac23E[g_2(x,X_2)g_1(X_2)] + \tfrac43\theta g_1(x) + \tfrac13\{g_1^2(x) - \xi_1^2\} = \tilde g_1(x) \ \text{(say)}, \]
\[ E[\zeta_1(x,y,X_3)] - \xi_1^2 - \theta^2 - \tilde g_1(x) - \tilde g_1(y) = \tfrac13E\bigl[g_2(x,X_3)g_2(y,X_3) - \{g_2(x,X_3) + g_2(y,X_3)\}g_1(X_3)\bigr] + g_1(x)g_1(y) + \tfrac13\{g_1(x) + g_1(y) + 2\theta\}g_2(x,y) = \tilde g_2(x,y) \ \text{(say)} \]
and
\[ \zeta_1(x,y,z) - \xi_1^2 - \theta^2 - \tilde g_2(x,y) - \tilde g_2(x,z) - \tilde g_2(y,z) - \tilde g_1(x) - \tilde g_1(y) - \tilde g_1(z) \]
\[ = \tfrac13\bigl\{g_2(x,y)g_2(x,z) + g_2(x,y)g_2(y,z) + g_2(x,z)g_2(y,z) - E[g_2(x,X_3)g_2(y,X_3) + g_2(x,X_3)g_2(z,X_3) + g_2(y,X_3)g_2(z,X_3)]\bigr\} + \tfrac23\{g_1(x)g_2(y,z) + g_1(y)g_2(x,z) + g_1(z)g_2(x,y)\} = \tilde g_3(x,y,z) \ \text{(say)}. \]
Thus using the H-decomposition, we can show that
\[ \hat\lambda_1 = \xi_1^2 + \theta^2 + \frac{3}{n}\sum_{i=1}^{n}\tilde g_1(X_i) + \frac{6}{n(n-1)}\sum_{C_{n,2}}\tilde g_2(X_i,X_j) + \frac{6}{n(n-1)(n-2)}\sum_{C_{n,3}}\tilde g_3(X_i,X_j,X_k). \tag{A.5} \]
Next we will obtain an approximation of $\hat\theta^2$. Similarly to $\hat\lambda_1$, we can get
\[ E[\zeta_0(x,y,z,X_4)] = \tfrac13\{g_1(x)g_2(y,z) + g_1(y)g_2(x,z) + g_1(z)g_2(x,y) + \theta g_2(x,y) + \theta g_2(x,z) + \theta g_2(y,z) + 2g_1(x)g_1(y) + 2g_1(x)g_1(z) + 2g_1(y)g_1(z)\} + \theta g_1(x) + \theta g_1(y) + \theta g_1(z) + \theta^2, \]
\[ E[\zeta_0(x,y,X_3,X_4)] = \tfrac13\{2g_1(x)g_1(y) + \theta g_2(x,y)\} + \theta g_1(x) + \theta g_1(y) + \theta^2 \]
and
\[ E[\zeta_0(x,X_2,X_3,X_4)] = \theta\{g_1(x) + \theta\}. \]

Thus from the H-decomposition and the equation (A.4), we have
\[ \hat\theta^2 = \theta^2 + \frac{4}{n}\sum_{i=1}^{n}\theta g_1(X_i) + \frac{4}{n(n-1)}\sum_{C_{n,2}}\{2g_1(X_i)g_1(X_j) + \theta g_2(X_i,X_j)\} + \frac{8}{n(n-1)(n-2)}\sum_{C_{n,3}}\{g_1(X_i)g_2(X_j,X_k) + g_1(X_j)g_2(X_i,X_k) + g_1(X_k)g_2(X_i,X_j)\} + o_p^*(n^{-3/2}). \tag{A.6} \]
Combining the equations (A.5) and (A.6), we have the approximation of $\hat a_1$ as
\[ \frac{4(n-2)}{n-1}\hat a_1 = 4\xi_1^2 - \frac{4\xi_1^2}{n} - \frac{4}{n^2}\sum_{i=1}^{n}f_1(X_i) + V_n + o_p^*(n^{-3/2}). \tag{A.7} \]
Since $E[h^2(X_1,X_2)] = 2\xi_1^2 + \xi_2^2 + \theta^2$, using the H-decomposition and the equation (A.3), we obtain
\[ n^{-1}\hat\lambda_2 = n^{-1}(2\xi_1^2 + \xi_2^2 + \theta^2) + \frac{2}{n^2}\sum_{i=1}^{n}\{\delta(X_i) + f_1(X_i) + 2\theta g_1(X_i)\} + o_p^*(n^{-3/2}). \]
From the H-decomposition and the equations (A.3) and (A.4), we can show that
\[ \frac{2}{n-1}\hat a_2 = \frac{4}{n^2}\sum_{i=1}^{n}\{\delta(X_i) + f_1(X_i)\} + \frac{4\xi_1^2 + 2\xi_2^2}{n} + o_p^*(n^{-3/2}). \tag{A.8} \]
Combining the above evaluations (A.7) and (A.8), we have the desired approximation (3.5).

Acknowledgements

The author wishes to thank the referee for helpful comments. He is also grateful for the hospitality of the Centre for Mathematics and its Applications at the Australian National University, where he carried out a part of this study.

References

[1] Arvesen, J. N. (1969). Jackknifing U-statistics. Ann. Math. Statist., 40, 2076–2100.
[2] Dharmadhikari, S. W., Fabian, V. and Jogdeo, K. (1968). Bounds on the moments of martingales. Ann. Math. Statist., 39, 1719–1723.
[3] Efron, B. (1987). Better bootstrap confidence intervals (with discussion). Jour. Amer. Statist. Assoc., 82, 171–200.
[4] Efron, B. and Stein, C. (1981). The jackknife estimate of variance. Ann. Statist., 9, 586–596.
[5] Hinkley, D. V. (1978). Improving the jackknife with special reference to correlation estimation. Biometrika, 65, 13–21.
[6] Lai, T. L. and Wang, J. Q. (1993). Edgeworth expansion for symmetric statistics with applications to bootstrap methods. Statistica Sinica, 3, 517–542.

[7] Maesono, Y. (1995a). Comparisons of variance estimators and their effects for studentized U-statistics. The Australian National University Statistics Research Report, No. SRR 40 95.
[8] Maesono, Y. (1995b). On the normal approximation of a studentized U-statistic. Jour. Jap. Statist. Soc., 25, 19–33.
[9] Maesono, Y. (1996a). Edgeworth expansions of a studentized and a jackknife estimator of variance. To appear in Jour. Statist. Plann. Inf.
[10] Maesono, Y. (1996b). An Edgeworth expansion of a linear combination of U-statistics. Jour. Jap. Statist. Soc., 26, 189–207.
[11] Schucany, W. R. and Bankson, D. M. (1989). Small sample variance estimators for U-statistics. Austral. J. Statist., 31, 417–426.
[12] Sen, P. K. (1960). On some convergence properties of U-statistics. Calcutta Statist. Assoc. Bull., 10, 1–18.
[13] Sen, P. K. (1977). Some invariance principles relating to jackknifing and their role in sequential analysis. Ann. Statist., 5, 316–329.
[14] Shirahata, S. and Sakamoto, Y. (1992). Estimate of variance of U-statistics. Commun. Statist.-Theory Meth., 21, 2969–2981.