
Ranking Portfolio Performance by a Joint Means and Variances Equality Test

by Joel Owen* and Ramon Rabinovitch**

February 1998

* Professor of Statistics, Stern School of Business, New York University, 44 West Fourth Street, New York, N.Y. 00. Tel. () 998 0446. EMAIL: JOWEN@STERN.NYU.EDU
** Professor of Finance, College of Business Administration, University of Houston, Houston, TX 7704-68. Tel. (713) 743 478. EMAIL: RAMON@UH.EDU

1. Introduction

In the last four decades, numerous authors have suggested methods to evaluate portfolio performance. Treynor (1965), Sharpe (1966), and Jensen (1968) proposed performance measures which produce a score for every portfolio being evaluated. These scores are used to compare the performance of any two portfolios or to rank the performance of all portfolios in a given set. Later, Henriksson and Merton (1981), Lehmann and Modest (1987), Moses, Chaney, and Veit (1987), Grinblatt and Titman (1989a), Okunev (1990) and others developed new ways to evaluate portfolio performance. In their methods, management's market timing and selection abilities received increasing attention.

The general idea common to most of these works is that the method assigns to every portfolio a numerical score, often called excess return. This score is derived by taking the difference between the average portfolio return and the expected portfolio return predicted by some model. Usually, the score assigned to every portfolio depends on the risk-free rate and a benchmark (market) portfolio. For example, denote the Jensen score by J and use the CAPM. Then, using the prior population parameters, $J = \mu - R_f - \beta(\mu_m - R_f)$, where $\mu_m$ is the mean return of an efficient portfolio whose return is $R_m$, $\mu$ is the mean return of the portfolio being evaluated, whose return is $R$, and $\beta = \mathrm{cov}(R, R_m)/\sigma_m^2$, where cov is the covariance operator and $\sigma_m^2$ is the variance of $R_m$. J depends on the risk-free rate, $R_f$, and on the portfolio m, which needs to be an efficient portfolio. Roll (1977, 1978) and Green (1986) pointed out, however, that the concepts of efficiency and excess return need not be consistent. Moreover, Dybvig and Ross (1985a) showed that even if an efficient portfolio is used as a benchmark, both superior and inferior portfolios could produce positive J values, thus casting doubt on the usefulness of this approach. In fact, Dybvig and Ross (1985b) showed that an uninformed observer may calculate a positive or a negative J score even when evaluating a manager with superior information.
They proved that if one uses this method of evaluation, a manager with superior information might appear to be performing in a suboptimal fashion. Grinblatt and Titman (1989a and 1989b) offered further criticisms when models other than the CAPM were the basis of the performance evaluation.

The criticism mentioned so far represents theoretical arguments based on population parameters. In practice, additional problems arise. For example, Lehmann and Modest (1987) conclude that the method of estimation used in the evaluation substantially changes the J score. Complicating this estimation problem is the uncertainty as to whether the benchmark portfolio being estimated is itself efficient. If it is not efficient, does it still matter which benchmark portfolio is used? On the last point, a study by Grinblatt and Titman (99a) concludes that it does matter, and Baily (99) reaches the same conclusion. Although Grinblatt and Titman (1989a) attempt to deal with some of these issues, no particular method of evaluation has found general acceptance. Moreover, while there is a growing body of literature on estimating and testing the properties of efficient portfolios (see, e.g., Kandel, McCulloch and Stambaugh (1995) and the references therein), there is very little on the sampling errors of performance measures. In the case of the Jensen measure, other than tests to determine whether a particular score is statistically different from zero, no attempt has been made to deal with sampling errors when ranking portfolios. That is, two portfolios, each of whose scores was deemed significant, were then ranked by these scores without any regard as to whether they are significantly different from one another. An exception is Jobson and Korkie (1981), who proposed statistical tests of equality of the Sharpe and (separately) the Treynor measures, based on their sample estimates.

In this paper, we develop a new portfolio performance evaluation method that addresses the issue of sampling errors directly, without the need to use an efficient portfolio or any other particular benchmark in order to determine the rankings. The 99 annual report of TIAA-CREF, one of the largest funds in the world, stated that the goal for CREF's stock fund was "...for CREF participants to earn stock market-type investment returns, but with less risk (fluctuations of returns) than the market." This stated goal for the fund indicated the wish to manage the fund such that it will dominate other investments in some risk-return sense. Thus, the first stage of the method we propose consists of a new statistical method that directly compares the performance of any two portfolios based on the mean-variance dominance (MV) criterion. Given a set of N portfolios, we directly compare every portfolio against each of the remaining N - 1 portfolios, thereby producing an NxN matrix containing the outcomes of all possible pairwise comparisons. In the second stage of the method we employ a ranking function which maps the elements of the comparison matrix into a complete ranking of the portfolio set. In general, our approach leads to a very different ranking of portfolios than those given by the Treynor, the Jensen and the Sharpe measures. We discuss, however, the conditions under which both approaches produce similar rankings.

The method we propose is different from other performance measures in several regards:
(i) The methods mentioned above assign an independent score to every portfolio and then use these scores to compare and rank the portfolios. We take the reverse approach. First, our method produces direct pairwise comparisons among the portfolios, and only then do we rank them based on these comparisons.
(ii) The basis for the other methods' rankings are scores that depend on a univariate measure concerning returns. That is, the rank is based on a conditional risk-adjusted average return, or a suitably standardized average return. We propose a ranking based on a bivariate criterion related to mean-variance dominance.
(iii) These methods do not account for sampling errors. The method we propose is based on a statistical argument. Hence our procedure is implementable and justifiable even in a short time period and for fixed sample sizes.
(iv) Our method is based on pairwise comparisons of all the portfolios; thus, it is independent of a particular benchmark. As such, it does not allow a manager to manipulate the rankings by gaming.
(v) Our method does not depend on the existence of a risk-free rate. We show, however, how to appropriately adjust the procedure when a risk-free rate does exist.

We proceed as follows: In Section 2 we present a two-stage statistical procedure of ordering and ranking portfolio performance and illustrate this procedure with an example using eight portfolios. In Section 3, we illustrate the use of the method by applying it to a set of 133 mutual funds. The first 130 funds in this set were first studied by Lehmann and Modest (1987) and again by Connor and Korajczyk (1990). To this set we added the S&P500 index and the equal and value weighted CRSP indexes. We produce our ranking of these 133 funds over three different time periods: 5, 10, and 15 years. In Section 4 we compare the ranking we obtain with the rankings calculated by the Treynor, the Sharpe, and the Jensen measures. In Section 5 we investigate several theoretical connections between our method and the Treynor, the Sharpe, and the Jensen measures. We offer some additional remarks in Section 6.

2. A Procedure of Ranking Portfolio Performance

In this section we develop a statistical method of ordering and ranking portfolios following the concept of Mean-Variance (MV) Dominance.

2.1 The Theoretical Foundation of the Test of Dominance

We follow the usual definition of MV dominance:

Definition 1. Let $R_j$ denote the return on portfolio j with mean $\mu_j$ and variance $\sigma_j^2$, j = 1,...,N. For any two portfolios in the set, j and k, we say that portfolio j dominates portfolio k if $\mu_j \ge \mu_k$ and $\sigma_j^2 \le \sigma_k^2$, with at least one inequality being strict. This type of dominance is denoted by $R_j D R_k$. If portfolio j is dominated by portfolio k we write $R_j \hat{D} R_k$. If the means are equal and the variances are equal too, we say that the two portfolios are equal and denote this by $R_j E R_k$. If neither dominance nor equality determines the relationship between the two portfolios, we call the portfolios noncomparable and denote it by $R_j NC R_k$.//

The following three theorems establish a new method of determining MV dominance between any two portfolios. We begin by recalling a basic result in regression theory.

Lemma A. Let $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ represent a simple linear regression model, i = 1,...,N, with $\varepsilon_i$ being independent and normally distributed with mean zero and variance $\sigma^2$. Then the simultaneous Null hypothesis $H_0: \beta_0 = \beta_1 = 0$ may be tested against the general alternative, using the uncorrected F statistic
$$UF = \frac{N-2}{2} \cdot \frac{\sum_i Y_i^2 - \sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \hat{Y}_i)^2},$$
where $\hat{Y}$ is the least-squares fitted model. Under $H_0$, UF has a central F distribution with 2 and N - 2 degrees of freedom. Small values of UF do not allow $H_0$ to be rejected.
Proof: See any text on testing the general linear hypothesis.

This result leads to a statistical method for testing the simultaneous equality of the mean returns and the return variances of any two portfolios:

Theorem A. Let $R_{it}$ be the random return on portfolio i, i = 1,...,N, at time t = 1,...,T. Assume that the vectors $(R_{it}, R_{jt})$ are a random sample from a bivariate normal distribution, and run the regression
$$Y_t = \beta_0 + \beta_1 (X_t - \bar{X}) + \varepsilon_t, \quad t = 1,\dots,T,$$
where $Y_t = R_{jt} - R_{it}$, $X_t = R_{jt} + R_{it}$ and $\bar{X} = \frac{1}{T}\sum_t X_t$; i, j = 1,...,N.
Then, the corresponding UF-statistic given in Lemma A can be used to test the simultaneous hypothesis $H_0: \mu_j = \mu_i$ and $\sigma_j^2 = \sigma_i^2$ versus $H_a: \mu_j \ne \mu_i$ or $\sigma_j^2 \ne \sigma_i^2$. This UF-statistic follows an $F_{2,T-2}$ distribution under $H_0$.
Proof: See Bradley and Blackwood (1989).

Two remarks are in place:

Remark 1: Notice that the UF-statistic is not the usual F statistic, in that it tests whether both the intercept and the slope of the regression line are simultaneously zero. The essence of the proof of Theorem A is the recognition that the least squares estimators of the intercept and slope estimate $E(Y) = \mu_j - \mu_i$ and
$$\frac{\mathrm{cov}(Y, X)}{V(X)} = \frac{E\{[(R_j - \mu_j) - (R_i - \mu_i)][(R_j - \mu_j) + (R_i - \mu_i)]\}}{V(X)} = \frac{\sigma_j^2 - \sigma_i^2}{V(X)},$$
respectively. Thus, the two coefficients will be zero if and only if $\mu_j = \mu_i$ and $\sigma_j^2 = \sigma_i^2$.

Remark 2: The regressions in Theorem A (and in Theorem B below) do not in any way constitute return generating models. These regressions are only used as a tool (a trick, if the reader wishes to call it that) that enables us to obtain the UF-statistic needed for the test of the above simultaneous Null hypothesis. This is achieved by defining $Y_t$ to be the difference between the returns on any two portfolios and $X_t$ to be the sum of the returns on the same two portfolios. These definitions do not describe any economic model between the returns on the portfolios or between the dependent and independent variables, and neither portfolio need be a benchmark portfolio.

The next theorem generalizes Theorem A to the case when the vectors $(R_{jt}, R_{it})$ are a sample from any elliptically contoured distribution.

Lemma B. Let $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ represent a simple linear regression model, i = 1,...,N, with the $\varepsilon_i$ being jointly elliptically distributed with zero means. Then, the simultaneous null hypothesis $H_0: \beta_0 = \beta_1 = 0$ may be tested against the general alternative using the UF-statistic of Lemma A. Furthermore, the distribution of UF under $H_0$ is the same as under normality: $F_{2,N-2}$.
Proof: Theorem 6 in Anderson and Fang (1990) establishes that when the error term is elliptical, the likelihood ratio test criterion for testing $H_0: \beta_0 = \beta_1 = 0$ is the same as under the normal case. Here this criterion is $[\sum_i Y_i^2 / \sum_i (Y_i - \hat{Y}_i)^2]^{n/2}$, which is a monotonic function of the UF of Lemma A.

We now generalize Theorem A to the case of elliptical distributions.

Theorem B. Let $(R_{jt}, R_{it})$, t = 1,...,T, be a sample from a bivariate elliptical distribution. Then, the UF-statistic of Lemma A can be used to test the simultaneous hypothesis $H_0: \mu_j = \mu_i$ and $\sigma_j^2 = \sigma_i^2$ against the general alternative. Under the Null hypothesis, the distribution of UF is $F_{2,T-2}$.
Proof: If $\{R_{jt}, R_{it}\}$ are bivariate elliptical, then $(Y_t, X_t)$ in Theorem A are elliptical and $Y_t$ is conditionally linear in $X_t$; $Y_t$ may be written as $Y_t = \beta_0 + \beta_1 (X_t - \bar{X}) + \varepsilon_t$, where the $\varepsilon_t$ are elliptically distributed. See Owen and Rabinovitch (1983). Lemmas A and B then lead to the same conclusion as in Theorem A.

So far, Theorems A and B establish a test for the simultaneous equality of the means and the variances of elliptically distributed bivariates. If the Null hypothesis cannot be rejected based on the UF-statistic, we conclude that $R_j E R_i$.
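The regression device of Theorems A and B can be sketched numerically. The Python snippet below is a minimal illustration rather than the authors' implementation: the function name `uf_regression` and the use of NumPy are assumptions made here. It forms $Y_t$ and $X_t$ from two return series, fits the centered regression by least squares, and returns UF together with the t-values of the intercept and the slope. It assumes the two series are not identical and that $X_t$ is not constant, since otherwise a sum of squares in a denominator is zero.

```python
import numpy as np

def uf_regression(r_j, r_i):
    """Return (UF, t0, t1) for the regression of Y = R_j - R_i on X = R_j + R_i.

    Hypothetical helper for illustration; r_j and r_i are equal-length
    1-D arrays of returns on portfolios j and i.
    """
    r_j = np.asarray(r_j, dtype=float)
    r_i = np.asarray(r_i, dtype=float)
    y = r_j - r_i                      # Y_t: difference of the returns
    x = r_j + r_i                      # X_t: sum of the returns
    xc = x - x.mean()                  # centered regressor X_t - X_bar
    T = y.size
    sxx = np.sum(xc ** 2)
    b0 = y.mean()                      # least-squares intercept
    b1 = np.sum(xc * y) / sxx          # least-squares slope
    sse = np.sum((y - b0 - b1 * xc) ** 2)
    s2 = sse / (T - 2)                 # residual variance estimate
    uf = (np.sum(y ** 2) - sse) / 2.0 / s2   # uncorrected F of Lemma A
    t0 = b0 / np.sqrt(s2 / T)          # t-value of the intercept
    t1 = b1 / np.sqrt(s2 / sxx)        # t-value of the slope
    return uf, t0, t1
```

Because the regressor is centered, the two coefficient estimates are uncorrelated and UF equals $(t_0^2 + t_1^2)/2$, which gives a quick consistency check. UF would then be compared with a critical value of the F distribution with 2 and T - 2 degrees of freedom.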
Theorem C below establishes the implications of the rejection of $H_0$ using the statistical properties of the individual regression coefficients.

Theorem C. For i = 0, 1, let $t_i$ be the t-statistic associated with the least squares estimator $\hat{\beta}_i$ that is used to test $H_0: \beta_i = 0$ versus $H_a: \beta_i \ne 0$, respectively. Given that the UF-statistic is significant, the $t_0$-value may be used to test $H_0: \mu_j = \mu_i$ versus $H_a: \mu_j \ne \mu_i$, and the $t_1$-value may be used to test $H_0: \sigma_j^2 = \sigma_i^2$ versus $H_a: \sigma_j^2 \ne \sigma_i^2$.
Proof: Testing $\beta_0 = 0$ is equivalent to testing $\mu_j = \mu_i$. Testing $\beta_1 = 0$ is equivalent to testing $(\sigma_j^2 - \sigma_i^2)/V(X) = 0$, or $\sigma_j^2 = \sigma_i^2$.

2.2 The Statistical Test of Dominance Without a Risk-Free Rate

We summarize the first stage of our two-stage procedure of ranking portfolio performance, using Theorems A, B and C, as follows: Let $\{R_{it}\}$ be a random sample of N portfolio returns, i = 1,...,N, during a sample period t, t = 1,...,T. Assume that $R_i$ and $R_j$ follow a bivariate elliptical distribution which is stationary over time. In order to run the regression in Theorem A in a systematic manner, fix j = 1 and run N - 1 regressions, where $Y_t = R_{1t} - R_{it}$ and $X_t = R_{1t} + R_{it}$ for portfolios i = 2,3,...,N. For each of these N - 1 regressions, test the simultaneous hypothesis of Theorem A. If the UF-statistic is not significant, report that $R_1 E R_i$. If, on the other hand, the UF-statistic is significant, reject $H_0$, i.e., conclude that the simultaneous equality of the means and variances does not hold. In the latter case, use the regression's t-values to test the separate hypotheses on the equality of the means and the equality of the variances. Table 1 specifies all the possible results of these tests, where $t_k^*$ ($t_k$), k = 0, 1, denotes a significant (insignificant) t-value at the desired level of significance. Note that if, following a significant F test, neither $t_0$ nor $t_1$ is significant, we record the result as a noncomparable result. We show how to resolve the cases of noncomparable portfolios in Section 2.3 below.

Table 1
The Results of the Separate Tests on the Means and the Variances

                                      $t_1$-test
$t_0$-test                            $t_1^* > 0$ ($\sigma_j^2 > \sigma_i^2$)    $t_1^* < 0$ ($\sigma_j^2 < \sigma_i^2$)    $t_1$ ($\sigma_j^2 = \sigma_i^2$)
$t_0^* > 0$ ($\mu_j > \mu_i$)         $R_j NC R_i$                               $R_j D R_i$                                $R_j D R_i$
$t_0^* < 0$ ($\mu_j < \mu_i$)         $R_j \hat{D} R_i$                          $R_j NC R_i$                               $R_j \hat{D} R_i$
$t_0$ ($\mu_j = \mu_i$)               $R_j \hat{D} R_i$                          $R_j D R_i$                                $R_j NC R_i$

To continue, notice that the result of the N - 1 regression runs of $R_1$ on $R_i$, i = 2,3,...,N, and the above tests for each regression, is a vector of MV comparisons between portfolio j = 1 and the rest of the portfolios in the set. Every entry in this comparison vector is: $R_1 D R_i$, or $R_1 E R_i$, or $R_1 \hat{D} R_i$, or $R_1 NC R_i$. For simplicity, we also run $R_1$ on itself; thus, the first entry in the first comparison vector, (j=1, i=1), is $R_1 E R_1$, trivially. The next step is to compare portfolio j = 2 to all the other portfolios. Thus, fix j = 2 and repeat the previous iteration of regression runs and tests of hypotheses for all i, i = 1,...,N. A second vector will result, yielding information about the MV ordering of portfolio 2 relative to all the portfolios in the set. Again, running $R_2$ on itself yields $R_2 E R_2$ trivially, in the second entry of this vector. Continue these iterative steps for j = 3, j = 4, and so on up to j = N. Every iteration produces another comparison vector, so that after N iterations we have created an NxN matrix of comparisons. To simplify the discussion, we transform the information in each entry of the comparison vectors into a numerical value using a comparison function:

Definition 2. For i,j = 1,...,N, a comparison function, COMP(i,j), is defined as follows:
COMP(i,j) = 1 if $R_j D R_i$,
COMP(i,j) = 0 if $R_j E R_i$,
COMP(i,j) = -1 if $R_j \hat{D} R_i$,
COMP(i,j) = 4 if $R_j NC R_i$.//

Definition 3. A comparison matrix C is a matrix whose elements are COMP(i,j); i,j = 1,...,N.//

From Definitions 2 and 3, it follows that the elements of the matrix C are the real numbers 1, 0, -1 and 4, depending on the type of dominance found in the hypothesis testing that followed the regressions of Theorem A.
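The first stage described above (a joint UF test, followed by the separate t-tests of Table 1, coded as in Definition 2) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names `comp` and `comparison_matrix` are hypothetical, indices are 0-based, and the default critical values 5.3 and 2.6 are simply the ones quoted in the paper's example, supplied as parameters rather than computed from a distribution.

```python
import numpy as np

D, E, D_HAT, NC = 1, 0, -1, 4      # values of COMP(i, j) in Definition 2

def _uf_t(r_j, r_i):
    """UF, t0, t1 from regressing Y = R_j - R_i on the centered X = R_j + R_i."""
    y = r_j - r_i
    xc = (r_j + r_i) - (r_j + r_i).mean()
    T, sxx = y.size, np.sum(xc ** 2)
    b0, b1 = y.mean(), np.sum(xc * y) / sxx
    sse = np.sum((y - b0 - b1 * xc) ** 2)
    s2 = sse / (T - 2)
    uf = (np.sum(y ** 2) - sse) / (2 * s2)
    return uf, b0 / np.sqrt(s2 / T), b1 / np.sqrt(s2 / sxx)

def comp(r_i, r_j, f_crit=5.3, t_crit=2.6):
    """COMP(i, j): outcome of comparing portfolio j with portfolio i."""
    uf, t0, t1 = _uf_t(r_j, r_i)
    if uf <= f_crit:                   # joint equality not rejected
        return E
    # Sign of a significant t-value, or 0 when the t-value is insignificant.
    mean_sign = int(np.sign(t0)) if abs(t0) > t_crit else 0
    var_sign = int(np.sign(t1)) if abs(t1) > t_crit else 0
    if mean_sign == 0 and var_sign == 0:
        return NC                      # significant F, neither t significant
    if mean_sign >= 0 and var_sign <= 0:
        return D                       # j: mean not lower, variance not higher
    if mean_sign <= 0 and var_sign >= 0:
        return D_HAT                   # j is dominated by i
    return NC                          # mean and variance rank in opposite ways

def comparison_matrix(returns, f_crit=5.3, t_crit=2.6):
    """N x N matrix with C[i, j] = COMP(i, j); `returns` has shape (T, N)."""
    returns = np.asarray(returns, dtype=float)
    n = returns.shape[1]
    C = np.zeros((n, n), dtype=int)    # diagonal: R_j E R_j trivially
    for j in range(n):
        for i in range(n):
            if i != j:
                C[i, j] = comp(returns[:, i], returns[:, j], f_crit, t_crit)
    return C
```

Swapping the roles of i and j changes the signs of both t-values and leaves UF unchanged, so the sketch yields COMP(i,j) + COMP(j,i) = 0 for every comparable pair and a symmetric pattern of 4s for the noncomparable ones.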

The first stage of the procedure could end at this point. Further refinement is possible, however, when a risk-free rate exists. In the next section we show how to employ the risk-free rate in order to resolve all the cases of noncomparable portfolios.

2.3 An Adjustment in Case That There Exists a Risk-Free Rate, $R_f$

Suppose that a risk-free rate, $R_f$, exists in the economy. First, observe that subtracting $R_f$ from all the portfolios' returns and performing stage one of the procedure with "excess returns" leaves the above results unchanged because of the way $Y_t$, $X_t$ and the regression of $Y_t$ on $X_t$ are defined. We use $R_f$, however, in order to resolve the cases of noncomparable portfolios. To do this, we extend the notion of dominance in the following way:

Definition 4. Let portfolios i and j be a pair of noncomparable portfolios. For portfolio i compute a new return, which is the linear combination of its original return and the risk-free rate: $R_i^* = (1 - \delta)R_f + \delta R_i$. Then, portfolio j is said to dominate (be dominated by) portfolio i if there exists a $\delta_{ij}$ such that employing $R_i^*$ and $R_j$ leads to $R_j D R_i^*$ ($R_j \hat{D} R_i^*$) for that $\delta_{ij}$. If portfolio j does not dominate or is not being dominated by portfolio i, we say that they are equal.//

Theorem D below establishes that all the cases of noncomparable portfolios are resolved under Definition 4 of extended dominance. For simplicity we drop the subscripts ij in the sequel.

Theorem D. Let i and j be a pair of noncomparable portfolios and assume that a risk-free rate $R_f$ exists. Then, there exists a $\delta$ with the corresponding $R_i^*$ such that $R_j D R_i^*$, $R_j E R_i^*$, or $R_j \hat{D} R_i^*$.
Proof: Let $\bar{R}_i$ be the sample mean return of portfolio i. If $\bar{R}_i - R_f \ne 0$, then a possible choice is $\delta = (\bar{R}_j - R_f)/(\bar{R}_i - R_f)$. Adjust $R_i$ using this $\delta$ and run the regression of Theorem B with $Y_t$ and $X_t$ redefined with $R_j$ and $R_i^*$. It can be shown that $b_0 = 0$ for this choice of $\delta$, which implies that $\mu_i^* = \mu_j$. Thus, if the UF-statistic is not significant, it follows that $\sigma_i^{*2} = \sigma_j^2$ as well, i.e., $R_j E R_i^*$. If, on the other hand, the UF-statistic is significant, it follows that the variances are different. This fact, coupled with the fact that the means are equal, implies that dominance must exist between the two portfolios. If $\mu_i = R_f$ but $\mu_j \ne R_f$, the same conclusion follows by reversing the roles of the portfolios. If $\bar{R}_i = \bar{R}_j = R_f$ then, again, $b_0 = 0$ and the conclusion follows. This completes the proof.

For the complete procedure that we propose below, we first construct the matrix C without the use of $R_f$. We then introduce the risk-free rate and resolve all the cases of the noncomparable portfolios by setting $\delta$ equal to the value in Theorem D and comparing the resulting $R_i^*$ to $R_j$. Note that we could have resolved the comparison between noncomparable portfolios by a complete search over $\delta$ values. Instead, we chose a simpler form of the algorithm. As Theorem D asserts, the result of this comparison resolves all the cases of noncomparable portfolios. We use the notation COMP(i*,j) to refer to the comparison of $R_i^*$ with $R_j$, with $\delta$ chosen as in Theorem D. Therefore, for these cases we now have COMP(i*,j) = -1, 0, or 1, where we previously had COMP(i,j) = 4. In this way the matrix C is transformed into a risk-free rate adjusted matrix that contains no noncomparable portfolios.

Definition 5. The risk-free rate adjusted comparison matrix $C_f$ is the matrix with the elements $COMP_f(i,j)$:
$COMP_f(i,j)$ = COMP(i,j), for all the cases in which COMP(i,j) = -1, 0, 1.
$COMP_f(i,j)$ = COMP(i*,j), for all the cases in which COMP(i,j) = 4.//

This concludes stage one of our procedure, in which we create comparison matrices containing the results of the pairwise comparisons of all the portfolios in the set. In stage two the procedure is completed by using a ranking function that maps the information in the comparison matrix into a ranking, i.e., converts the $C_f$ (or the C) matrix into a complete ranking of the portfolios.

2.4 The Ranking Function

The j-th column (j = 1,...,N) of the risk-free rate adjusted comparison matrix $C_f$ contains the results of comparing the j-th portfolio against each and every portfolio in the set, expressed by $COMP_f(i,j)$. In a more general context, $COMP_f(i,j)$ might be some measured difference between portfolios j and i. In our analysis these results are expressed as real numbers $COMP_f(i,j)$ satisfying: $COMP_f(i,j) + COMP_f(j,i) = 0$, and $COMP_f(i,j)$ = 1, 0, -1; i,j = 1,...,N. We propose to rank the portfolios in a descending order of the row-sum as follows:

Definition 6. Let us use the matrix $C_f$ in which the entries are: COMP(i,j) = 1, 0, or -1. A ranking function is defined by the row-sum function $s_j = \sum_{i=1}^{N} COMP_f(i,j)$, j = 1,...,N, and the portfolios are ranked in a descending order of $s_j$, breaking ties at random or arbitrarily.//

By Definition 6, the "best" portfolio, ranked number 1, is the portfolio with the highest value of $s_j$, the "second best" portfolio, ranked number 2, is the portfolio with the second highest $s_j$ score, and so on, down to the "worst" portfolio, ranked number N, which is the portfolio with the lowest value of $s_j$. This ranking function follows from Huber's (1963) analysis of pairwise comparisons and ranking. A theorem in Huber (1963) demonstrates that under some very general conditions, the ranking in descending order of $s_j$, breaking ties arbitrarily or at random, uniformly minimizes the risk among all (invariant) ranking procedures that depend on $COMP_f(i,j)$ only through $s_j$, for all reasonable risk functions.
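As a small illustration of the ranking function of Definition 6, the sketch below sums each column j of a comparison matrix over i and ranks the portfolios in descending order of $s_j$. The function name `or_ranking` is hypothetical, indices are 0-based, entries equal to 4 (noncomparable) are counted as zero, and ties are broken by position in the matrix, which is only one of the arbitrary tie-breaking rules the definition allows.

```python
import numpy as np

def or_ranking(C):
    """Return (s, rank) for a comparison matrix with C[i, j] = COMP(i, j).

    s[j] is the sum over i of COMP(i, j); rank[j] is 1 for the "best"
    portfolio and N for the "worst".
    """
    C = np.asarray(C)
    C0 = np.where(C == 4, 0, C)              # noncomparable entries count as 0
    s = C0.sum(axis=0)                       # s_j: sum of column j over i
    order = np.argsort(-s, kind="stable")    # descending s_j, ties by position
    rank = np.empty_like(order)
    rank[order] = np.arange(1, s.size + 1)
    return s, rank

# Three hypothetical portfolios: portfolio 0 dominates the other two,
# which are noncomparable with each other.
C = np.array([[0, -1, -1],
              [1,  0,  4],
              [1,  4,  0]])
s, rank = or_ranking(C)
print(s, rank)   # [ 2 -1 -1] [1 2 3]
```

Using a stable sort keeps the tie-breaking reproducible; a random permutation of tied portfolios would satisfy the definition equally well.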
Moreover, Huber shows that minimum risk is achieved in cases in which the $COMP_f(i,j)$s are not necessarily independent random variables, as well as in the case in which they are independent random variables. (The form of the probability distributions of $COMP_f(i,j)$ and $s_j$ and their properties are beyond the scope of this work and are currently under investigation.) Finally, we point out that the row-sum ranking function $s_j$ applies to the comparison matrix C as well, with the slight modification of redefining COMP(i,j) = 4 to be zero, since the corresponding portfolios are noncomparable.

In conclusion, the two-stage ranking procedure described in the above sections may be summarized as follows: in the first stage, a statistical test of MV dominance determines the order of every portfolio with respect to all other portfolios. This information is presented in the form of the risk-free rate adjusted comparison matrix $C_f$ (and/or C). In the second stage, the row-sum ranking function $s_j$ is employed to rank the portfolios from the "best", ranked 1, to the "worst", ranked N. We refer to this procedure as the O-R procedure.

2.5 An Eight-Funds Portfolio Illustrative Example of the O-R Procedure

In Section 3 below, we apply the O-R procedure to a set of 133 portfolios. Much of the analysis and the tables of the 133 funds are too lengthy to be included in the paper. Therefore, we end this section with an example using a subset of eight portfolios. With this small subset of portfolios, the example exhibits the O-R procedure in its entirety, thus enabling the reader to verify every step of the procedure by inspection. The data are the portfolios' monthly returns over a period of 15 years. The results for the eight portfolios chosen for the illustration are presented in Tables 2 and 3. The upper part of Table 2 displays the UF-statistic and the statistics $t_0$, $t_1$ from the regressions described in Theorem A, using the level α=.005 for the F test and α=.01 for the t-tests. The critical UF and t values at these levels are 5.3 and 2.6, respectively. (Other reasonable values of significance levels produced similar results.) For example, the regression between portfolios 4 and 100, in which $Y_t = R_{4,t} - R_{100,t}$ and $X_t = R_{4,t} + R_{100,t}$, resulted in a significant UF = 6. > 5.3 and therefore, the rejection of $H_0$. Then, the $t_1$-value, $t_1$ = -0.8, with $|t_1|$ < 2.6, implied the equality of these portfolios' variances, but $t_0$ = 3.4 > 2.6 implied that $\mu_4 > \mu_{100}$. Thus, $R_4 D R_{100}$ and COMP(1,2) = 1 in Matrix C. The comparison matrix C is displayed in the lower-left part of the table. A similar explanation applies to all the 64 pairwise comparisons among the eight portfolios, including the eight trivial comparisons that lead to zeros along the main diagonal of the lower-left side of the table. The ranking function $s_j$ and the O-R ranking based on the matrix C are shown in the lower-right part of the table. For the eight portfolios in the example, the rankings may be arrived at by inspection. It is seen that portfolio 4 (j = 2) is the "best" with $s_2$ = 6; portfolio 131, the S&P500 Index with dividends (j = 6), is the "second best" with $s_6$ = 2, breaking the tie between portfolio 131 and portfolio 0 (whose $s_4$ = 2, as well) arbitrarily. The ranking function continues in a descending order of $s_j$ to rank portfolios 100, 30, 133 (the CRSP value-weighted index with dividends), 132 (the CRSP equal-weighted index with dividends) and finally, portfolio 5, with $s_3$ = -6, is ranked number 8, the "worst." In Table 2 we find that portfolios 5 and 30 are noncomparable, COMP(5,3) = COMP(3,5) = 4, and so are portfolios 30 and 132, COMP(7,5) = COMP(5,7) = 4.
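Noncomparable pairs such as these are resolved with the $\delta$ of Theorem D. The sketch below shows only that adjustment step, under the assumption that the risk-free rate is a single constant (the paper uses an average T-bill yield); the function name `risk_free_adjusted` is hypothetical. The adjusted series $R_i^*$ would then be compared with $R_j$ by the same UF regression as before.

```python
import numpy as np

def risk_free_adjusted(r_i, r_j, r_f):
    """Return (delta, R_i*) with delta = (mean(R_j) - R_f) / (mean(R_i) - R_f).

    Hypothetical helper: r_i and r_j are 1-D arrays of returns and r_f is
    a constant risk-free rate over the sample period.
    """
    r_i = np.asarray(r_i, dtype=float)
    r_j = np.asarray(r_j, dtype=float)
    excess_i = r_i.mean() - r_f
    if excess_i == 0:
        # The proof of Theorem D treats this case by reversing the roles
        # of the two portfolios; this sketch leaves that to the caller.
        raise ValueError("mean of R_i equals R_f; swap the roles of i and j")
    delta = (r_j.mean() - r_f) / excess_i
    r_star = (1.0 - delta) * r_f + delta * r_i   # R_i* = (1 - delta) R_f + delta R_i
    return delta, r_star

# Illustrative numbers only: with R_f = 0 and mean returns 0.02 and 0.01,
# delta is one half and R_i* has the same sample mean as R_j.
delta, r_star = risk_free_adjusted([0.03, 0.01], [0.015, 0.005], 0.0)
```

By construction the sample mean of $R_i^*$ equals that of $R_j$, so the intercept estimate in the regression of $R_j - R_i^*$ on the centered $R_j + R_i^*$ is zero, which is the $b_0 = 0$ property used in the proof.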
To convert matrix C to matrix $C_f$, we employed the risk-free rate, which was taken to be the average monthly yield on T-bills over the sample period of 15 years. Table 3 displays the risk-free rate adjusted comparisons matrix $C_f$ and the O-R ranking after the adjustment. The two cases of noncomparable portfolios are resolved using the $\delta$ from Theorem D. We found that portfolio 30 was dominated by both portfolio 5 and portfolio 132. The ranking itself was almost, but not exactly, the same as before, as portfolios 30 and 133 exchanged their former ranking.

Table 2
The O-R Procedure: An eight-funds example. Sample period: 1968-1982

Fund 00 4 5 0 30 3 3 33 00 t 0 3.4.6 0.8 -. -0.. 0 t -0.8.5 0.7-0.7 7.8.7 F 6. 66.8 0.6. 0.3 3.4 4 t 0-3.4-0.4 -.3-3.9-3.5 -.3-3.4 t 0.8 3.8.6-0.4.6 0..8 F 6. 95.6 3.9 7.7 7.6 5.5 9.8 5 t 0 -.6 0.4 -.4 -.8 -.5-0.6 -.5 t -.5-3.8-4 -3.7-5.5-5. -5. F 66.8 95.6 99.4 97.6 3.6 3.9 6.6 0 t 0-0.8.3.4 -.5 -.5. -.5 t -0.7 -.6 4 -. 0 3.7.6 F 0.6 3.9 99.4 5.3. 94..3 30 t 0. 3.9.8.5.6.7.7 t 0.4 3.7..3 0.3 3.6 F. 7.7 97.6 5.3 4 56.6 7.9 3 t 0 0. 3.5.5.5 -.6.9 0.8 t -0.7 -.6 5.5 0 -.3 9.9 6.6 F 0.3 7.6 3.6. 4 5.8 3 t 0 -..3 0.6 -. -.7 -.9 - t -7.8-0. 5. -3.7-0.3-9.9-9.8 F 3 5.5 3.9 94. 56.6 5 50.6 33 t 0 0 3.4.5.5 -.7-0.8 t -.7 -.8 5. -.8-3.6-6.6 9.8 F.4 9.8 6.6.3 7.9.8 50.6

Fund 00 4 5 0 30 3 3 33 Fund s j O-R Ranking 00 0-0 0 0-0 4 6 4-0 - 0 - - - - 3 5 0 4 0 3 0 0 0-0 0 0-0 00 4 30 0 4 0 0 0 4-30 0 5 3 0-0 0 0 - - 33-6 3-4 0 3-4 7 33 0-0 - 0 5-6 8 s j 6-6 0-4 -

Table 3
An eight-funds portfolio example adjusted for the risk-free rate. Sample period: 1968-1982

Fund 00 4 5 0 30 3 3 33 00 to 3.4.6 0.8 -. -0.. 0 t -0.8.5 0.7-0.7 7.8.7 F 6. 66.8 0.6. 0.3 3.4 4 to -3.4-0.4 -.3-3.9-3.5 -.3-3.4 to 0.8 3.8.6-0.4.6 0..8 F 6. 95.6 3.9 7.7 7.6 5.5 9.8 5 to -.6 0.4 -.4 0 -.5-0.6 -.5 t -.5-3.8-4 0. -5.5-5. -5. F 66.8 95.6 99.4 5. 3.6 3.9 6.6 0 to -0.8.3.4 -.5 -.5. -.5 t -0.7 -.6 4 -. 0 3.7.6 F 0.6 3.9 99.4 5.3. 94..3 30 to. 3.9 0.5.6 0.7 t 0.4-9.6..3-8.6 3.6 F. 7.7 45.9 5.3 4 37 7.9 3 to 0. 3.5.5.5 -.6.9 0.8 t -0.7 -.6 5.5 0 -.3 9.9 6.6 F 0.3 7.6 3.6. 4 5.8 3 to -..3 0.6 -. 0 -.9 - t -7.8-0. 5. -3.7 9.3-9.9-9.8 F 3 5.5 3.9 94. 43.3 5 50.6 33 to 0 3.4.5.5 -.7-0.8 t -.7 -.8 5. -.8-3.6-6.6 9.8 F.4 9.8 6.6.3 7.9.8 50.6

Fund 00 5 30 0 3 3 3 33 Fund s j O-R Ranking 00 0-0 0 0 0-0 4 6 4 - - - 0 - - - - 3 5 0-0 3 0 0-0 0 0 0-0 00 4 30 0 0 0 0 0-33 - 5 3 0-0 0 0 0 - - 30-6 3 - - 0 3-3 7 33 0-0 - 0 5-5 8 s j 6-5 - -3 -

3. A Study of 130 Mutual Funds

In this section we construct the $C_f$ matrix and apply our ranking procedure to a set of 133

portfolios. Except for three portfolios, this is the same data set analyzed by Lehmann and Modest (1987) and Connor and Korajczyk (1990). The data set consists of monthly returns of the funds $R_1$ through $R_{130}$ with dividends reinvested, over a fifteen year period from January 1968 through December 1982. To this data set we added three more funds with returns: $R_{131}$ = The Standard & Poors 500 Index (SP500 Index) with dividends, $R_{132}$ = The CRSP equal-weighted Index with dividends, and $R_{133}$ = The CRSP value-weighted Index with dividends. We applied the O-R ranking procedure to the full data set and generated the UF, $t_0$ and $t_1$ values and the 133x133 comparison matrix C. The critical F value, F=5.3, corresponds to α=.005. This small α value was chosen to help control the overall error rate (a Bonferroni adjustment). The t-tests were performed at α=.01 with a critical t-value of 2.6. We then estimated the risk-free rate and transformed C into $C_f$, resolving all the cases of noncomparable portfolios. For the risk-free rate we used the average monthly yield on T-bills during the sample period. We performed separate analyses for three overlapping sample periods of 5 years: 1982-1978, 10 years: 1982-1973 and 15 years: 1982-1968. Because of space limitations we cannot present the results of the F and the t tests as we did for the eight fund example. Nor can we present the entire 133x133 $C_f$ matrix. Therefore, we only present the O-R rankings of the 133 funds. The results of the full ranking over the 5, 10 and 15 year periods appear in Table 4. It is interesting to note that the three indexes we added to the original data set did not fare too well. For example, for the 15 year period between 1968 and 1982 the SP500 Index (fund number 131) is ranked 40. The CRSP equal-weighted index (fund number 132) is ranked 5 while the CRSP value-weighted index (fund number 133) is ranked 69.

Table 4
The O-R ranking of 133 funds for the period of 1968-1982

Rank 5yrs 10yrs 15yrs Rank 5yrs 10yrs 15yrs Rank 5yrs 10yrs 15yrs
73 73 3 46 0 40 83 9 6 5 05 05 73 47 6 9 74 9 66 93 6 3 5 9 48 63 5 37 93 46 90 65 4 9 5 4 49 9 07 09 94 59 3 7 5 4 68 6 50 4 00 90 95 69 5 84 6 7 6 5 5 9 47 96 3 46 4 7 6 88 99 5 04 48 5 97 75 8 8 5 5 88 53 07 43 7 98 86 08 36 9 88 7 63 54 5 9 99 36 69 8 0 44 6 5 55 7 3 4 00 09 9 80 6 4 59 56 77 0 7 0 0 80 38 3 8 44 57 38 3 00 0 65 94 4 3 5 5 5 58 5 6 9 03 5 6 3 4 8 8 8 59 0 4 3 04 66 67 5 8 5 05 60 4 0 05 97 6 06 3 8 6 77 6 06 9 8 6 7 4 44 6 6 85 7 3 07 8 9 56 8 99 87 06 63 40 45 4 08 4 3 69 9 89 99 7 64 96 37 9 09 30 65 0 39 06 65 45 7 76 0 70 0 86 87 33 3 66 53 85 3 8 78 60 3 30 67 7 43 97 8 6 3 39 04 68 0 35 46 3 0 49 9 4 74 33 69 33 3 0 4 78 8 75 5 68 89 9 70 3 38 7 5 3 97 08 6 33 60 3 7 50 9 5 6 8 6 93 7 3 64 7 35 98 35 7 49 75 0 8 64 83 57 73 57 33 3 8 3 86 0 9 74 74 3 79 98 9 8 8 30 03 57 9 75 6 0 94 30 8 3 83 8 6 76 67 7 90 6 7 3 30 87 77 4 96 07 9 4 94 33 9 4 89 78 9 79 3 0 09 70 34 8 0 60 79 7 0 0 4 6 95 95 35 76 76 45 80 93 7 5 34 0 58 36 5 63 9 8 4 49 6 95 0 30 37 7 4 68 8 79 67 48 7 6 70 4 38 48 7 8 83 50 03 8 54 34 34 39 84 3 77 84 98 53 4 9 58 0 40 3 03 39 85 37 56 85 30 7 7 78 4 3 30 40 86 4 59 53 3 58 8 9 4 43 04 5 87 5 4 96 3 55 54 54 43 9 9 64 88 80 36 66 33 55 55 44 47 47 50 89 08 3 33 45 00 84 3 90 56 6 7

As mentioned above, the full 133x133 C and $C_f$ matrices were too large to include in this paper. In order to illustrate the information contained in the $C_f$ matrix, however, Table 5 presents the test results that lead to the creation of one vector in this comparison matrix, namely, the column corresponding to the SP500 Index. We chose to present this particular column because the SP500 Index is commonly employed as the benchmark portfolio in empirical analyses in the financial literature as well as by practitioners in fund management evaluations. Table 5 is divided into three parts. All the parts display the fund number and the number of observations in the regression of Theorem A, with the SP500 Index as portfolio j (j = 131) and the indicated fund as portfolio i, i.e., $Y_t = R_{131,t} - R_{i,t}$ and $X_t = R_{131,t} + R_{i,t}$, i = 1,...,133. The UF, $t_0$ and $t_1$ values are also given. The left part contains all the funds that were dominated by the SP500 Index during the 15 years sample period and hence, the 131st column of the matrix $C_f$ will have the entries $COMP_f(i,131)$ = 1 for the corresponding row of fund i. The central part contains all the funds with UF-statistic values less than the critical value of 5.3. These are the funds that were equal to the SP500 Index and therefore, $COMP_f(i,131)$ = 0 for these funds. The right part of the table shows the portfolios that dominated the SP500 Index during these 15 years, implying that $COMP_f(i,131)$ = -1.
Table 5 shows that during the fifteen years between

Table 5
Comparing 132 Portfolios Against the SP500 Index During the Period 1968-1982

The SP500I Dominates Fund i | The SP500I Equals Fund i | Fund i Dominates the SP500I
Fund NOB F t_0 t | Fund NOB F | Fund NOB F t_0 t

37 80 5.33-0.07 3.6 7 80 0.03 8 80 78.04-0.9 -.46 79 80 5.5.0 3.6 47 80 0.04 3 80 77.6-0.84 -.4 4 80 6.6 -.97. 48 80 0. 73 80 76.9 -.7 -.4 4 80 7..4 3.04 00 80 0.6 05 80 74.8 0.3 -.3 98 80 8.3-0.68 4.0 3 80 0.3 7 80 58.87 0.06-0.85 93 80 8.96.59 3.9 07 56 0.38 5 80 57.66-0.33-0.73 7 80 9.8-0.54 4.4 4 80 0.39 6 80 43.0-0.0-9.7 80 80.7.35 4.65 6 80 0.5 5 80 4.3 0.49-9.7 56 80 0. 4.89 63 80 0.7 80 35.79-0.5-8.46 69 78.03 -.3 4.73 5 80 0.76 44 80 35.63-0.77-8.4 36 80.69-0.67 4.99 0 80 0.84 88 80 3.95.37-8 8 80 5.7-0.36 5.5 9 80 0.9 9 80 6-0.48-7. 5 80 5.8 -.7 5.38 45 80 0.96 80 4.48 -.56-6.8 4 80 5.5 -.8 5.45 76 80 -.0 4 80.46-0.63-6.67 8 80 7.07 0.85 5.78 80.05 6 80.0 0.57-6.6 46 80 7.35 -.8 5.77 0 80.4 06 80 0.4.4-6.3 6 80 7.7 -.6 5.84 78.5 5 80 4.87 0.7-5.45 66 80 9.9-0.47 6.9 9 80.5 99 80 4.79 -.7-5.3 97 6 0.67-0.9 6.36 83 80.56 89 80 4.07 0.49-5.8 33 80.8 0.76 6.56 35 80.65 60 80 4.03-0.33-5.9 75 80.87 -. 6.3 77 80.69 8 80 3..76-4.83 80.56 -.48 6.55 7 56.77 4 80 7.56 3.53 -.64 65 80 3.38-0.66 6.8 6 80.77 3 80 3.6. 6.78 43 80.84 80 4.43.68 6.78 53 80.88 90 80 4.74-0. 7.03 3 44.89 3 80 4.95 0.06 7.06 7 80.94 86 80 5.0 -.6 6.88 85 74.97 0 76 5.5-0.45 7.3 5 80.06 94 80 6.7 -.8 7.08 40 80.6 0 80 8.3 0.53 7.5 8 80. 78 80 8.8-0.6 7.57 80.3 8 80 8.99 0.69 7.58 96 80.35 6 77 9.9.0 7.67 03 80.57 9 80 3.46 -.5 7.9 84 80.65 49 80 34.7-0.96 8.8 50 80.67 9 50 35.7 -.3 8.3 3 53.8 08 80 40.03.96 8.73 9 80.9 95 77 4.9-3. 8.5 0 80.98 8 44 4.96-3.3 8.57 68 80 3. 54 80 4.63 0. 9.3 38 80 3.4

30 80 43.6 -.6 9.5 67 80 3.59 59 80 43..6 9.04 7 80 3.8 6 80 44.4-4. 8.4 30 80 3.95 09 80 46.7. 9.37 33 80 4.5 58 66 50.9 0.04 80 4. 3 80 5.03.9 9.9 3 80 4.4 0 80 5.48 0.44 0.4 64 80 4.5 70 80 5.7 0.47 0.6 9 80 4.5 34 80 53.3-0.55 0.9 87 80 4.6 80 54.5.06 0.36 57 77 4.76 7 80 70.68-0.3.89 39 80 5.0 80 0.5 0 4.87 74 80 5.06 9 5 80 3.6 4.5 5.5 4 80 5.3 55 80 4.9-0.04 5.8 04 80 5.79 6

1968 and 1982, the SP500 Index dominated 55 of the funds, was equal to 55 of the funds, and was dominated by 22 funds. Notice that out of these 22 funds, fund number 4 showed a statistically higher mean return than the mean return of the SP500 Index. The other funds dominated the SP500 Index based on lower variances. More importantly, perhaps, the table shows that no fund was noncomparable to the SP500 Index during the sample period. This is somewhat puzzling. If the SP500 Index was an efficient portfolio during the sample period, we would expect it to dominate or be noncomparable to all other funds. The results in Table 5 cast some doubt as to the MV efficiency of the SP500 Index. We found that the same was true of the two CRSP indices. Shanken (1985), Green (1986) and others reported similar results.

We can now illustrate the difference between ordering with respect to one benchmark portfolio such as the SP500 Index and our method of ordering based on all the pairwise comparisons. Consider two funds that "look like" the SP500 Index. For example, funds 0 and 3 are statistically equivalent to the SP500 Index (see the mid-section of Table 5). Using one benchmark, both portfolios 0 and 3 would rank the same. Table 4, however, reveals that using the O-R method the SP500 Index is ranked 40, while fund 3 has rank 7, better than the SP500 Index, and fund 0 has rank 68, worse than the SP500 Index. This example provides evidence for the claim that ranking relative to an inefficient fund is not consistent with ranking based on dominance. This result mirrors a result of Dybvig and Ross (1985a, p. 388), although they used population parameters for their demonstration.

4.
An Empirical Comparison of the Rankings' Robustness

Given a sample of portfolio returns, $R_{it}$, i = 1,...,N, t = 1,...,T, the returns on a proxy for the market portfolio, $R_{mt}$, and the risk-free rate $R_f$, the Treynor, $T_i$, the Sharpe, $S_i$, and the Jensen, $J_i$, portfolio performance measures for portfolio i are:

$$T_i = \frac{\bar{R}_i - R_f}{b_i}; \qquad S_i = \frac{\bar{R}_i - R_f}{\hat{\sigma}_i}; \qquad J_i = \bar{R}_i - R_f - b_i(\bar{R}_m - R_f),$$

where bars denote the sample averages, and $\hat{\sigma}_i^2$ and $b_i$ denote sample estimates of the i-th portfolio's return variance and the portfolio beta, respectively. We estimated these measures with the data of the 133 funds, using (separately) the three market indexes as proxies for $R_m$ and the average yield on one-month T-bills for the risk-free rate. We then ranked the portfolios using each one of these measures for each of the periods and for each of the market indexes.

In order to investigate the stability of these rankings over time, we again divided the 15 year period into 3 overlapping periods: 5 years (1978-1982), 10 years (1973-1982) and 15 years (1968-1982). Each of these periods was treated separately and our ranking and those of the other three methods were computed. The rank correlations were then computed both within periods and between periods for all methods and for different levels of significance. Again, the three different benchmark portfolios, the SP500 Index and the two CRSP indexes, were used in the computations of T, S, and J. The results were very similar, so Table 6 exhibits the rank correlations obtained with the SP500 Index as the proxy for $R_m$. The rank correlations among the four methods within the same time period appear in 4x4 portions along the main diagonal of the table. These portions are for 5 years at the upper left, for 10 years in the middle and for 15 years at the lower right of the table. The off-diagonal portions of the table show the between-time-periods rank correlations. Each correlation value is accompanied by its p-value below it in order to enable the reader to assess the validity of the correlation.
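As a rough illustration of the procedure just described, the sketch below computes the three scores from sample estimates, ranks a handful of funds, and compares two of the rankings. Everything specific in it is an assumption made for the example: the return series are made up, the function names are hypothetical, and the rank correlation is taken to be Spearman's coefficient for untied ranks, which the text does not state explicitly.

```python
from statistics import mean, stdev


def beta(r, r_m):
    """Sample regression slope of fund returns on benchmark returns."""
    m_bar, r_bar = mean(r_m), mean(r)
    s_mm = sum((m - m_bar) ** 2 for m in r_m)
    s_rm = sum((a - r_bar) * (m - m_bar) for a, m in zip(r, r_m))
    return s_rm / s_mm


def scores(r, r_m, r_f):
    """Return (Treynor, Sharpe, Jensen) for one fund from sample estimates."""
    b = beta(r, r_m)
    excess = mean(r) - r_f
    treynor = excess / b
    sharpe = excess / stdev(r)
    jensen = excess - b * (mean(r_m) - r_f)
    return treynor, sharpe, jensen


def ranks(values):
    """Rank 1 goes to the largest score; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    out = [0] * len(values)
    for position, k in enumerate(order, start=1):
        out[k] = position
    return out


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two untied rank vectors."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Made-up monthly returns for a market proxy and three funds.
market = [0.012, -0.008, 0.020, 0.005, -0.015, 0.018]
funds = [
    [0.010, -0.004, 0.015, 0.006, -0.009, 0.014],
    [0.020, -0.015, 0.030, 0.004, -0.025, 0.028],
    [0.006, 0.001, 0.009, 0.004, -0.002, 0.008],
]
r_f = 0.003  # assumed constant risk-free rate per period

table = [scores(r, market, r_f) for r in funds]
treynor_rank = ranks([row[0] for row in table])
sharpe_rank = ranks([row[1] for row in table])
print("Treynor vs Sharpe rank correlation:", spearman(treynor_rank, sharpe_rank))
```

Repeating the same calculation on overlapping sub-periods and correlating the resulting rank vectors gives between-period correlations of the kind reported in Table 6.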
The table reveals that within each time period the rankings of the Treynor, the Sharpe and the Jensen measures were virtually the same, with rank correlation values ranging from .9 to .99. There seems to be little or no similarity between their rankings and the O-R rankings, with rank correlation values between .06 and .3. Furthermore, Table 6

Table 6
Rank Correlations Among the Portfolio Performance Measures*

5 Years | 10 Years | 15 Years
Sharpe Jensen Treynor O-R | Sharpe Jensen Treynor O-R | Sharpe Jensen Treynor O-R

5 Years Sharpe.0 00 0 0.0 00 0 Jensen 0.9 86 0.0 00 Treynor 0.9 97 0.0 00 O-R 0.3 3 0 0 Years 0.0 00.000 0 0 0.986 9 0.96 5.000 0 0 0.304 7 4.0000 0

Sharpe 0.8 7 0 0.0 00 Jensen 0.8 49 5 0.0 00 Treynor 0.8 67 7 0.0 00 O-R 0. 09 9 0.0 00 3 0.849 8 0.848 3 0.843 4 0.30 7 4 0.864 0.844 8 0.860 3 0.305 4 0.667.0000 0.055 0 0.40 0.9845.0000 0.95 0 0.564 0.9980 0.980.0000 0.07 0 0.946 0.078 0.65 0.975.0000 0.064 0.063 0.07 0 5 Years Sharpe 0.4 69 0.0 00 Jensen 0.4 80 0.0 00 Treynor 0.4 93 0.45 7 0.493 0.483 0 0.456 9 0.476 5 0.490-0.530 0.6545 0.667 0.6586 0.58.0000 0.0033 0.069 0 0.3495 0.653 0.6946 0.6558-0.444 0.94.0000 0.0046 0 0 0.6634 0.675 0.668-0.467 0.9759 0.945.0000 0.0 3 0.09 0 0

00 O-R 0. 05 0.0 7 8 0. 0.04 7 0.05 8 0.07 5 0.74 0.309 0.075 0.36 0.80-0.0643-0.649-0.06.000 0 0.33 0.8 0.564 0.460 0.0587 0.4773 0

* The SP500 Index is the benchmark portfolio for the Sharpe, the Jensen and the Treynor measures. The p-value appears below the rank correlation.

shows that the between-periods rank correlations of the Treynor, the Sharpe and the Jensen measures drop relatively quickly in spite of the periods being overlapping. These values drop from .99 within each period to around .85 between 5 and 10 years, .66 between 10 and 15 years, and .47 between 5 and 15 years. For the same time periods, the O-R ranking has rank correlation values of .946 between 5 and 10 years, .80 between 10 and 15 years, and .74 between 5 and 15 years. These results indicate that the O-R rankings show greater stability over time than the rankings of the other three methods. In the next section we provide a theoretical explanation of this greater level of stability over time of the O-R ranking, as well as an explanation for the stark dissimilarity between our ranking and the others.

5. Some Theoretical Connections Among the Ranking Procedures

We saw that while the Treynor, the Sharpe and the Jensen measures ranked the funds virtually the same, our method had statistically no (or negative) correlation with their rankings. In this section we argue that these results are to be expected. We focus first on the similarity of the Sharpe, the Treynor and the Jensen measures. Let $\bar{R}_i$, $\hat{\sigma}_i^2$, $b_i$, $\bar{R}_m$ and $V_m$ represent the sample values, over the period under discussion, of the average return of fund i, its variance, its regression slope against the reference portfolio m, the average return of the reference portfolio and its variance. $T_i$, $J_i$ and $S_i$ denote the i-th fund's scores of Treynor, Jensen and Sharpe when using sample estimates.

Definition 7. A portfolio whose return variance is completely explained by the return variance of a reference portfolio m is said to be a well diversified portfolio (WDP) with respect to the reference portfolio m. Notationally, a portfolio i is a WDP if $\hat{\sigma}_i^2 = b_i^2 V_m$.//

We now show that for WDPs, the rankings of the three methods are the same:

Theorem E. Let $R_f$ exist and $\bar{R}_m > R_f$. Let $R_i$ and $R_j$ be WDPs. Unless portfolios i and j are noncomparable, the ranking of these portfolios will be the same using the Sharpe, the Treynor or the Jensen scores.
Proof: For WDPs the Sharpe score is proportional to the Treynor score and any inequality involving the variances is equivalent to one involving the slopes. So we need only consider Treynor's measure. To complete the proof we show that the rankings of Treynor and Jensen are the same when the portfolios are not noncomparable. The case $\bar{R}_i = \bar{R}_j$ and $b_i = b_j$ is obvious since $T_i - T_j = 0 = J_i - J_j$. Next, consider the case $\bar{R}_i \ge \bar{R}_j$ and $b_i \le b_j$ with at least one strict inequality. By computation, $T_i > T_j$. Since $J_i - J_j = \bar{R}_i - \bar{R}_j - (b_i - b_j)(\bar{R}_m - R_f)$, at least one term on the right is positive and none is negative. Therefore, $J_i > J_j$. Lastly, suppose that $b_i \ge b_j$ and $\bar{R}_i \le \bar{R}_j$, with at least one strict inequality. Since this is exactly the same situation as the previous one with j replacing i, the same conclusion follows.

The next theorem shows that the ranks corresponding to the Jensen and the Treynor scores are the same even for some of the noncomparable portfolios and therefore, all three rankings are the same.

Lemma F. $T_i - T_j = J_i/b_i - J_j/b_j$.

Proof: Using the definition of $T_i$:

$$T_i = \frac{\bar{R}_i - R_f}{b_i} = \frac{\bar{R}_i - R_f - b_i(\bar{R}_m - R_f)}{b_i} + (\bar{R}_m - R_f) = \frac{J_i}{b_i} + (\bar{R}_m - R_f).$$

It then follows that $T_i - T_j = J_i/b_i - J_j/b_j$.

Theorem F. If $\bar{R}_i < \bar{R}_j$ and $0 < b_i < b_j$ and if in addition $T_i - T_j < 0$, then $J_i - J_j < 0$.
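Lemma F is a purely algebraic identity, so it can be checked numerically before it is used. The sketch below does that for two funds; the mean returns, slopes, benchmark mean and risk-free rate are made-up values chosen only for illustration, and the function names are hypothetical.

```python
def treynor(mean_r, b, r_f):
    """Treynor score from a sample mean return and a regression slope."""
    return (mean_r - r_f) / b


def jensen(mean_r, b, mean_m, r_f):
    """Jensen score from sample means and a regression slope."""
    return mean_r - r_f - b * (mean_m - r_f)


# Made-up inputs: (mean return, slope) for funds i and j.
r_f, mean_m = 0.004, 0.010
mean_i, b_i = 0.009, 0.8
mean_j, b_j = 0.012, 1.3

# Left side of Lemma F: difference of the Treynor scores.
lhs = treynor(mean_i, b_i, r_f) - treynor(mean_j, b_j, r_f)

# Right side of Lemma F: difference of the slope-scaled Jensen scores.
rhs = jensen(mean_i, b_i, mean_m, r_f) / b_i - jensen(mean_j, b_j, mean_m, r_f) / b_j

print("T_i - T_j       :", lhs)
print("J_i/b_i - J_j/b_j:", rhs)
```

The benchmark term $(\bar{R}_m - R_f)$ cancels in the difference, which is why the two printed values agree up to floating-point rounding for any choice of benchmark mean.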

Proof: Since $T_i < T_j$ it follows from Lemma F that $J_i/b_i - J_j/b_j < 0$, or that $J_i < \frac{b_i}{b_j} J_j$. Since $b_i < b_j$ the result follows.

Theorem G. If $\bar{R}_i > \bar{R}_j$ and $b_i > b_j > 0$, and if in addition $T_i - T_j > 0$, then $J_i - J_j > 0$.

Proof: As in the last proof, $T_i - T_j > 0$ implies that $J_i > \frac{b_i}{b_j} J_j$ and the result follows.

If we assume that mutual funds are WDPs then Theorems E, F and G imply that the Sharpe, the Treynor and the Jensen performance measures should yield the same rankings except for some noncomparable funds. In the previous section, Table 5 demonstrated that no fund was noncomparable to the SP500 Index in the sample period. Thus, the explanation for the high correlation among these rankings reflects the small number of noncomparable cases present in our sample. Of course, when the risk-free rate is assumed to exist, the O-R procedure employs the matrix $C_f$ and resolves all the cases of noncomparable portfolios. The low correlations with our ranking also reflect the large number of funds that were found by our method to be equal to the SP500 Index. This last comment underscores the main statistical problem with these methods. They treat the sample estimates as if they were population parameters and therefore, they rank few or no portfolios as equal. Theorem H below shows that if the O-R method also measured dominance using these estimates as if they were population parameters, all four methods would yield the same ranking.

Clearly, we must assume the existence of a risk-free rate in order to compare our method to the other three measures. Thus, as in Theorem D above, comparing $R_j$ against $R_i$ should lead to the same conclusion as comparing $R_j$ against $R_i^* = \delta R_i + (1-\delta)R_f$ with $\delta = (\bar{R}_j - R_f)/(\bar{R}_i - R_f)$. Then, all the parameters and the parameter estimates with a superscript * correspond to $R_i^*$.

Theorem H. Let a risk-free asset be available such that each investor may borrow or lend at the risk-free rate, $R_f$, and assume that $\bar{R}_m > R_f$. Then,
a) $S_i > S_j$ iff there exists $R_i^*$: $\bar{R}_i^* \ge \bar{R}_j$ and $\hat{\sigma}_i^* \le \hat{\sigma}_j$ with at least one strict inequality.
b) $T_i > T_j$ iff there exists $R_i^*$: $\bar{R}_i^* \ge \bar{R}_j$ and $b_i^* \le b_j$ with at least one strict inequality.
c) $J_i > J_j$ iff there exists $R_i^*$: $\bar{R}_i^* \ge \bar{R}_j$ and $b_i^* \le b_j$ with at least one strict inequality.

Proof: a) Let $S_i > S_j$ and choose $\delta$ such that $\bar{R}_i^* = \delta\bar{R}_i + (1-\delta)R_f = \bar{R}_j$. Then, $S_i > S_j$ implies that $\delta\hat{\sigma}_i < \hat{\sigma}_j$. But $\delta\hat{\sigma}_i$ is the standard deviation of $R_i^*$. The converse is straightforward. b) The proof for part b) is the same as in a) with b replacing $\sigma$. c) Let $J_i > J_j$. Then, $\bar{R}_i - R_f - b_i(\bar{R}_m - R_f) > \bar{R}_j - R_f - b_j(\bar{R}_m - R_f)$. Choosing $\delta$ as in part a) and substituting in the last inequality yields $\delta b_i < b_j$. But $\delta b_i$ is the sample regression slope of $R_i^*$. The converse follows by substitution.

Suppose the mean returns, the return variances and covariances are estimated. And suppose that instead of using our statistical tests, the MV dominance criterion is employed using the sample estimates to decide on MV dominance. Then, Theorem H demonstrates that when a risk-free rate exists, the MV dominance criterion and the Treynor, the Sharpe and the Jensen performance measures yield the same rankings. This result sheds some light on the reasons for the lack of correlation between these rankings and the O-R ranking. First, the O-R method does not take the sample statistics at face value, but rather uses them to test the hypothesis of dominance based on the regression of Theorem A. Therefore, this method leads to many ties, i.e., portfolios that are statistically too close to warrant ranking them differently. All such portfolios are deemed to be statistically equal to one another by our method. For example, Table 5 revealed that 55 out of the 132 funds in our study were judged to be equal to the SP500 Index. Analyzing the same data and using the sample statistics as if they

were the true parameters, the Treynor, the Sharpe and the Jensen measures gave different rankings to these 55 funds relative to the SP500 Index. Secondly, Sharpe's measure ranks funds based on statistics computed from each fund separately, while the Treynor and the Jensen measures rank funds based on statistics computed for each fund that are corrected by a third fund, namely, the reference portfolio. We already saw, however, that two funds, 0 and 3, which appeared to be equal when compared only to one fund, the SP500 Index, received very different rankings when judged relative to all the funds in the set. This example highlights the fact that the O-R procedure may deem portfolios equal to one another in the MV sense, yet rank them differently. This is so because the O-R ranking function is defined over a much larger information set. This set contains not only the MV dominance relationship between some benchmark portfolio and the rest of the portfolios, but the MV relationship of every portfolio relative to all other portfolios in the set. In fact, the more funds under consideration, the less one would expect that ranking relative to one fund would sustain itself when comparisons are done relative to all funds.

6. Discussion and Conclusions

The purpose of our paper is to present a method to evaluate the performance of portfolios when the underlying parameters of the distribution are unknown. To accomplish this we introduced a sequence of hypothesis tests concerning portfolio return parameters and used the results of these tests to create an ordering of performance. There are several advantages to this method of ordering. First, each test, and therefore the collection of tests, speaks to and about the true parameter values of the portfolios. Suppose, for example, that a test correctly concluded that the true parameter values of portfolios i and j were in the relationship $\mu_i > \mu_j$ and $\sigma_i < \sigma_j$. Then, Theorem F implies that the ranking using any of the three other methods, using population parameters, would place fund i over fund j.
Conversely, if the sample values were used as if they were true in the Sharpe, the Treynor or the Jensen scores and ranked fund i over j, then the ranking based on the data would not imply the same rankings as if population values had been used. If the population values equated the funds, the ranking based on the data could be just the result of noise. A second advantage lies in the fact that the O-R method uses all the NxN (minus the N trivial) pairwise comparisons, rendering gaming virtually impossible even when the identity of a benchmark portfolio is known to the manager. Evaluating a fund's performance against one reference fund leads to easier gaming possibilities. Third, the ordinal nature of our ordering procedure may itself be advantageous. As mentioned at the outset, the numbers and cardinality of ranks implied by some of the other methods have come under severe criticism by Roll (1977, 1978) and Dybvig and Ross (1985a, 1985b). The ordinality of our method survives some of the criticism. For example, the theorems of the last section connecting the various ranking alternatives are true no matter which reference portfolio is used or whether the reference portfolio is efficient or not. This result should be compared to the discussion by Dybvig and Ross (1985a) on a conjecture of Roll; see Section III of this reference. Fourth, the only assumption needed for the O-R method is that the returns on portfolios are bivariate elliptically distributed with stationary distributions. We do not require, as others do, an efficient benchmark portfolio and a linear model specifying the relationship of each portfolio to this benchmark portfolio. We emphasize, however, that should such models exist and an efficient portfolio be used, then it would be detected by our method because such a portfolio could never be dominated. Furthermore, in equilibrium, all other efficient portfolios would be noncomparable to the benchmark portfolio. In this case, all the portfolios would

have no portfolios dominating them, and they all would tie for rank number one. Fifth, the O-R method allows investors to evaluate portfolios over relatively short time periods. To elaborate on this point, we might ask what is implied about the management of the funds if a fund ranks higher than another by our method. The data set here does not contain the information that the funds had the same managers over the sample period and there may have been different managers within the period for some funds. So the test cannot refer to a particular manager's performance. If a particular manager at a particular period could time the market, it is not known whether that is temporary or permanent. If it is temporary, the process would be nonstationary and so the test need not necessarily refer to timing. Similarly, since we do not know if at some specific time some manager had specific information about a specific stock, the test cannot refer to this possibility. Yet, the evaluation of a superior manager seems to be important to investors. This means that investors may wish to be able to rank portfolios as frequently as every year or every six months, or even every quarter. Our method is implementable even in the latter case because with daily returns, e.g., the O-R procedure has enough observations for the regressions and the hypothesis tests of Theorem A every quarter. The robustness of the O-R procedure to the use of daily vis-a-vis weekly or monthly data is currently under investigation. Notice that the O-R procedure may be used to test the objective of the CREF fund quoted at the outset. Here, all managers of that fund presumably shared the goal of seeking a portfolio which dominated a particular reference fund. Our method permits an evaluation of this claim. On the other hand, were we to know that a particular manager was in place over the observed period, then the test results could be credited to that manager.
Or if a fund claims to have the ability to consistently ferret out specific information about specific stocks and to be able to use this information to its advantage, these tests could be used to test this claim. The literature has addressed two types of information that a manager might use to the advantage of the fund: timing information on the reference fund and specific information on the mean of a particular fund (the expected value of the error term of a linear model being unequal to zero). Of course, information could be of a more complicated form involving the distributions of the reference portfolio or of the error. The position we have taken here is that the information, if it is available, must lead to an improved mean-variance position of that fund relative to funds that do not have this information. Since information could be available to more than one fund, it is only in the multiple comparisons of performance that a fund can be evaluated. Our procedure is constructed to be consistent with this viewpoint.

Finally, it is important to realize that our ranking procedure is not intended to be used as a portfolio selection method. Nor is it a guidance only for investors who will invest their entire capital in one portfolio vis-a-vis investors who will invest in some combination of several portfolios. At the end of the investment horizon, both types of investors will need a performance measure. The O-R procedure is designed to rank the ex post performance of any combination of portfolios.

REFERENCES

Anderson, T.W. and Fang, K.T. (1990), "Inference in Multivariate Elliptically Contoured Distributions Based on Maximum Likelihood," Statistical Inference in Elliptically Contoured and Related Distributions, edited by Fang, K.T. and Anderson, T.W., Allerton Press.

Bailey, V.J. (1992), "Evaluating Benchmark Quality," Financial Analysts Journal, 48, 33-39.