Adaptive rank tests for location with generalized lambda distribution scores

We propose adaptive rank tests for the one-sample location alternative, using as score function the percentile function of the Generalized Lambda Distribution (GLD). We give expressions for their efficiency as functions of the kurtosis parameters of the distribution used for the score function and of the sampled distribution. A simulation study shows that the proposed tests maintain their nominal size and that tests using score functions with a small kurtosis parameter are very efficient for samples coming from distributions with large kurtosis, outperforming the sign test and the Wilcoxon test. Conversely, tests which use scores from GLD distributions with large kurtosis are more efficient when the sample comes from GLD distributions with small kurtosis.


Introduction
Let $X_1, \ldots, X_n$ be independent random variables coming from a continuous distribution $F(x - \theta)$. The problem to consider is the test of the hypothesis $H_0: \theta = 0$ versus $H_A: \theta > 0$. We have chosen the one-sided alternative, although the results are also valid for the two-sided alternative. The most common test under these conditions is the sign test. When the symmetry of the sampled distribution can be justified, the Wilcoxon signed rank test is more powerful than the sign test and its power tends to increase with the kurtosis of the sampled distribution (Aranda & Corzo 2002). Score functions to build locally most powerful rank tests are obtained from the density $f$ of the sampled distribution by means of the ratio $-f'/f$. However, for the lambda family this type of score function is not nondecreasing. Instead, we focus on building rank tests using the percentile function of the GLD. The idea comes from the known result that for the normal distribution the optimal scores are obtained through its percentile function $\Phi^{-1}(u)$. These scores were proposed early by Fraser (1957). The normal score function has been used in many other contexts: for the two-sample location alternative (Waerden 1952/1953); for analysis of variance (Fisher & Yates 1938); for the scale alternative in the two-sample problem (Klotz 1962); by Brown & Hettmansperger (1996) for various departures from normality; and more recently in survival analysis by Li & Zhang (2011), among others.
We give general expressions for the efficiency of the proposed test and we compare it with various versions of it and with the sign and Wilcoxon signed rank tests. We show by a simulation study that tests obtained using the percentile function of the GLD are more efficient than the sign test and the Wilcoxon signed rank test.
Our tests are natural candidates for testing problems arising in applications where it is necessary to fit the GLD to the studied variables. Such applications are frequent in fields such as science, engineering, costs, mechanics and materials sciences, environment and reliability analysis, as noted in the handbook on fitting probability distributions by Karian & Dudewicz (2010), which discusses the theory and methodology for fitting the GLD and its extensions. The handbook also contains many applications, among them: fitting GLD probability density functions to small samples; fitting the distribution of a mixed truncated random variable and using it to find the optimal deductible in the purchase of automobile insurance and to optimize the order size that maximizes the expected utility of an investor in an inventory model; modeling the distribution of the repair costs of pipeline linkage in a water utility company; modeling distributions of materials damage; fitting a GLD to the fatigue lifetime distribution, brittle fracture and extreme value analysis; and modeling the distribution of environmental pressure. Another list of applications and uses of the GLD can be found at the beginning of section 7 of Karian & Dudewicz (2010).
In Section 2 the proposed test statistic and the conditions under which the percentile function of the GLD produces well-defined score functions are discussed. Expressions for the first two moments are given and the convergence of the distribution of the test statistic under $H_0$ to the normal distribution is proved. In Section 3 the efficiency of the proposed test is computed and compared with the efficiency of other tests. In Section 4 the results of the simulation study are presented, and Section 5 contains some conclusions and a discussion.

The proposed test statistic
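To introduce the scores of the test statistic we need the percentile function of the GLD, defined by (see Karian & Dudewicz 2000):

$$F^{-1}(y) = \lambda_1 + \frac{y^{\lambda_3} - (1-y)^{\lambda_4}}{\lambda_2}, \quad 0 \le y \le 1, \tag{1}$$

where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are location, scale, skewness and kurtosis parameters respectively.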
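As a small illustration, the percentile function can be evaluated directly. The sketch below assumes the usual Ramberg-Schmeiser form $F^{-1}(y) = \lambda_1 + (y^{\lambda_3} - (1-y)^{\lambda_4})/\lambda_2$; the function name `gld_quantile` is hypothetical, and the parameter values $\lambda_2 = 0.1975$, $\lambda_3 = \lambda_4 = 0.1349$ are the values commonly quoted for the GLD approximation to the standard normal, taken here as an assumption.

```python
def gld_quantile(y, lam1, lam2, lam3, lam4):
    """Percentile function of the GLD (assumed Ramberg-Schmeiser form)."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    return lam1 + (y ** lam3 - (1.0 - y) ** lam4) / lam2


# Symmetric case (lam3 == lam4): the median is lam1 and quantiles mirror each other.
# Parameter values are an assumption (commonly quoted fit to the standard normal).
lam2, lam4 = 0.1975, 0.1349
q25 = gld_quantile(0.25, 0.0, lam2, lam4, lam4)
q50 = gld_quantile(0.50, 0.0, lam2, lam4, lam4)
q75 = gld_quantile(0.75, 0.0, lam2, lam4, lam4)
print(q25, q50, q75)
```

With these values the quartiles should fall close to those of the standard normal, which is a quick sanity check on a fitted set of lambdas.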
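Without loss of generality, we suppose that $\lambda_1 = 0$ and take $\lambda_3 = \lambda_4$, so that $F^{-1}(y)$ comes from a symmetric distribution with percentile function

$$F^{-1}(y) = \frac{y^{\lambda_4} - (1-y)^{\lambda_4}}{\lambda_2}. \tag{2}$$

For a nonnegative, nondecreasing function $\phi(u)$, $0 < u < 1$, with $0 < \int_0^1 \phi^2(u)\,du < \infty$, the general scores statistic is

$$V = \sum_{j=1}^{n} \phi\left(\frac{R_j}{n+1}\right) S(X_j),$$

where $S(x) = 1$ when $x > 0$, 0 otherwise, and $R_j$ is the rank of $|X_j|$.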
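To obtain the proposed statistic we take as score function

$$\phi(u) = F^{-1}\left(\frac{1+u}{2}\right), \quad 0 < u < 1, \tag{3}$$

and define

$$V_\lambda = \sum_{j=1}^{n} \phi\left(\frac{R_j}{n+1}\right) S(X_j). \tag{4}$$

Now from (2), we obtain:

$$\phi(u) = \frac{(1+u)^{\lambda_4} - (1-u)^{\lambda_4}}{2^{\lambda_4}\,\lambda_2}. \tag{5}$$

Although definition (3) uses $F$, which has no explicit expression, this is not a problem because the test statistic depends only on $F^{-1}$. The proposed GLD scores test based on $V_\lambda$ rejects $H_0$ in favor of $H_A$ for large values of $V_\lambda$. As a special case, $F^{-1} = \Phi^{-1}$ produces the normal scores test statistic of Fraser (1957). The validity conditions of the score function are given in the following theorem, whose proof is given in Appendix A.1.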
Theorem 1. Let $\phi$ be as in (5). Then $\phi$ is nonnegative and nondecreasing when $\lambda_2$ and $\lambda_4$ have the same sign. Moreover, the mean and the variance of $V_\lambda$ under $H_0$ can be expressed in terms of $\sigma^2$, the variance of the GLD used as score function (see Appendix A.2).
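Following Hettmansperger (1984, p. 88), it can be verified that, under $H_0$, the standardized statistic $(V_\lambda - E_0(V_\lambda))/\sqrt{\mathrm{Var}_0(V_\lambda)}$ has asymptotically a standard normal distribution.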
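The construction above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the score function $\phi(u) = [(1+u)^{\lambda_4} - (1-u)^{\lambda_4}]/(2^{\lambda_4}\lambda_2)$, and it standardizes the statistic with the exact null moments of a signed-rank scores statistic with scores $a_j = \phi(j/(n+1))$, namely $\frac{1}{2}\sum a_j$ and $\frac{1}{4}\sum a_j^2$ (a standard fact when the signs are independent fair coin flips, used here in place of the paper's own expressions). The function names are hypothetical, ties are not handled, and dropping zeros is an assumption.

```python
import math


def gld_score(u, lam2, lam4):
    """Assumed score phi(u) = F^{-1}((1+u)/2) for the symmetric GLD with lam1 = 0."""
    return ((1.0 + u) ** lam4 - (1.0 - u) ** lam4) / (2.0 ** lam4 * lam2)


def gld_scores_test(x, lam2, lam4):
    """Return (V, z): the GLD scores statistic and its standardized value under H0."""
    x = [v for v in x if v != 0.0]  # assumption: zeros are discarded
    n = len(x)
    # order[r] is the index of the observation whose |x| has rank r + 1
    order = sorted(range(n), key=lambda j: abs(x[j]))
    scores = [gld_score(r / (n + 1.0), lam2, lam4) for r in range(1, n + 1)]
    # sum the scores attached to the positive observations
    v = sum(scores[r] for r, j in enumerate(order) if x[j] > 0)
    mean0 = 0.5 * sum(scores)                   # exact null mean
    var0 = 0.25 * sum(s * s for s in scores)    # exact null variance
    return v, (v - mean0) / math.sqrt(var0)


def upper_p_value(z):
    """One-sided p-value from the normal approximation (reject for large V)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


sample = [0.3, -1.2, 2.5, 0.7, -0.1, 1.9]   # illustrative data, not from the paper
v, z = gld_scores_test(sample, lam2=1.0, lam4=1.45)
print(v, z, upper_p_value(z))
```

Because the scores depend on the data only through the ranks of the absolute values, flipping the sign of every observation flips the sign of the standardized statistic.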

Asymptotic relative efficiency of the proposed test
To compare two tests based on test statistics $V_n^{(i)}$, $i = 1, 2$, for $H_0: \theta = 0$ versus $H_A: \theta > 0$, we use the following known expression for the ARE of test 1 relative to test 2 (see for example Hettmansperger 1984):

$$e_{12} = \frac{c_1^2}{c_2^2}, \tag{8}$$

where $c_i$ is called the Pitman efficacy of the test based on $V_n^{(i)}$. One common expression for the Pitman efficacy is

$$c = \lim_{n \to \infty} \frac{\mu_n'(0)}{\sqrt{n}\,\sigma_n(0)},$$

where $\mu_n'(0) = \frac{d}{d\theta}\mu_n(\theta)\big|_{\theta=0}$, and $\mu_n(\theta)$ and $\sigma_n^2(\theta)$ are the asymptotic mean and the asymptotic variance of $V_n$ respectively (see for example Hettmansperger 1984, Manoukian 1986). For a more general expression of $e_{12}$ see Govindarajulu (2011). As a derivative, $c$ measures the rate of change in the asymptotic mean of $V_n$ in standard units. It is to be expected that the greater the rate of change, the greater the sensitivity of the test to alternatives near the null hypothesis. To compute $c_i$, $i = 1, 2$, in (8) we use two expressions taken from Hettmansperger (1984), which allow us to calculate them through the percentile function of the GLD. We now obtain expressions for the efficacy of the $V_\lambda$ tests, the sign test and the Wilcoxon signed rank test when the density function of the sampled distribution is approximated by a GLD. Proofs are given in Appendix A.3.
Theorem 2. Let $X_1, \ldots, X_n$ be a random sample from a continuous symmetric distribution $F(x - \theta)$ with median $\theta$ and density $f$. When $F$ comes from a GLD with parameters $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$, the efficacy of $V_\lambda$ as defined in (4) is given by expression (9), where $\bar{\lambda}_4$ is the parameter of the GLD used as score function, $\sigma_f$ is the standard deviation of $F$, $g(\bar{\lambda}_4)$ is obtained in the same way as $g(\lambda_4)$ in (10) and $\beta(\cdot, \cdot)$ represents the beta function. An alternative form is given by expression (11). From these expressions it is clear that the efficacy depends on $\bar{\lambda}_4$, the kurtosis parameter of the GLD used for the score function, on $\lambda_4$, the kurtosis parameter of the sampled distribution, and on $\sigma_f$, the standard deviation of the sampled distribution $F$.
Corollary 2.1. Let $X_1, \ldots, X_n$ be a random sample from a continuous symmetric distribution $F(x - \theta)$ with median zero and let $V_\lambda$ be a GLD scores test. If $V_\lambda$ uses as score function the same GLD from which the sample comes, then $c = 1/\sigma_f$, which coincides with the efficacy of Student's t test.
Theorem 3. Let $X_1, \ldots, X_n$ be a random sample from a continuous symmetric distribution $F(x - \theta)$ with median zero. If the density $f$ of the sampled distribution is approximated by a GLD with parameters $\lambda_2$ and $\lambda_4$, the efficacies of the sign test $S$ and of the signed rank test $T$ are given by expressions (20) and (21), respectively.

The most important consequence of Theorem 4 is that the ARE of any two GLD scores tests depends only on $\bar{\lambda}_4$, the kurtosis parameter of the GLD used for the score function, and on $\lambda_4$, the kurtosis parameter of the sampled distribution (see (A.2) to compute the kurtosis parameter).
Corollary 4.1. Let $X_1, \ldots, X_n$ be a random sample from a continuous symmetric distribution $F(x - \theta)$ with median zero. The ARE between any pair of tests chosen among the GLD scores tests, the sign test, the signed rank test and Student's t test is scale invariant.
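Scale invariance can be checked numerically for the sign and Wilcoxon tests. The sketch below is an illustration under stated assumptions, not the paper's expressions (20) and (21): it uses the standard efficacies $c_S = 2f(0)$ and $c_T = \sqrt{12}\int f^2(x)\,dx$, the symmetric GLD density evaluated at a quantile, $f(F^{-1}(y)) = \lambda_2 / (\lambda_4[y^{\lambda_4-1} + (1-y)^{\lambda_4-1}])$, and the change of variable $\int f^2(x)\,dx = \int_0^1 f(F^{-1}(y))\,dy$. The function names and the midpoint rule are choices made for this example.

```python
import math


def gld_density_at_quantile(y, lam2, lam4):
    """f(F^{-1}(y)) for the symmetric GLD (lam3 == lam4), assumed form."""
    return lam2 / (lam4 * (y ** (lam4 - 1.0) + (1.0 - y) ** (lam4 - 1.0)))


def sign_efficacy(lam2, lam4):
    """c_S = 2 f(0); the median of the symmetric GLD with lam1 = 0 is F^{-1}(1/2)."""
    return 2.0 * gld_density_at_quantile(0.5, lam2, lam4)


def wilcoxon_efficacy(lam2, lam4, m=20000):
    """c_T = sqrt(12) * integral of f^2, via a midpoint rule on (0, 1)."""
    h = 1.0 / m
    total = sum(gld_density_at_quantile((i + 0.5) * h, lam2, lam4) for i in range(m))
    return math.sqrt(12.0) * total * h


def are(c1, c2):
    """Asymptotic relative efficiency of test 1 with respect to test 2."""
    return (c1 / c2) ** 2


# Same kurtosis parameter, two different scales (lam2 acts as an inverse scale).
e_small = are(sign_efficacy(0.1975, 0.1349), wilcoxon_efficacy(0.1975, 0.1349))
e_large = are(sign_efficacy(1.975, 0.1349), wilcoxon_efficacy(1.975, 0.1349))
print(e_small, e_large)
```

Both efficacies are proportional to $\lambda_2$, so the scale cancels in the ratio. For $\lambda_4 = 1$ (a uniform distribution) the sketch recovers the classical value 1/3 for the sign test relative to the Wilcoxon test.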
Theorem 5. Let $X_1, \ldots, X_n$ be a random sample from a continuous symmetric distribution $F(x - \theta)$ with median zero. The efficacy of a GLD test $V_\lambda$ with scores from a uniform distribution on the interval $(a, b)$ is independent of $a$ and $b$, and it equals the efficacy of the signed rank test, which implies that they have the same efficiency.
Within the class of tests whose score functions are percentile functions, Theorem 5 means that the Wilcoxon test corresponds to the $V_\lambda$ test that uses as score function the percentile function of the uniform distribution.
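The correspondence with the Wilcoxon test can also be seen at the level of the statistic itself. In the sketch below (an illustration with hypothetical function names and made-up data), the uniform score is taken as $\phi(u) = \sqrt{3}\,\sigma u$ with $\sigma = (b-a)/\sqrt{12}$; since this score is linear in $u$, the standardized scores statistic equals the standardized Wilcoxon signed rank statistic for any $a < b$. The null moments used are the exact ones for a signed-rank scores statistic, an assumption of this example.

```python
import math


def signed_rank_z(x, score):
    """Standardized signed-rank scores statistic using exact null moments."""
    n = len(x)
    order = sorted(range(n), key=lambda j: abs(x[j]))  # order[r] has rank r + 1
    a = [score(r / (n + 1.0)) for r in range(1, n + 1)]
    v = sum(a[r] for r, j in enumerate(order) if x[j] > 0)
    mean0 = 0.5 * sum(a)
    var0 = 0.25 * sum(s * s for s in a)
    return (v - mean0) / math.sqrt(var0)


def uniform_score(a, b):
    """Score phi(u) = sqrt(3) * sigma * u for the uniform distribution on (a, b)."""
    sigma = (b - a) / math.sqrt(12.0)
    return lambda u: math.sqrt(3.0) * sigma * u


def wilcoxon_z(x):
    """Classical standardized Wilcoxon signed rank statistic (no ties, no zeros)."""
    n = len(x)
    order = sorted(range(n), key=lambda j: abs(x[j]))
    t_plus = sum(r + 1 for r, j in enumerate(order) if x[j] > 0)
    mean0 = n * (n + 1) / 4.0
    var0 = n * (n + 1) * (2 * n + 1) / 24.0
    return (t_plus - mean0) / math.sqrt(var0)


x = [0.8, -0.3, 1.7, 2.2, -1.1, 0.5, 3.0]   # illustrative data
z_unif = signed_rank_z(x, uniform_score(-2.0, 5.0))
z_other = signed_rank_z(x, uniform_score(0.0, 1.0))
z_wilcoxon = wilcoxon_z(x)
print(z_unif, z_other, z_wilcoxon)
```

The three printed values agree, which is the finite-sample counterpart of the statement that the efficacy does not depend on $a$ and $b$.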
Numerical results for efficiencies of the proposed tests
The expressions given in (9) and (11) to compute the efficacy of the GLD scores test $V_\lambda$ contain integrals that have no analytic solution. It was necessary to approximate them numerically to obtain values of the ARE for the comparisons among the tests.
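A numerical approximation of this kind can be sketched as follows. The code does not reproduce expressions (9) or (11); it assumes instead the integration-by-parts form of the efficacy of a signed-rank scores statistic, $c = 2\int_0^1 \phi'(u)\, f(F^{-1}(\tfrac{1+u}{2}))\,du \,/\, \sqrt{\int_0^1 \phi^2(u)\,du}$, the score $\phi(u) = [(1+u)^{\bar\lambda_4} - (1-u)^{\bar\lambda_4}]/(2^{\bar\lambda_4}\bar\lambda_2)$ and the symmetric GLD density at a quantile. All function names, the midpoint rule and the closed-form standard deviation are assumptions of this example.

```python
import math


def score(u, lam2, lam4):
    return ((1.0 + u) ** lam4 - (1.0 - u) ** lam4) / (2.0 ** lam4 * lam2)


def score_derivative(u, lam2, lam4):
    return lam4 * ((1.0 + u) ** (lam4 - 1.0) + (1.0 - u) ** (lam4 - 1.0)) / (2.0 ** lam4 * lam2)


def density_at_quantile(y, lam2, lam4):
    """f(F^{-1}(y)) for the symmetric GLD (lam3 == lam4), assumed form."""
    return lam2 / (lam4 * (y ** (lam4 - 1.0) + (1.0 - y) ** (lam4 - 1.0)))


def efficacy(score_lam2, score_lam4, samp_lam2, samp_lam4, m=20000):
    """Midpoint-rule approximation of the assumed efficacy formula."""
    h = 1.0 / m
    num = 0.0
    den = 0.0
    for i in range(m):
        u = (i + 0.5) * h
        num += score_derivative(u, score_lam2, score_lam4) * density_at_quantile(
            0.5 * (1.0 + u), samp_lam2, samp_lam4)
        den += score(u, score_lam2, score_lam4) ** 2
    return 2.0 * num * h / math.sqrt(den * h)


def gld_sd(lam2, lam4):
    """Standard deviation of the symmetric GLD with lam1 = 0 (closed form)."""
    beta = math.gamma(lam4 + 1.0) ** 2 / math.gamma(2.0 * lam4 + 2.0)
    return math.sqrt(2.0 / (2.0 * lam4 + 1.0) - 2.0 * beta) / abs(lam2)


# Score GLD equal to the sampled GLD: the product c * sigma_f should be close to 1.
c_same = efficacy(0.1975, 0.1349, 0.1975, 0.1349)
print(c_same * gld_sd(0.1975, 0.1349))

# Linear scores (score lam4 = 1 or 2) should give the same efficacy.
print(efficacy(1.0, 1.0, 0.1975, 0.1349), efficacy(1.0, 2.0, 0.1975, 0.1349))
```

The two printed checks mirror Corollary 2.1 and Theorem 5. The midpoint rule avoids the endpoints of the interval, but it converges slowly when the integrand is singular there, so an adaptive rule would be preferable for score and sample parameters far apart.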
We compare 20 GLD scores tests under the established conditions: $\lambda_2$ and $\lambda_4$ with the same sign, $\lambda_2 \neq 0$ and $\lambda_4 > -\frac{1}{2}$. Seven of them have score functions coming from the following known symmetric distributions with median zero, when they are approximated by the GLD: uniform distribution with $b = -a = \sqrt{3}$; normal distribution with $\mu = 0$, $\sigma = 1$; Student's t with 30, 10 and 5 degrees of freedom; logistic distribution with $\mu = 0$, $\sigma = 1$; and Laplace distribution with $\sigma = 1$. All parameters have been obtained from the tables in Karian & Dudewicz (2000). The other 13 tests were built with GLD scores by fixing values of $\lambda_4$ and setting $\sigma = 1$ without loss of generality, because the ARE is scale invariant.
Table 1 in the appendix contains the values of $\lambda_2$, $\lambda_4$, the kurtosis and the standard deviations used to construct the proposed tests. The values of $\lambda_2$ are obtained from $\lambda_2 = \frac{1}{\sigma} g(\lambda_4)$ for $g(\lambda_4)$ as in (A.2). Only the values $\lambda_4 > -1/4$ are shown because for $-\frac{1}{2} \le \lambda_4 \le -\frac{1}{4}$ the score functions are not well-defined. Note that the kurtosis decreases as $\lambda_4$ increases up to $\lambda_4 = 1.45$ and increases thereafter. For the uniform and for the GLD(2) functions, only the value of the parameter $\lambda_4$ changes.
Table 2 shows the parameters of the GLD distributions with which some known distributions were approximated, and of those GLD distributions used as sampled distributions.

Efficiencies
As examples, we have computed efficiencies of GLD scores tests which use the percentile functions GLD(1.45), GLD(50) and GLD(−0.249) to compare them with other GLD scores tests, with the Fraser (normal scores) test and with the Wilcoxon signed rank test. These efficiencies are in Tables 3 to 7. Table 3, for example, contains the efficiencies of a test with score function GLD(1.45) relative to the nineteen GLD scores tests indicated in the rows, for samples coming from the 10 distributions indicated in the columns. It can be noted that the efficiencies of the GLD(1.45) scores test increase for samples coming from distributions with larger kurtosis than the normal distribution (see the last six columns of the table). On the other hand, for samples coming from distributions with lower kurtosis than the normal distribution, the efficiencies of the GLD(1.45) scores test decrease (first three columns of the table). For samples coming from the normal distribution, only the efficiencies of the GLD(1.45) scores test relative to the GLD(50), GLD(25) and GLD(10) scores tests are greater than one. The same comments are valid for the efficiency of the GLD(1.45) scores test relative to the t test used for calibration. In all cases, the GLD(1.45) scores test has higher efficiency than the sign test and almost the same efficiency as the Wilcoxon test.
In the first two columns of Table 4, it can be seen that the efficiencies of the GLD(50) scores test relative to the other GLD scores tests up to the logistic scores test are greater than one, for samples coming from the GLD(0.7) and GLD(0.5) distributions. The same comment holds for the efficiencies of the GLD(50) scores test relative to the sign, Wilcoxon and t tests. The GLD(50) scores test is also more efficient than the sign test for samples coming from the GLD(0.3) distribution.
In Table 5, it can be noted that the efficiencies of the GLD(−0.249) scores test relative to all other GLD scores tests are much greater than 1 for samples coming from GLD(0.7), and they are greater than 1 for samples coming from GLD(0.5) and GLD(0.3).

Practical guidelines
To use the proposed tests we suggest the following steps:
1. Find the best GLD fit for your data, using any of the methods to calculate the values of the lambdas suggested by Karian & Dudewicz (2010) in chapters 3, 5, 6, 7 and 10.
2. If the fitted GLD is symmetric and more leptokurtic than the normal distribution, use the GLD(1.45) scores test.
3. If the fitted GLD is symmetric and less leptokurtic than the normal distribution, use the GLD(50) or the GLD(−0.249) scores test.
4. If the fitted GLD is not symmetric, the GLD scores test cannot be used directly. When possible, transform the data to obtain a symmetric distribution, and apply steps 1 to 3 to the transformed data.
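The selection rule in these steps can be sketched as a small helper. This is an illustration only: the function names and the symmetry tolerance are hypothetical, the fitting step itself is not shown, and the kurtosis of the symmetric GLD is computed from the moments of $U^{\lambda} - (1-U)^{\lambda}$ with $U$ uniform on $(0, 1)$, which requires $\lambda_4 > -1/4$ and $\lambda_4 \neq 0$. For the low-kurtosis case the sketch returns only the GLD(50) option.

```python
import math


def _beta(a, b):
    """Beta function via log-gamma, to avoid overflow for large arguments."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def symmetric_gld_kurtosis(lam4):
    """Kurtosis of the symmetric GLD (lam3 == lam4); it does not depend on lam2."""
    if lam4 <= -0.25 or lam4 == 0.0:
        raise ValueError("kurtosis requires lam4 > -1/4 and lam4 != 0")
    m2 = 2.0 / (2.0 * lam4 + 1.0) - 2.0 * _beta(lam4 + 1.0, lam4 + 1.0)
    m4 = (2.0 / (4.0 * lam4 + 1.0)
          - 8.0 * _beta(3.0 * lam4 + 1.0, lam4 + 1.0)
          + 6.0 * _beta(2.0 * lam4 + 1.0, 2.0 * lam4 + 1.0))
    return m4 / (m2 * m2)


def choose_score_lam4(lam3, lam4, tol=1e-8):
    """Pick the kurtosis parameter of the score GLD from a fitted GLD.

    tol is a hypothetical tolerance for deciding that the fit is symmetric.
    """
    if abs(lam3 - lam4) > tol:
        raise ValueError("fitted GLD is not symmetric; transform the data first")
    if symmetric_gld_kurtosis(lam4) > 3.0:   # more leptokurtic than the normal
        return 1.45
    return 50.0                              # one of the options for low kurtosis


print(symmetric_gld_kurtosis(1.0))       # uniform case
print(choose_score_lam4(1.0, 1.0))
print(choose_score_lam4(-0.1, -0.1))
```

The boundary case of a fitted kurtosis exactly equal to 3 is not covered by the steps above; the sketch arbitrarily sends it to the low-kurtosis branch.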

Conclusions
As was pointed out, the GLD(1.45) scores test is the best test for samples coming from distributions that are more leptokurtic than the normal distribution. All other proposed tests are more efficient than the GLD(1.45) scores test for samples coming from distributions less leptokurtic than the normal distribution. The GLD(50) and GLD(−0.249) scores tests are better than the GLD(1.45) scores test for samples coming from distributions with smaller kurtosis than the normal distribution.
Tests generated by score functions with lower kurtosis are better for samples coming from leptokurtic distributions. On the other hand, for samples coming from distributions with lower kurtosis, the tests generated by score functions with higher kurtosis are better.
Compared with the sign and Wilcoxon tests, GLD scores tests are better for samples coming from flattened distributions.
Within the class of tests which use percentile score functions, the Wilcoxon, GLD(2) and uniform scores tests have the same efficiencies, and they are the best tests for samples coming from the logistic distribution.

A Proofs of theorems
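A.1 Proof of Theorem 1
Since $2^{\lambda_4} > 0$, the expression (5) is nonnegative when $(1+u)^{\lambda_4} - (1-u)^{\lambda_4}$ and $\lambda_2$ have the same sign. For $0 < u < 1$ we have $1 + u > 1 - u > 0$, and two cases are possible:
If $\lambda_4 > 0$, then $(1+u)^{\lambda_4} - (1-u)^{\lambda_4} \ge 0$ and so $\lambda_2$ must also be greater than zero.
If $\lambda_4 < 0$, then $(1+u)^{\lambda_4} - (1-u)^{\lambda_4} < 0$ and therefore $\lambda_2$ must also be less than zero.
In both cases the derivative $\phi'(u) = \lambda_4\left[(1+u)^{\lambda_4-1} + (1-u)^{\lambda_4-1}\right] / (2^{\lambda_4}\lambda_2)$ is nonnegative, so $\phi$ is nondecreasing.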
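The sign condition can be checked numerically on a grid. The sketch below assumes the score $\phi(u) = [(1+u)^{\lambda_4} - (1-u)^{\lambda_4}]/(2^{\lambda_4}\lambda_2)$; the function names and the grid size are choices made for this example, and a grid check is of course not a proof.

```python
def gld_score(u, lam2, lam4):
    """Assumed GLD score function for the symmetric case with lam1 = 0."""
    return ((1.0 + u) ** lam4 - (1.0 - u) ** lam4) / (2.0 ** lam4 * lam2)


def is_valid_score(lam2, lam4, m=1000):
    """Check nonnegativity and monotonicity of the score on a grid in (0, 1)."""
    grid = [(i + 0.5) / m for i in range(m)]
    values = [gld_score(u, lam2, lam4) for u in grid]
    nonnegative = all(v >= 0.0 for v in values)
    nondecreasing = all(b >= a for a, b in zip(values, values[1:]))
    return nonnegative and nondecreasing


print(is_valid_score(1.0, 1.45))     # same sign, both positive
print(is_valid_score(-1.0, -0.249))  # same sign, both negative
print(is_valid_score(-1.0, 1.45))    # opposite signs
```

With parameters of opposite sign the score becomes negative and decreasing, so the check fails, in line with the two cases of the proof.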
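A.3 Proof of Theorem 2
Hettmansperger (1984, p. 105) uses expression (16) to compute the efficacy of a rank test based on a general scores statistic with generating function $\phi(\cdot)$, where $F$ is the sampled distribution and $f$ its density function. Taking $u = 2F(x) - 1$ implies $du = 2f(x)\,dx$ and $x = F^{-1}\left(\frac{1+u}{2}\right)$. From (15), we know that $\int_0^1 \phi^2(u)\,du = \sigma^2$. The score function of $V_\lambda$ is the one given in (5), with the parameters of the score GLD written as $\bar{\lambda}_2$ and $\bar{\lambda}_4$. The density function of the sampled distribution can be written in terms of $\lambda_2$ and $\lambda_4$ as follows (see Karian & Dudewicz 2000, Theorem 1.2.2):

$$f\left(F^{-1}(y)\right) = \frac{\lambda_2}{\lambda_4\left[y^{\lambda_4 - 1} + (1-y)^{\lambda_4 - 1}\right]}, \quad 0 \le y \le 1. \tag{17}$$

On the other hand, the standard deviations of the score function $\phi(\cdot)$ and of the sampled distribution $F$ are $\sigma = \frac{1}{\bar{\lambda}_2} g(\bar{\lambda}_4)$ and $\sigma_f = \frac{1}{\lambda_2} g(\lambda_4)$, respectively, where $g(\bar{\lambda}_4)$ and $g(\lambda_4)$ are calculated from (10).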
Replacing the previous results in (16) gives the first expression for the efficacy. Using the expression for the density function in (17) yields (18). Hettmansperger (1984) gives an alternative expression for the efficacy of a test, (19). Replacing (17) and (18) in (19), and then substituting the result together with the variances of the score function and of the sampled distribution into the expression for the efficacy in (??), we obtain the alternative form of the efficacy.
Comunicaciones en Estadística, diciembre 2017, Vol. 10, No. 2

A.4 Proof of Corollary 2.1
When the score function of $V_\lambda$ and the sampled distribution are the same, then $\bar{\lambda}_4 = \lambda_4$; substituting in (9) it follows that $c = 1/\sigma_f$, which coincides with the efficacy of the t test.

A.5 Proof of Theorem 3
The efficacies of the sign and Wilcoxon tests are $c_S = 2f(0)$ and $c_T = \sqrt{12}\,f^*(0)$, where $f$ is the density of the sampled distribution and $f^*(0) = \int_{-\infty}^{\infty} f^2(x)\,dx$. Approximating $f(0)$ and $f^*(0)$ through the GLD with parameters $\lambda_2$ and $\lambda_4$ produces (20) for the sign test and, similarly, (21) for the Wilcoxon test.

A.6 Proof of Theorem 4
From (9), the efficacy of a GLD scores test is a function of $\bar{\lambda}_4$ (the score parameter), $\lambda_4$ (the parameter of the sampled distribution) and $1/\sigma_f$. Since the efficiency of two tests is $e_{12} = c_1^2/c_2^2$, for two GLD tests this ratio is independent of $\sigma_f^2$.

A.7 Proof of Corollary 4.1
From (9), (20) and (21), the efficacies of the GLD scores, sign and Wilcoxon tests are functions of $1/\sigma_f$, and the efficacy of the t test is also a function of $1/\sigma_f$. Hence the ARE of any comparison between these tests is scale invariant.

A.8 Proof of Theorem 5
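The score function for $V_\lambda$ is given in (5). To approximate the uniform distribution on the interval $(a, b)$ through the GLD, the following parameters are used: $\lambda_1 = (a + b)/2$, $\lambda_2 = 2/(b - a)$ and $\lambda_3 = \lambda_4 = 1$ (see Karian & Dudewicz 2000, page 69). Replacing $\lambda_4 = 1$ in the score function we obtain:

$$\phi(u) = \frac{(1+u) - (1-u)}{2\lambda_2} = \frac{u}{\lambda_2} = \frac{(b-a)\,u}{2} = \sqrt{3}\,\sigma u,$$

where the last equality holds because the standard deviation of the uniform distribution on $(a, b)$ is $\sigma = (b-a)/\sqrt{12}$. Moreover, the efficacy of a test based on a statistic with generating score function $\phi(\cdot)$ is given by (22), where $F$ is the sampled distribution and $f$ its density.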
Replacing $u = 2F(x) - 1$ in (22), taking the derivative $\phi'(u) = \sqrt{3}\,\sigma$ and using $\int_0^1 \phi^2(u)\,du = \sigma^2$ from (15) we obtain:

$$c = \sqrt{12}\int_{-\infty}^{\infty} f^2(x)\,dx,$$

which corresponds to the efficacy of the test based on $V_\lambda$ with the percentile function of the uniform distribution on the interval $(a, b)$ as score function. This result coincides with the efficacy of the Wilcoxon test, and for this reason they have the same efficiencies.

Table 1: Parameters of the GLD used as score functions. Source: own elaboration.

Table 2: Parameters of the GLD distributions used as sampled distributions. Source: own elaboration.

Table 6: Efficiencies of the Fraser (normal scores) test. Source: own elaboration.

Table 7: Efficiencies of the Wilcoxon test, the uniform scores test and the GLD(2.0) scores test. Source: own elaboration.