## The Annals of Statistics

### Incomplete generalized L-statistics

Ola Hössjer

#### Abstract

Given data $X_1, \dots, X_n$ and a kernel $h$ with $m$ arguments, Serfling introduced the class of generalized L-statistics (GL-statistics), which is defined by taking linear combinations of the ordered $h(X_{i_1}, \dots, X_{i_m})$, where $(i_1, \dots, i_m)$ ranges over all $n!/(n - m)!$ distinct $m$-tuples of $(1, \dots, n)$. In this paper we derive a class of incomplete generalized L-statistics (IGL-statistics) by taking linear combinations of the ordered elements from a subset of $\{h(X_{i_1}, \dots, X_{i_m})\}$ with size $N(n)$. A special case is the class of incomplete U-statistics, introduced by Blom. Under very general conditions, the IGL-statistic is asymptotically equivalent to the GL-statistic as soon as $N(n)/n \to \infty$ as $n \to \infty$, which makes the IGL-statistic much more computationally feasible. We also discuss various ways of selecting the subset of $\{h(X_{i_1}, \dots, X_{i_m})\}$. Several examples are discussed. In particular, some new estimates of the scale parameter in nonparametric regression are introduced. It is shown that these estimates are asymptotically equivalent to an IGL-statistic. Some extensions, for example, functionals other than L and multivariate kernels, are also addressed.
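To make the construction in the abstract concrete, the following minimal Python sketch compares a complete GL-statistic with an incomplete one. It is an illustration under stated assumptions, not the paper's procedure: the kernel `pairwise_mean` (with $m = 2$), the median-type weights `median_weights`, and the selection scheme (independent random draws of $m$-tuples, repeats allowed) are all hypothetical choices made for this example, and the abstract does not specify which selection schemes the paper analyses.

```python
import itertools
import random


def kernel_values_complete(xs, h, m):
    """Evaluate h on every ordered m-tuple of distinct indices: n!/(n-m)! terms."""
    return [h(*(xs[i] for i in idx))
            for idx in itertools.permutations(range(len(xs)), m)]


def kernel_values_incomplete(xs, h, m, N, rng):
    """Evaluate h on N randomly drawn m-tuples of distinct indices.

    Assumption: tuples are drawn independently, so the same tuple may
    appear more than once. This is only one conceivable selection scheme.
    """
    n = len(xs)
    return [h(*(xs[i] for i in rng.sample(range(n), m))) for _ in range(N)]


def l_functional(values, weights_fn):
    """Linear combination sum_j c_j * W_(j) of the ordered kernel values."""
    ordered = sorted(values)
    k = len(ordered)
    return sum(weights_fn(j, k) * w for j, w in enumerate(ordered, start=1))


def median_weights(j, k):
    """Weights that pick out the sample median of k ordered values."""
    if k % 2 == 1:
        return 1.0 if j == (k + 1) // 2 else 0.0
    return 0.5 if j in (k // 2, k // 2 + 1) else 0.0


def pairwise_mean(x, y):
    """Hypothetical kernel with m = 2 arguments."""
    return (x + y) / 2.0


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(60)]

# Complete statistic: uses all 60 * 59 = 3540 ordered pairs.
gl = l_functional(kernel_values_complete(xs, pairwise_mean, 2), median_weights)

# Incomplete statistic: uses only N = 600 randomly selected pairs.
igl = l_functional(
    kernel_values_incomplete(xs, pairwise_mean, 2, 600, rng), median_weights
)
print(gl, igl)
```

In this sketch the complete version evaluates the kernel $n(n-1)$ times, whereas the incomplete version evaluates it only $N$ times. The abstract's condition $N(n)/n \to \infty$ says that $N$ needs to grow only slightly faster than $n$ for the two statistics to be asymptotically equivalent.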

#### Article information

Source
Ann. Statist., Volume 24, Number 6 (1996), 2631-2654.

Dates
First available in Project Euclid: 16 September 2002

https://projecteuclid.org/euclid.aos/1032181173

Digital Object Identifier
doi:10.1214/aos/1032181173

Mathematical Reviews number (MathSciNet)
MR1425972

Zentralblatt MATH identifier
0868.62043

#### Citation

Hössjer, Ola. Incomplete generalized L-statistics. Ann. Statist. 24 (1996), no. 6, 2631-2654. doi:10.1214/aos/1032181173. https://projecteuclid.org/euclid.aos/1032181173

#### References

• ARCONES, M. A., CHEN, Z. and GINÉ, E. 1993. Estimators related to U-processes with applications to multivariate medians: asymptotic normality. Unpublished manuscript.
• ARCONES, M. A. and GINÉ, E. 1993. Limit theorems for U-processes. Ann. Probab. 21 1494-1542.
• BICKEL, P. J. and LEHMANN, E. L. 1979. Descriptive statistics for nonparametric models. IV. Spread. In Contributions to Statistics. Hájek Memorial Volume (J. Jurečková, ed.) 33-40. Academia, Prague.
• BILLINGSLEY, P. 1968. Convergence of Probability Measures. Wiley, New York.
• BLOM, G. 1976. Some properties of incomplete U-statistics. Biometrika 63 573-580.
• BLUM, M., FLOYD, R. W., PRATT, V., RIVEST, R. and TARJAN, R. E. 1973. Time bounds for selection. J. Comput. System Sci. 7 448-461.
• BROWN, B. M. and KILDEA, D. G. 1978. Reduced U-statistics and the Hodges-Lehmann estimator. Ann. Statist. 6 828-835.
• CHAUDHURI, P. 1992a. Multivariate location estimation using extension of R-estimates through U-statistics type approach. Ann. Statist. 20 897-916.
• CHAUDHURI, P. 1992b. Generalized regression quantiles: forming a useful toolkit for robust linear regression. In $L_1$-Statistical Analysis and Related Methods (Y. Dodge, ed.) 169-186. North-Holland, Amsterdam.
• CHOUDHURY, J. and SERFLING, R. J. 1988. Generalized order statistics, Bahadur representations, and sequential nonparametric fixed-width confidence intervals. J. Statist. Plann. Inference 19 269-282.
• COLE, R., SALOWE, J. S., STEIGER, W. L. and SZEMERÉDI, E. 1989. An optimal-time algorithm for slope selection. SIAM J. Comput. 18 792-810.
• CROUX, C. and ROUSSEEUW, P. J. 1992. Time-efficient algorithms for two highly robust estimators of scale. In Computational Statistics (Y. Dodge and J. Whittaker, eds.) 1 411-428. Physica, Heidelberg.
• CROUX, C., ROUSSEEUW, P. J. and HÖSSJER, O. 1994. Generalized S-estimators. J. Amer. Statist. Assoc. 89 1271-1281.
• DEHEUVELS, P. 1984. Strong limit theorems for maximal spacings from a general univariate distribution. Ann. Probab. 12 1181-1193.
• DILLENCOURT, M. B., MOUNT, D. M. and NETANYAHU, N. S. 1992. A randomized algorithm for slope selection. Internat. J. Comput. Geom. Appl. 2 1-27.
• FREES, E. W. 1991. Trimmed slope estimates for simple linear regression. J. Statist. Plann. Inference 27 203-221.
• GASSER, T., SROKA, L. and JENNEN-STEINMETZ, C. 1986. Residual variance and residual pattern in nonlinear regression. Biometrika 73 625-633.
• GHOSH, J. K. 1971. A new proof of the Bahadur representation of quantiles and an application. Ann. Math. Statist. 42 1957-1961.
• HODGES, J. L., JR. 1967. Efficiency in normal samples and tolerance of extreme values for some estimates of location. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 1 163-186. Univ. California Press, Berkeley.
• HOEFFDING, W. 1948. A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19 293-325.
• HOEFFDING, W. 1963. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13-30.
• HÖSSJER, O. 1997. Recursive U-quantiles. Sequential Analysis. To appear.
• HÖSSJER, O., CROUX, C. and ROUSSEEUW, P. J. 1994. Asymptotics of generalized S-estimators. J. Multivariate Anal. 51 148-177.
• JANSON, S. 1984. The asymptotic distribution of incomplete U-statistics. Z. Wahrsch. Verw. Gebiete 66 495-505.
• JANSSEN, P., SERFLING, R. J. and VERAVERBEKE, N. 1984. Asymptotic normality for a general class of statistical functions and applications to measures of spread. Ann. Statist. 12 1369-1379.
• JOHNSON, D. B. and MIZOGUCHI, T. 1978. Selecting the $K$th element in $X + Y$ and $X_1 + X_2 + \cdots + X_m$. SIAM J. Comput. 7 147-153.
• LEE, A. J. 1990. U-Statistics, Theory and Practice. Statistics, Textbooks and Monographs 110. Marcel Dekker, New York.
• LIU, R. Y. 1990. On a notion of data depth based on random simplices. Ann. Statist. 18 405-414.
• MATOUŠEK, J. 1991. Randomized optimal algorithm for slope selection. Inform. Process. Lett. 39 183-187.
• NOLAN, D. and POLLARD, D. 1988. Functional limit theorems for U-processes. Ann. Probab. 16 1291-1298.
• OJA, H. 1983. Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 327-332.
• RICE, J. 1984. Bandwidth choice for nonparametric regression. Ann. Statist. 12 1215-1230.
• RIEDER, H. 1991. Robust Statistics I. Asymptotic Statistics. Univ. Bayreuth.
• ROUSSEEUW, P. J. and CROUX, C. 1992. Explicit scale estimators with high breakdown point. In $L_1$-Statistical Analysis and Related Methods (Y. Dodge, ed.) 77-92. North-Holland, Amsterdam.
• ROUSSEEUW, P. J. and CROUX, C. 1993. Alternatives to the median absolute deviation. J. Amer. Statist. Assoc. 88 1273-1283.
• ROUSSEEUW, P. J. and HUBERT, M. 1993. Regression-free and robust estimators of scale. Technical Report 93-23, Dept. Mathematics and Computer Science, Univ. Instelling Antwerpen.
• RUYMGAART, F. H. and VAN ZUIJLEN, M. C. A. 1992. Empirical U-statistics processes. J. Statist. Plann. Inference 32 259-269.
• SEN, P. K. 1968. Estimates of the regression coefficient based on Kendall's tau. J. Amer. Statist. Assoc. 63 1379-1389.
• SERFLING, R. J. 1980. Approximation Theorems of Mathematical Statistics. Wiley, New York.
• SERFLING, R. J. 1984. Generalized L-, M- and R-statistics. Ann. Statist. 12 76-86.
• SHAMOS, M. I. 1976. Geometry and statistics: problems at the interface. In New Directions and Recent Results in Algorithms and Complexity (J. F. Traub, ed.) 251-280. Academic Press, New York.
• SILVERMAN, B. W. 1976. Limit theorems for dissociated random variables. Adv. in Appl. Probab. 8 806-819.
• SILVERMAN, B. W. 1983. Convergence of a class of empirical distribution functions of dependent random variables. Ann. Probab. 11 745-751.
• STROMBERG, A. J., HAWKINS, D. M. and HÖSSJER, O. 1995. The least trimmed differences regression estimator and alternatives. Technical Report 1995:26, Dept. Mathematical Statistics, Lund Univ. and Lund Institute of Technology.
• THEIL, H. 1950. A rank-invariant method of linear and polynomial regression analysis, I, II and III. Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings 53 386-392, 521-525, 1397-1412.